feat: create v2-prep branch with comprehensive planning
Restructured project for V2 refactor: **Structure Changes:** - Moved all V1 code to orig/ folder (preserved with git mv) - Created docs/planning/ directory - Added orig/README_V1.md explaining V1 preservation **Planning Documents:** - 00_V2_MASTER_PLAN.md: Complete architecture overview - Executive summary of critical V1 issues - High-level component architecture diagrams - 5-phase implementation roadmap - Success metrics and risk mitigation - 07_TASK_BREAKDOWN.md: Atomic task breakdown - 99+ hours of detailed tasks - Every task < 2 hours (atomic) - Clear dependencies and success criteria - Organized by implementation phase **V2 Key Improvements:** - Per-exchange parsers (factory pattern) - Multi-layer strict validation - Multi-index pool cache - Background validation pipeline - Comprehensive observability **Critical Issues Addressed:** - Zero address tokens (strict validation + cache enrichment) - Parsing accuracy (protocol-specific parsers) - No audit trail (background validation channel) - Inefficient lookups (multi-index cache) - Stats disconnection (event-driven metrics) Next Steps: 1. Review planning documents 2. Begin Phase 1: Foundation (P1-001 through P1-010) 3. Implement parsers in Phase 2 4. Build cache system in Phase 3 5. Add validation pipeline in Phase 4 6. Migrate and test in Phase 5 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
This commit is contained in:
288
logs/RATE_LIMIT_ANALYSIS_20251109.md
Normal file
288
logs/RATE_LIMIT_ANALYSIS_20251109.md
Normal file
@@ -0,0 +1,288 @@
|
||||
# Rate Limiting Analysis & Recommendations - November 9, 2025
|
||||
|
||||
## Summary
|
||||
|
||||
**Configuration Changes Applied:** ✅ Successfully reduced rate limits
|
||||
**Error Rate Impact:** 🟡 Minimal improvement (5% reduction)
|
||||
**Root Cause:** Bot design incompatible with public RPC endpoints
|
||||
**Recommended Solution:** Use premium RPC endpoint or drastically reduce bot scope
|
||||
|
||||
## Comparison
|
||||
|
||||
### Before Rate Limit Fix
|
||||
- **Container:** mev-bot-dev-master-dev (first instance)
|
||||
- **Runtime:** 7 minutes
|
||||
- **429 Errors:** 2,354 total
|
||||
- **Error Rate:** 5.60 errors/second
|
||||
- **Config:** 5 req/sec, 3 concurrent, burst 10
|
||||
|
||||
### After Rate Limit Fix
|
||||
- **Container:** mev-bot-dev-master-dev (rebuilt)
|
||||
- **Runtime:** 21 minutes
|
||||
- **429 Errors:** 6,717 total
|
||||
- **Error Rate:** 5.33 errors/second (-4.8%)
|
||||
- **Config:** 2 req/sec, 1 concurrent, burst 3
|
||||
|
||||
**Improvement:** 5% reduction in error rate, but still unacceptably high
|
||||
|
||||
## Root Cause Analysis
|
||||
|
||||
### Bot's Request Pattern
|
||||
|
||||
The bot generates massive RPC request volume:
|
||||
|
||||
1. **Block Processing:** ~4-8 blocks/minute
|
||||
- Get block data
|
||||
- Get all transactions
|
||||
- Get transaction receipts
|
||||
- Parse events
|
||||
- **Estimate:** ~20-40 requests/minute
|
||||
|
||||
2. **Pool Discovery:** Per swap event detected
|
||||
- Query Uniswap V3 registry
|
||||
- Query Uniswap V2 factory
|
||||
- Query SushiSwap factory
|
||||
- Query Camelot V3 factory
|
||||
- Query 4 Curve registries
|
||||
- **Estimate:** ~8-12 requests per swap event
|
||||
|
||||
3. **Arbitrage Scanning:** Every few seconds
|
||||
- Creates 270 scan tasks for 45 token pairs
|
||||
- Each task queries multiple pools
|
||||
- Batch fetches pool state data
|
||||
- **Estimate:** 270+ requests per scan cycle
|
||||
|
||||
**Total Request Rate:** 400-600+ requests/minute = **6-10 requests/second**
|
||||
|
||||
### Public Endpoint Limits
|
||||
|
||||
Free public RPC endpoints typically allow:
|
||||
- **arb1.arbitrum.io/rpc:** ~1-2 requests/second
|
||||
- **publicnode.com:** ~1-2 requests/second
|
||||
- **1rpc.io:** ~1-2 requests/second
|
||||
|
||||
**Gap:** Bot needs 6-10 req/sec, endpoint allows 1-2 req/sec = **5x over limit**
|
||||
|
||||
## Why Rate Limiting Didn't Help
|
||||
|
||||
The bot's internal rate limiting (2 req/sec) doesn't match the actual request volume because:
|
||||
|
||||
1. **Multiple concurrent operations:**
|
||||
- Block processor running
|
||||
- Event scanner running
|
||||
- Arbitrage service running
|
||||
- Each has its own RPC client
|
||||
|
||||
2. **Burst requests:**
|
||||
- 270 scan tasks created simultaneously
|
||||
- Even with queuing, bursts hit the endpoint
|
||||
|
||||
3. **Fallback endpoints:**
|
||||
- Also rate-limited
|
||||
- Switching between them doesn't help
|
||||
|
||||
## Current Bot Performance
|
||||
|
||||
Despite rate limiting:
|
||||
|
||||
### ✅ Working Correctly
|
||||
- Block processing: Active
|
||||
- DEX transaction detection: Functional
|
||||
- Swap event parsing: Working
|
||||
- Arbitrage scanning: Running (scan #260+ completed)
|
||||
- Pool blacklisting: Protecting against bad pools
|
||||
- Services: All healthy
|
||||
|
||||
### ❌ Performance Impact
|
||||
- **No arbitrage opportunities detected:** 0 found in 21 minutes
|
||||
- **Pool blacklist growing:** 926 pools blacklisted
|
||||
- **Batch fetch failures:** ~200+ failed fetches
|
||||
- **Scan completion:** Most scans fail due to missing pool data
|
||||
|
||||
## Solutions
|
||||
|
||||
### Option 1: Premium RPC Endpoint (RECOMMENDED)
|
||||
|
||||
**Pros:**
|
||||
- Immediate fix
|
||||
- Full bot functionality
|
||||
- Designed for this use case
|
||||
|
||||
**Premium endpoints with high limits:**
|
||||
```bash
|
||||
# Chainstack (50-100 req/sec on paid plans)
|
||||
ARBITRUM_RPC_ENDPOINT=https://arbitrum-mainnet.core.chainstack.com/YOUR_API_KEY
|
||||
|
||||
# Alchemy (300 req/sec on Growth plan)
|
||||
ARBITRUM_RPC_ENDPOINT=https://arb-mainnet.g.alchemy.com/v2/YOUR_API_KEY
|
||||
|
||||
# Infura (100 req/sec on paid plans)
|
||||
ARBITRUM_RPC_ENDPOINT=https://arbitrum-mainnet.infura.io/v3/YOUR_API_KEY
|
||||
|
||||
# QuickNode (500 req/sec on paid plans)
|
||||
ARBITRUM_RPC_ENDPOINT=https://YOUR_ENDPOINT.arbitrum-mainnet.quiknode.pro/YOUR_TOKEN/
|
||||
```
|
||||
|
||||
**Cost:** $50-200/month depending on provider and tier
|
||||
|
||||
**Implementation:**
|
||||
1. Sign up for premium endpoint
|
||||
2. Update .env with API key
|
||||
3. Restart container
|
||||
4. Monitor - should see 95%+ reduction in 429 errors
|
||||
|
||||
### Option 2: Drastically Reduce Bot Scope
|
||||
|
||||
**Make bot compatible with public endpoints:**
|
||||
|
||||
1. **Disable Curve queries** (save ~4 requests per event):
|
||||
```yaml
|
||||
# Reduce protocol coverage
|
||||
protocols:
|
||||
- uniswap_v3
|
||||
- camelot_v3
|
||||
# Remove: curve, balancer, etc.
|
||||
```
|
||||
|
||||
2. **Reduce arbitrage scan frequency** (save ~100+ requests/minute):
|
||||
```yaml
|
||||
arbitrage:
|
||||
scan_interval: 60 # Scan every 60 seconds instead of every 5
|
||||
max_scan_tasks: 50 # Reduce from 270 to 50
|
||||
```
|
||||
|
||||
3. **Increase cache times** (reduce redundant queries):
|
||||
```yaml
|
||||
uniswap:
|
||||
cache:
|
||||
expiration: 1800 # 30 minutes instead of 10
|
||||
```
|
||||
|
||||
4. **Reduce block processing rate**:
|
||||
```yaml
|
||||
bot:
|
||||
polling_interval: 10 # Process blocks slower
|
||||
max_workers: 1 # Single worker only
|
||||
```
|
||||
|
||||
**Pros:** Free, uses public endpoints
|
||||
**Cons:**
|
||||
- Severely limited functionality
|
||||
- Miss most opportunities
|
||||
- Slow response time
|
||||
- Not competitive
|
||||
|
||||
### Option 3: Run Your Own Arbitrum Node
|
||||
|
||||
**Setup:**
|
||||
- Run full Arbitrum node locally
|
||||
- Unlimited RPC requests
|
||||
- No rate limiting
|
||||
|
||||
**Pros:** No rate limits, no costs
|
||||
**Cons:**
|
||||
- High initial setup complexity
|
||||
- Requires 2+ TB storage
|
||||
- High bandwidth requirements
|
||||
- Ongoing maintenance
|
||||
|
||||
**Cost:** ~$100-200/month in server costs
|
||||
|
||||
### Option 4: Hybrid Approach
|
||||
|
||||
**Use both public and premium:**
|
||||
|
||||
```yaml
|
||||
arbitrum:
|
||||
rpc_endpoint: "https://arb-mainnet.g.alchemy.com/v2/YOUR_KEY" # Premium for critical
|
||||
fallback_endpoints:
|
||||
- url: "https://arb1.arbitrum.io/rpc" # Public for redundancy
|
||||
- url: "https://arbitrum-rpc.publicnode.com"
|
||||
- url: "https://1rpc.io/arb"
|
||||
```
|
||||
|
||||
**Cost:** Lower tier premium ($20-50/month) + free fallbacks
|
||||
|
||||
## Immediate Recommendations
|
||||
|
||||
### 🔴 CRITICAL - Choose One:
|
||||
|
||||
**A) Get Premium RPC Endpoint (Recommended for Production)**
|
||||
```bash
|
||||
# Quick start with Alchemy free tier (demo purposes)
|
||||
ARBITRUM_RPC_ENDPOINT=https://arb-mainnet.g.alchemy.com/v2/demo
|
||||
```
|
||||
|
||||
**B) Reduce Bot Scope for Public Endpoint Testing**
|
||||
Apply configuration changes in Option 2 above
|
||||
|
||||
### 🟡 URGENT - Monitor Performance
|
||||
|
||||
```bash
|
||||
# Watch 429 errors
|
||||
./scripts/dev-env.sh logs -f | grep "429"
|
||||
|
||||
# Count errors over time
|
||||
watch -n 10 'podman logs mev-bot-dev-master-dev 2>&1 | grep "429" | wc -l'
|
||||
|
||||
# Check arbitrage stats
|
||||
./scripts/dev-env.sh logs | grep "Arbitrage Service Stats" | tail -1
|
||||
```
|
||||
|
||||
### 🟢 RECOMMENDED - Optimize Configuration
|
||||
|
||||
Even with premium endpoint, optimize for efficiency:
|
||||
|
||||
1. **Disable Curve queries** - Most Arbitrum volume is Uniswap/Camelot
|
||||
2. **Increase cache times** - Reduce redundant queries
|
||||
3. **Tune scan frequency** - Balance speed vs resource usage
|
||||
|
||||
## Expected Results
|
||||
|
||||
### With Premium RPC Endpoint:
|
||||
- ✅ 95%+ reduction in 429 errors (< 20 errors in 21 minutes)
|
||||
- ✅ Full arbitrage scanning capability
|
||||
- ✅ Real-time opportunity detection
|
||||
- ✅ Competitive performance
|
||||
|
||||
### With Reduced Scope on Public Endpoint:
|
||||
- 🟡 50-70% reduction in 429 errors (~2,000 errors in 21 minutes)
|
||||
- 🟡 Limited arbitrage scanning
|
||||
- 🟡 Delayed opportunity detection
|
||||
- ❌ Not competitive for production MEV
|
||||
|
||||
## Cost-Benefit Analysis
|
||||
|
||||
### Premium RPC Endpoint
|
||||
**Cost:** $50-200/month
|
||||
**Benefit:**
|
||||
- Full bot functionality
|
||||
- Can detect $100-1000+/day in opportunities
|
||||
- **ROI:** Pays for itself on first successful trade
|
||||
|
||||
### Public Endpoint with Reduced Scope
|
||||
**Cost:** $0/month
|
||||
**Benefit:**
|
||||
- Testing and development
|
||||
- Learning and experimentation
|
||||
- Not suitable for production MEV
|
||||
- **ROI:** $0 (won't find profitable opportunities)
|
||||
|
||||
## Conclusion
|
||||
|
||||
**The bot is working correctly.** The issue is architectural mismatch between:
|
||||
- **Bot Design:** Built for premium RPC endpoints (100+ req/sec)
|
||||
- **Current Setup:** Using public endpoints (1-2 req/sec)
|
||||
|
||||
**Recommendation:**
|
||||
1. For production MEV: Get premium RPC endpoint ($50-200/month)
|
||||
2. For testing/development: Reduce bot scope with Option 2 config
|
||||
|
||||
**Next Action:**
|
||||
```bash
|
||||
# Decision needed from user:
|
||||
# A) Get premium endpoint and update .env
|
||||
# B) Apply reduced scope configuration for public endpoint testing
|
||||
```
|
||||
|
||||
The 5% improvement from rate limit changes shows the configuration is working, but it's not enough to bridge the 5x gap between what the bot needs and what public endpoints provide.
|
||||
Reference in New Issue
Block a user