Files
mev-beta/orig/logs/RATE_LIMIT_ANALYSIS_20251109.md
Administrator c54c569f30 refactor: move all remaining files to orig/ directory
Completed clean root directory structure:
- Root now contains only: .git, .env, docs/, orig/
- Moved all remaining files and directories to orig/:
  - Config files (.claude, .dockerignore, .drone.yml, etc.)
  - All .env variants (except active .env)
  - Git config (.gitconfig, .github, .gitignore, etc.)
  - Tool configs (.golangci.yml, .revive.toml, etc.)
  - Documentation (*.md files, @prompts)
  - Build files (Dockerfiles, Makefile, go.mod, go.sum)
  - Docker compose files
  - All source directories (scripts, tests, tools, etc.)
  - Runtime directories (logs, monitoring, reports)
  - Dependency files (node_modules, lib, cache)
  - Special files (--delete)

- Removed empty runtime directories (bin/, data/)

V2 structure is now clean:
- docs/planning/ - V2 planning documents
- orig/ - Complete V1 codebase preserved
- .env - Active environment config (not in git)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-10 10:53:05 +01:00

289 lines
8.0 KiB
Markdown

# Rate Limiting Analysis & Recommendations - November 9, 2025
## Summary
**Configuration Changes Applied:** ✅ Successfully reduced rate limits
**Error Rate Impact:** 🟡 Minimal improvement (5% reduction)
**Root Cause:** Bot design incompatible with public RPC endpoints
**Recommended Solution:** Use premium RPC endpoint or drastically reduce bot scope
## Comparison
### Before Rate Limit Fix
- **Container:** mev-bot-dev-master-dev (first instance)
- **Runtime:** 7 minutes
- **429 Errors:** 2,354 total
- **Error Rate:** 5.60 errors/second
- **Config:** 5 req/sec, 3 concurrent, burst 10
### After Rate Limit Fix
- **Container:** mev-bot-dev-master-dev (rebuilt)
- **Runtime:** 21 minutes
- **429 Errors:** 6,717 total
- **Error Rate:** 5.33 errors/second (-4.8%)
- **Config:** 2 req/sec, 1 concurrent, burst 3
**Improvement:** 5% reduction in error rate, but still unacceptably high
## Root Cause Analysis
### Bot's Request Pattern
The bot generates massive RPC request volume:
1. **Block Processing:** ~4-8 blocks/minute
- Get block data
- Get all transactions
- Get transaction receipts
- Parse events
- **Estimate:** ~20-40 requests/minute
2. **Pool Discovery:** Per swap event detected
- Query Uniswap V3 registry
- Query Uniswap V2 factory
- Query SushiSwap factory
- Query Camelot V3 factory
- Query 4 Curve registries
- **Estimate:** ~8-12 requests per swap event
3. **Arbitrage Scanning:** Every few seconds
- Creates 270 scan tasks for 45 token pairs
- Each task queries multiple pools
- Batch fetches pool state data
- **Estimate:** 270+ requests per scan cycle
**Total Request Rate:** 400-600+ requests/minute = **6-10 requests/second**
### Public Endpoint Limits
Free public RPC endpoints typically allow:
- **arb1.arbitrum.io/rpc:** ~1-2 requests/second
- **publicnode.com:** ~1-2 requests/second
- **1rpc.io:** ~1-2 requests/second
**Gap:** Bot needs 6-10 req/sec, endpoint allows 1-2 req/sec = **5x over limit**
## Why Rate Limiting Didn't Help
The bot's internal rate limiting (2 req/sec) doesn't match the actual request volume because:
1. **Multiple concurrent operations:**
- Block processor running
- Event scanner running
- Arbitrage service running
- Each has its own RPC client
2. **Burst requests:**
- 270 scan tasks created simultaneously
- Even with queuing, bursts hit the endpoint
3. **Fallback endpoints:**
- Also rate-limited
- Switching between them doesn't help
## Current Bot Performance
Despite rate limiting:
### ✅ Working Correctly
- Block processing: Active
- DEX transaction detection: Functional
- Swap event parsing: Working
- Arbitrage scanning: Running (scan #260+ completed)
- Pool blacklisting: Protecting against bad pools
- Services: All healthy
### ❌ Performance Impact
- **No arbitrage opportunities detected:** 0 found in 21 minutes
- **Pool blacklist growing:** 926 pools blacklisted
- **Batch fetch failures:** ~200+ failed fetches
- **Scan completion:** Most scans fail due to missing pool data
## Solutions
### Option 1: Premium RPC Endpoint (RECOMMENDED)
**Pros:**
- Immediate fix
- Full bot functionality
- Designed for this use case
**Premium endpoints with high limits:**
```bash
# Chainstack (50-100 req/sec on paid plans)
ARBITRUM_RPC_ENDPOINT=https://arbitrum-mainnet.core.chainstack.com/YOUR_API_KEY
# Alchemy (300 req/sec on Growth plan)
ARBITRUM_RPC_ENDPOINT=https://arb-mainnet.g.alchemy.com/v2/YOUR_API_KEY
# Infura (100 req/sec on paid plans)
ARBITRUM_RPC_ENDPOINT=https://arbitrum-mainnet.infura.io/v3/YOUR_API_KEY
# QuickNode (500 req/sec on paid plans)
ARBITRUM_RPC_ENDPOINT=https://YOUR_ENDPOINT.arbitrum-mainnet.quiknode.pro/YOUR_TOKEN/
```
**Cost:** $50-200/month depending on provider and tier
**Implementation:**
1. Sign up for premium endpoint
2. Update .env with API key
3. Restart container
4. Monitor - should see 95%+ reduction in 429 errors
### Option 2: Drastically Reduce Bot Scope
**Make bot compatible with public endpoints:**
1. **Disable Curve queries** (save ~4 requests per event):
```yaml
# Reduce protocol coverage
protocols:
- uniswap_v3
- camelot_v3
# Remove: curve, balancer, etc.
```
2. **Reduce arbitrage scan frequency** (save ~100+ requests/minute):
```yaml
arbitrage:
scan_interval: 60 # Scan every 60 seconds instead of every 5
max_scan_tasks: 50 # Reduce from 270 to 50
```
3. **Increase cache times** (reduce redundant queries):
```yaml
uniswap:
cache:
expiration: 1800 # 30 minutes instead of 10
```
4. **Reduce block processing rate**:
```yaml
bot:
polling_interval: 10 # Process blocks slower
max_workers: 1 # Single worker only
```
**Pros:** Free, uses public endpoints
**Cons:**
- Severely limited functionality
- Miss most opportunities
- Slow response time
- Not competitive
### Option 3: Run Your Own Arbitrum Node
**Setup:**
- Run full Arbitrum node locally
- Unlimited RPC requests
- No rate limiting
**Pros:** No rate limits, no costs
**Cons:**
- High initial setup complexity
- Requires 2+ TB storage
- High bandwidth requirements
- Ongoing maintenance
**Cost:** ~$100-200/month in server costs
### Option 4: Hybrid Approach
**Use both public and premium:**
```yaml
arbitrum:
rpc_endpoint: "https://arb-mainnet.g.alchemy.com/v2/YOUR_KEY" # Premium for critical
fallback_endpoints:
- url: "https://arb1.arbitrum.io/rpc" # Public for redundancy
- url: "https://arbitrum-rpc.publicnode.com"
- url: "https://1rpc.io/arb"
```
**Cost:** Lower tier premium ($20-50/month) + free fallbacks
## Immediate Recommendations
### 🔴 CRITICAL - Choose One:
**A) Get Premium RPC Endpoint (Recommended for Production)**
```bash
# Quick start with Alchemy free tier (demo purposes)
ARBITRUM_RPC_ENDPOINT=https://arb-mainnet.g.alchemy.com/v2/demo
```
**B) Reduce Bot Scope for Public Endpoint Testing**
Apply configuration changes in Option 2 above
### 🟡 URGENT - Monitor Performance
```bash
# Watch 429 errors
./scripts/dev-env.sh logs -f | grep "429"
# Count errors over time
watch -n 10 'podman logs mev-bot-dev-master-dev 2>&1 | grep "429" | wc -l'
# Check arbitrage stats
./scripts/dev-env.sh logs | grep "Arbitrage Service Stats" | tail -1
```
### 🟢 RECOMMENDED - Optimize Configuration
Even with premium endpoint, optimize for efficiency:
1. **Disable Curve queries** - Most Arbitrum volume is Uniswap/Camelot
2. **Increase cache times** - Reduce redundant queries
3. **Tune scan frequency** - Balance speed vs resource usage
## Expected Results
### With Premium RPC Endpoint:
- ✅ 95%+ reduction in 429 errors (< 20 errors in 21 minutes)
- ✅ Full arbitrage scanning capability
- ✅ Real-time opportunity detection
- ✅ Competitive performance
### With Reduced Scope on Public Endpoint:
- 🟡 50-70% reduction in 429 errors (~2,000 errors in 21 minutes)
- 🟡 Limited arbitrage scanning
- 🟡 Delayed opportunity detection
- ❌ Not competitive for production MEV
## Cost-Benefit Analysis
### Premium RPC Endpoint
**Cost:** $50-200/month
**Benefit:**
- Full bot functionality
- Can detect $100-1000+/day in opportunities
- **ROI:** Pays for itself on first successful trade
### Public Endpoint with Reduced Scope
**Cost:** $0/month
**Benefit:**
- Testing and development
- Learning and experimentation
- Not suitable for production MEV
- **ROI:** $0 (won't find profitable opportunities)
## Conclusion
**The bot is working correctly.** The issue is architectural mismatch between:
- **Bot Design:** Built for premium RPC endpoints (100+ req/sec)
- **Current Setup:** Using public endpoints (1-2 req/sec)
**Recommendation:**
1. For production MEV: Get premium RPC endpoint ($50-200/month)
2. For testing/development: Reduce bot scope with Option 2 config
**Next Action:**
```bash
# Decision needed from user:
# A) Get premium endpoint and update .env
# B) Apply reduced scope configuration for public endpoint testing
```
The 5% improvement from rate limit changes shows the configuration is working, but it's not enough to bridge the 5x gap between what the bot needs and what public endpoints provide.