# Layer 2 Optimizations Implementation Guide
**Created:** 2025-11-01
**Status:** Ready for Phase 1
**Risk Level:** Low (All changes are non-breaking)
---
## Overview
This guide provides step-by-step instructions for implementing Arbitrum-specific optimizations based on our comprehensive Layer 2 research. All changes are **non-breaking** and can be rolled back if needed.
---
## Quick Start
### Step 1: Review Research
Read: `docs/L2_MEV_BOT_RESEARCH_REPORT.md`
- Validates our current implementation ✅
- Identifies non-breaking improvements 🟡
- Provides competitive analysis 📊
### Step 2: Test Configuration
```bash
# Backup current config
cp config/arbitrum_production.yaml config/arbitrum_production.yaml.backup
# Test merge optimized config (dry run)
./scripts/validate-l2-config.sh --dry-run
# Apply Phase 1 optimizations
./scripts/apply-l2-optimizations.sh --phase 1
```
### Step 3: Monitor Results
```bash
# Watch live with L2-specific metrics
./scripts/watch-l2-metrics.sh
# Compare with baseline
./scripts/compare-performance.sh --baseline before --current after
```
---
## Phase-by-Phase Implementation
### Phase 1: Configuration Tuning (Week 1)
**Effort:** 1-2 hours | **Risk:** Low | **Reversible:** Yes
#### What's Changing
- `opportunity_ttl`: 30s → 5s (tuned for 250ms blocks)
- `max_path_age`: 60s → 10s (tuned for 250ms blocks)
- Add `execution_deadline`: 3s (new parameter)
#### Implementation Steps
**1. Enable Phase 1 in config:**
```yaml
# config/arbitrum_production.yaml
# Add at the end of the file:
# ===== LAYER 2 OPTIMIZATIONS (Phase 1) =====
features:
  use_arbitrum_optimized_timeouts: true
  use_dynamic_ttl: false # Start with static, enable later
arbitrage_optimized:
  opportunity_ttl: "5s" # 20 blocks @ 250ms
  max_path_age: "10s" # 40 blocks @ 250ms
  execution_deadline: "3s" # 12 blocks @ 250ms
  # Backward compatibility
  legacy_opportunity_ttl: "30s" # For rollback
  legacy_max_path_age: "60s" # For rollback
```
**2. Update code to read new config:**
```go
// internal/config/config.go
type ArbitrageOptimized struct {
	OpportunityTTL    time.Duration `yaml:"opportunity_ttl"`
	MaxPathAge        time.Duration `yaml:"max_path_age"`
	ExecutionDeadline time.Duration `yaml:"execution_deadline"`
	// Legacy values for rollback
	LegacyOpportunityTTL time.Duration `yaml:"legacy_opportunity_ttl"`
	LegacyMaxPathAge     time.Duration `yaml:"legacy_max_path_age"`
}

type Features struct {
	UseArbitrumOptimizedTimeouts bool `yaml:"use_arbitrum_optimized_timeouts"`
	UseDynamicTTL                bool `yaml:"use_dynamic_ttl"`
}

// In Config struct
type Config struct {
	// ... existing fields ...
	Features           Features           `yaml:"features"`
	ArbitrageOptimized ArbitrageOptimized `yaml:"arbitrage_optimized"`
}

// Helper to get active TTL
func (c *Config) GetOpportunityTTL() time.Duration {
	if c.Features.UseArbitrumOptimizedTimeouts {
		return c.ArbitrageOptimized.OpportunityTTL
	}
	return c.Arbitrage.OpportunityTTL // Legacy
}
```
**3. Update arbitrage service:**
```go
// pkg/arbitrage/service.go
func (s *ArbitrageService) isOpportunityValid(opp *types.ArbitrageOpportunity) bool {
	// Use configurable TTL
	ttl := s.config.GetOpportunityTTL()
	age := time.Since(opp.Timestamp)
	if age > ttl {
		s.logger.Debug(fmt.Sprintf(
			"Opportunity expired: age=%s, ttl=%s",
			age, ttl,
		))
		return false
	}
	return true
}
```
**4. Test Phase 1:**
```bash
# Start bot with Phase 1 config (timeout 60 stops it after 60 seconds; drop it for longer runs)
PROVIDER_CONFIG_PATH=$PWD/config/providers_runtime.yaml timeout 60 ./mev-bot start
# Monitor logs for:
# - Opportunities detected
# - Opportunities expired (should see more due to shorter TTL)
# - Execution attempts
# - Success rate
```
**5. Validate Results:**
```bash
# After 1 hour, compare metrics
./scripts/analyze-l2-phase1.sh
# Check for:
# - Reduced stale opportunity execution ✅
# - Similar or better success rate ✅
# - No increase in errors ✅
```
**6. Rollback if Needed:**
```yaml
# Set in config/arbitrum_production.yaml
features:
  use_arbitrum_optimized_timeouts: false # Back to legacy
```
---
### Phase 2: Transaction Pre-filtering (Week 2)
**Effort:** 4-6 hours | **Risk:** Medium | **Reversible:** Yes
#### What's Changing
- Filter non-DEX transactions at monitor level
- Expected 80-90% reduction in processed transactions
- Improved latency and reduced CPU usage
#### Implementation Steps
**1. Create DEX filter module:**
```go
// pkg/monitor/dex_filter.go
package monitor

import (
	"encoding/hex"
	"sync/atomic"

	"github.com/ethereum/go-ethereum/common"
	"github.com/ethereum/go-ethereum/core/types"
)

// DEXFilter drops transactions that neither target a known DEX contract nor
// carry a known swap selector. Both maps are written only in NewDEXFilter and
// are read-only afterwards, so concurrent lookups need no lock.
type DEXFilter struct {
	knownDEXAddresses map[common.Address]bool
	swapSignatures    map[string]bool

	// Statistics, updated atomically because ShouldProcess is on the hot path
	totalTx    uint64
	filteredTx uint64
	passedTx   uint64
}

// NewDEXFilter builds the lookup maps. Each entry in swapSigs must be the
// 4-byte selector as 8 lowercase hex characters without a "0x" prefix, since
// that is the form hex.EncodeToString produces in ShouldProcess.
func NewDEXFilter(dexAddresses []common.Address, swapSigs []string) *DEXFilter {
	filter := &DEXFilter{
		knownDEXAddresses: make(map[common.Address]bool),
		swapSignatures:    make(map[string]bool),
	}
	// Build lookup maps
	for _, addr := range dexAddresses {
		filter.knownDEXAddresses[addr] = true
	}
	for _, sig := range swapSigs {
		filter.swapSignatures[sig] = true
	}
	return filter
}

// ShouldProcess reports whether tx looks like a DEX interaction.
func (f *DEXFilter) ShouldProcess(tx *types.Transaction) bool {
	atomic.AddUint64(&f.totalTx, 1)
	// Must have a recipient (nil means contract creation)
	to := tx.To()
	if to == nil {
		f.incrementFiltered()
		return false
	}
	// Check if recipient is known DEX
	if f.knownDEXAddresses[*to] {
		f.incrementPassed()
		return true
	}
	// Check function signature
	if data := tx.Data(); len(data) >= 4 {
		sig := hex.EncodeToString(data[:4])
		if f.swapSignatures[sig] {
			f.incrementPassed()
			return true
		}
	}
	f.incrementFiltered()
	return false
}

// GetStats returns the counters. Each value is read atomically, but the three
// reads together are not a single consistent snapshot.
func (f *DEXFilter) GetStats() (total, filtered, passed uint64) {
	return atomic.LoadUint64(&f.totalTx),
		atomic.LoadUint64(&f.filteredTx),
		atomic.LoadUint64(&f.passedTx)
}

func (f *DEXFilter) incrementFiltered() {
	atomic.AddUint64(&f.filteredTx, 1)
}

func (f *DEXFilter) incrementPassed() {
	atomic.AddUint64(&f.passedTx, 1)
}
```
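The filter compares `hex.EncodeToString(tx.Data()[:4])` against the configured signatures, so an entry written as `0x38ED1739` would never match. A small normalization step before calling `NewDEXFilter` avoids that class of silent miss. The helper below is a sketch; `normalizeSelector` is a hypothetical name and the sample inputs are illustrative only.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"strings"
)

// normalizeSelector converts a configured selector such as "0x38ED1739" into
// the 8-character lowercase form that hex.EncodeToString produces, and
// rejects anything that is not exactly 4 bytes of hex.
func normalizeSelector(raw string) (string, error) {
	s := strings.TrimPrefix(strings.ToLower(strings.TrimSpace(raw)), "0x")
	b, err := hex.DecodeString(s)
	if err != nil {
		return "", fmt.Errorf("selector %q is not valid hex: %w", raw, err)
	}
	if len(b) != 4 {
		return "", fmt.Errorf("selector %q must be 4 bytes, got %d", raw, len(b))
	}
	return hex.EncodeToString(b), nil
}

func main() {
	// Illustrative inputs: two spellings of one selector and one truncated value.
	for _, raw := range []string{"0x38ED1739", "38ed1739", "0x38ed17"} {
		sel, err := normalizeSelector(raw)
		fmt.Printf("%-12s -> %q err=%v\n", raw, sel, err)
	}
}
```

Failing fast on a malformed selector at startup is usually preferable to discovering later that a class of swaps was filtered out.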
**2. Integrate into monitor:**
```go
// pkg/monitor/concurrent.go
// Requires "fmt" and "math/rand" in this file's import block.
type ArbitrumMonitor struct {
	// ... existing fields ...
	dexFilter     *DEXFilter
	filterEnabled bool
}

func (m *ArbitrumMonitor) processTransaction(tx *types.Transaction) {
	// Apply filter if enabled
	if m.filterEnabled && !m.dexFilter.ShouldProcess(tx) {
		// Log occasionally (1% sample rate)
		if rand.Float64() < 0.01 {
			// tx.To() is nil for contract creations, which the filter also
			// rejects, so guard before dereferencing it
			to := "contract creation"
			if tx.To() != nil {
				to = tx.To().Hex()[:10]
			}
			m.logger.Debug(fmt.Sprintf(
				"Filtered non-DEX tx: %s to %s",
				tx.Hash().Hex()[:10], to,
			))
		}
		return
	}
	// Process as normal
	m.processSwapTransaction(tx)
}

// Periodic stats logging
func (m *ArbitrumMonitor) logFilterStats() {
	total, filtered, passed := m.dexFilter.GetStats()
	if total == 0 {
		return // Nothing seen yet; avoids logging NaN percentages
	}
	filterRate := float64(filtered) / float64(total) * 100
	m.logger.Info(fmt.Sprintf(
		"DEX Filter Stats: total=%d, passed=%d (%.1f%%), filtered=%d (%.1f%%)",
		total, passed, 100-filterRate, filtered, filterRate,
	))
}
```
**3. Enable Phase 2:**
```yaml
# config/arbitrum_production.yaml
features:
  enable_dex_prefilter: true # Enable filtering
  log_filtered_transactions: true # Log for monitoring
dex_filter:
  enabled: true
  filter_mode: "whitelist"
  log_filtered: true
  filtered_log_sample_rate: 0.01 # Log 1%
```
**4. Test and Monitor:**
```bash
# Start with filtering enabled (timeout 60 stops the bot after 60 seconds; drop it for longer runs)
PROVIDER_CONFIG_PATH=$PWD/config/providers_runtime.yaml timeout 60 ./mev-bot start
# Watch filter statistics
tail -f logs/mev_bot.log | grep "DEX Filter Stats"
# Should see ~80-90% filtering rate
# Example: filtered=890, passed=110 (11%)
```
**5. Validate No Missed Opportunities:**
```bash
# Check that we're not filtering profitable transactions
./scripts/validate-filter-accuracy.sh
# Reviews:
# - Opportunities before filtering: X
# - Opportunities after filtering: Y
# - Difference should be <1%
```
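The "<1%" threshold is a simple ratio of opportunities lost to opportunities seen without the filter. A sketch of that check, with hypothetical function names and made-up sample counts, could look like this:

```go
package main

import "fmt"

// missedPercent returns the share of baseline opportunities that are no
// longer detected once the filter is on. before is the count without the
// filter and after is the count with it, over the same traffic.
func missedPercent(before, after int) float64 {
	if before <= 0 {
		return 0 // no baseline, nothing to compare against
	}
	missed := before - after
	if missed < 0 {
		missed = 0 // the filtered run found more, so nothing was missed
	}
	return float64(missed) / float64(before) * 100
}

// withinTolerance applies the guide's "difference should be <1%" rule when
// maxPct is 1.0.
func withinTolerance(before, after int, maxPct float64) bool {
	return missedPercent(before, after) < maxPct
}

func main() {
	// Made-up counts for illustration only.
	fmt.Println(withinTolerance(1000, 995, 1.0)) // 5 of 1000 missed
	fmt.Println(withinTolerance(1000, 980, 1.0)) // 20 of 1000 missed
}
```

The comparison is only meaningful if both counts come from the same transactions, for example by replaying a recorded window with the filter on and off.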
---
### Phase 3: Sequencer Feed (Week 3)
**Effort:** 8-12 hours | **Risk:** Medium | **Reversible:** Yes
#### Status
⏸️ **Deferred** - Requires more extensive testing before implementation
#### Reason
- Direct sequencer feed monitoring requires careful testing
- Risk of connection issues impacting operations
- Phase 1 & 2 provide significant improvements already
#### Future Implementation
When ready for Phase 3, see `docs/L2_MEV_BOT_RESEARCH_REPORT.md` Section 3.2 for detailed implementation plan.
---
### Phase 4-5: Timeboost (Month 2+)
**Effort:** 16-24 hours | **Risk:** High | **Reversible:** Partial
#### Status
🔮 **Future Feature** - Only if competition demands
#### Decision Criteria
Implement Timeboost if:
1. ✅ Phases 1-2 deployed successfully
2. ✅ Consistently profitable for >1 month
3. ✅ Evidence of opportunities being sniped by express lane users
4. ✅ Average opportunity profit >$100
5. ✅ Sufficient capital for express lane bidding ($1000+ reserved)
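These criteria can also be expressed as a mechanical check so the decision is reproducible. The sketch below transcribes the five conditions; every type and field name is hypothetical, and treating "more than one month" as more than 30 days is an assumption.

```go
package main

import "fmt"

// timeboostInputs holds the observations behind the five decision criteria.
type timeboostInputs struct {
	PhasesOneTwoDeployed bool
	ProfitableDays       int // consecutive profitable days observed
	SnipingObserved      bool
	AvgProfitUSD         float64
	ReservedCapitalUSD   float64
}

// unmetCriteria lists every criterion that is not yet satisfied. An empty
// result means all five conditions from the guide hold.
func unmetCriteria(in timeboostInputs) []string {
	var unmet []string
	if !in.PhasesOneTwoDeployed {
		unmet = append(unmet, "phases 1-2 not deployed successfully")
	}
	if in.ProfitableDays <= 30 { // assumption: one month = 30 days
		unmet = append(unmet, "not yet profitable for more than one month")
	}
	if !in.SnipingObserved {
		unmet = append(unmet, "no evidence of express lane sniping")
	}
	if in.AvgProfitUSD <= 100 {
		unmet = append(unmet, "average opportunity profit not above $100")
	}
	if in.ReservedCapitalUSD < 1000 {
		unmet = append(unmet, "less than $1000 reserved for bidding")
	}
	return unmet
}

func main() {
	// Made-up observations for illustration only.
	in := timeboostInputs{PhasesOneTwoDeployed: true, ProfitableDays: 45, AvgProfitUSD: 80, ReservedCapitalUSD: 1500}
	for _, reason := range unmetCriteria(in) {
		fmt.Println("blocked:", reason)
	}
}
```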
#### Implementation
See `docs/L2_MEV_BOT_RESEARCH_REPORT.md` Section 3.1 for complete Timeboost integration guide.
---
## Testing Checklist
### Pre-Deployment Tests
- [ ] Config validates without errors
- [ ] All DEX addresses are correct
- [ ] Swap signatures are complete
- [ ] Backward compatibility config present
- [ ] Rollback procedure tested
### Phase 1 Tests
- [ ] Opportunities expire faster (5s vs 30s)
- [ ] No increase in error rate
- [ ] Similar or better success rate
- [ ] Reduced stale opportunity execution
- [ ] System remains stable
### Phase 2 Tests
- [ ] 80-90% transaction filtering rate
- [ ] No missed DEX transactions
- [ ] Reduced CPU usage
- [ ] Improved latency
- [ ] Filter stats logging works
- [ ] Sample logging at correct rate (1%)
### Performance Tests
- [ ] Load test with 1000+ tx/sec
- [ ] Memory usage stable
- [ ] No goroutine leaks
- [ ] Latency within targets (<200ms)
- [ ] Success rate maintained or improved
---
## Monitoring & Metrics
### Key Metrics to Track
**Phase 1 (Timing):**
- Opportunity TTL hits (count)
- Average opportunity age at execution
- Stale opportunity rejections
- Execution success rate
**Phase 2 (Filtering):**
- Total transactions processed
- Transactions filtered (%)
- Transactions passed (%)
- CPU usage (before/after)
- Memory usage (before/after)
- Average detection latency
**Comparative:**
- Opportunities detected (before/after)
- Opportunities executed (before/after)
- Total profit (before/after)
- Success rate (before/after)
### Monitoring Commands
```bash
# Real-time L2-specific monitoring
./scripts/watch-l2-metrics.sh
# Generate performance report
./scripts/generate-l2-report.sh --period 24h
# Compare with baseline
./scripts/compare-performance.sh \
--baseline logs/baseline_metrics.json \
--current logs/current_metrics.json
```
---
## Rollback Procedures
### Emergency Rollback (Immediate)
If critical issues detected:
```bash
# Stop bot
pkill mev-bot
# Restore backup config
cp config/arbitrum_production.yaml.backup config/arbitrum_production.yaml
# Restart
PROVIDER_CONFIG_PATH=$PWD/config/providers_runtime.yaml ./mev-bot start
```
### Feature-Specific Rollback
**Phase 1:**
```yaml
features:
  use_arbitrum_optimized_timeouts: false
```
**Phase 2:**
```yaml
features:
  enable_dex_prefilter: false
```
### Automatic Rollback
The system rolls back automatically after a configurable number of consecutive failures:
```yaml
legacy_config:
  auto_rollback_on_failure: true
  rollback_threshold_failures: 10 # After 10 consecutive failures
```
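The threshold counts consecutive failures, so a single success resets it. How the bot implements this is not shown here; the following is a minimal sketch of such a counter under that reading, with `rollbackGuard` and `record` as hypothetical names. It is not safe for concurrent use without extra locking.

```go
package main

import "fmt"

// rollbackGuard tracks consecutive execution failures against a threshold,
// mirroring rollback_threshold_failures from the config above.
type rollbackGuard struct {
	threshold   int
	consecutive int
}

// record notes one execution outcome and reports whether the rollback
// threshold has been reached. Any success resets the streak.
func (g *rollbackGuard) record(success bool) bool {
	if success {
		g.consecutive = 0
		return false
	}
	g.consecutive++
	return g.consecutive >= g.threshold
}

func main() {
	guard := &rollbackGuard{threshold: 10}
	// Nine failures, one success, then ten failures in a row.
	outcomes := append(make([]bool, 9), true)
	outcomes = append(outcomes, make([]bool, 10)...)
	for i, ok := range outcomes {
		if guard.record(ok) {
			fmt.Println("rollback triggered at outcome", i+1)
			break
		}
	}
}
```

On trigger, the caller would switch back to the legacy values, for example by clearing the feature flags described under Feature-Specific Rollback.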
---
## Success Criteria
### Phase 1 Success
- ✅ Reduced stale opportunity execution by >50%
- ✅ Maintained or improved success rate
- ✅ No increase in error rate
- ✅ System stability maintained
### Phase 2 Success
- ✅ 80-90% transaction filtering achieved
- ✅ <1% missed DEX transactions
- ✅ >30% reduction in CPU usage
- ✅ >20% improvement in detection latency
- ✅ No degradation in opportunity detection
### Overall Success
- ✅ Maintained profitability
- ✅ Improved competitive position
- ✅ Reduced resource usage
- ✅ Better alignment with L2 characteristics
- ✅ No breaking changes or downtime
---
## Troubleshooting
### Issue: Increased Opportunity Expiration
**Symptom:** Many opportunities expiring before execution
**Cause:** TTL too short (5s might be aggressive)
**Fix:**
```yaml
arbitrage_optimized:
  opportunity_ttl: "7s" # Increase to 7s (28 blocks)
```
### Issue: Filter Missing Opportunities
**Symptom:** Fewer opportunities detected with filter enabled
**Fix:**
1. Check filtered transaction logs
2. Identify missed DEX addresses or signatures
3. Add to filter configuration
4. Redeploy
### Issue: High CPU Usage with Filter
**Symptom:** CPU usage higher than expected
**Cause:** Inefficient filter lookups
**Fix:**
```yaml
dex_filter:
  cache_lookups: true
  cache_ttl: "5m"
```
---
## Next Steps After Deployment
1. **Week 1:** Deploy Phase 1, monitor for 7 days
2. **Week 2:** If Phase 1 successful, deploy Phase 2
3. **Week 3:** Monitor combined Phase 1+2 performance
4. **Week 4:** Gather data for Phase 3 decision
5. **Month 2+:** Evaluate Timeboost based on competition
---
## Support & Documentation
- **Research Report:** `docs/L2_MEV_BOT_RESEARCH_REPORT.md`
- **Configuration:** `config/arbitrum_optimized.yaml`
- **Scripts:** `scripts/*l2*.sh`
- **Monitoring:** `scripts/watch-l2-metrics.sh`
---
**Status:** ✅ Ready for Phase 1 Deployment
**Risk Level:** 🟢 Low (Non-Breaking Changes)
**Estimated Impact:** 📈 20-30% Performance Improvement