Critical Fix Plan - November 1, 2025
Issues Identified & Solutions
🔴 ISSUE 1: Multi-Hop Scanner Finding 0 Paths
Root Cause:
The DFS search in multihop.go:208 calls GetAdjacentTokens(currentToken) but if the trigger token isn't in the pre-populated token graph, it returns an empty map and the search never starts.
Evidence:
[INFO] 📥 Received bridge arbitrage opportunity id=arb_1762011082_0xaf88d065 path_length=4 pools=0
[INFO] Multi-hop arbitrage scan completed in 99.983µs: found 0 profitable paths out of 0 total paths
The issue: the scan reports 0 total paths.
The Flow:
- Opportunity comes in with start token (e.g., USDC 0xaf88d065...)
- ScanForArbitrage called with this token
- updateTokenGraph populates 8 hard-coded pools
- DFS starts: Get adjacent(0xaf88d065...) - Token graph HAS this token, but...
- BUG: The DFS expects to find cycles but starts at depth=0 with current==target
- On first iteration (depth=0), it skips the "found cycle" check (requires depth>1)
- Gets adjacent tokens correctly
- But something else is wrong...
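The depth guard described in the flow above can be checked in isolation. The following is a minimal sketch, not the code in multihop.go: the edge type, addPool, and findCycles are illustrative names, and tokens and pools are plain strings. It shows that skipping the cycle check at depth 0 is expected behaviour, and that a start token missing from the graph simply yields zero cycles.

```go
package main

import "fmt"

// edge is a hypothetical stand-in for one pool connecting two tokens.
type edge struct {
	pool string // pool identifier
	to   string // token reached by swapping through the pool
}

// addPool registers a pool as a two-way edge between tokens a and b.
func addPool(g map[string][]edge, pool, a, b string) {
	g[a] = append(g[a], edge{pool, b})
	g[b] = append(g[b], edge{pool, a})
}

// findCycles returns every path that leaves start and returns to it in
// at most maxHops swaps (and at least 2) without reusing a pool.
func findCycles(g map[string][]edge, start string, maxHops int) [][]string {
	var cycles [][]string
	used := map[string]bool{}
	var dfs func(cur string, path []string)
	dfs = func(cur string, path []string) {
		depth := len(path) - 1 // hops taken so far
		// At depth 0 cur == start, but that is the starting point,
		// not a completed cycle, so the guard requires depth > 1.
		if depth > 1 && cur == start {
			cycles = append(cycles, append([]string(nil), path...))
			return
		}
		if depth == maxHops {
			return
		}
		for _, e := range g[cur] { // unknown token: g[cur] is empty
			if used[e.pool] {
				continue
			}
			used[e.pool] = true
			dfs(e.to, append(path, e.to))
			used[e.pool] = false
		}
	}
	dfs(start, []string{start})
	return cycles
}

func main() {
	g := map[string][]edge{}
	addPool(g, "p1", "USDC", "WETH")
	addPool(g, "p2", "WETH", "LINK")
	addPool(g, "p3", "LINK", "USDC")

	for _, c := range findCycles(g, "USDC", 4) {
		fmt.Println(c)
	}
	fmt.Println("unknown start token:", len(findCycles(g, "DAI", 4)), "cycles")
}
```

If a search like this returns cycles on the real graph while the scanner still logs 0 paths, the traversal itself is probably fine and the loss happens later.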
Actual Root Cause (Deeper): Looking at the logic more carefully:
// Line 199: If we're back at the start token and have made at least 2 hops
if depth > 1 && currentToken == targetToken {
	path := mhs.createArbitragePath(currentTokens, currentPath, amount)
	...
}
The issue is: The DFS is working, but createArbitragePath is returning nil for all paths!
Looking at createArbitragePath (lines 238-260):
func (mhs *MultiHopScanner) createArbitragePath(...) *ArbitragePath {
	if len(tokens) < 3 || len(pools) != len(tokens)-1 {
		return nil // ← Validation fail
	}
	// Calculate swap outputs
	for i, pool := range pools {
		outputAmount, err := mhs.calculateSwapOutput(...)
		if err != nil {
			mhs.logger.Debug(...) // ← Silent failure!
			return nil
		}
	}
}
The Real Problem:
- DFS finds paths (e.g., USDC → WETH → LINK → USDC)
- createArbitragePath is called
- calculateSwapOutput tries to get pool reserves
- But the pools have placeholder liquidity values! (line 485: uint256.NewInt(1000000000000000000))
- Or calculateSwapOutput fails due to missing SqrtPriceX96 data
- Path creation fails silently
- Returns 0 paths
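To see why a missing SqrtPriceX96 makes a path unusable, recall that in Uniswap V3-style pools the raw token1/token0 price is (sqrtPriceX96 / 2^96)^2. The sketch below assumes the pools follow that convention. rawPrice, twoTo96, and errNoPrice are illustrative names, not functions from multihop.go. It returns an explicit error for nil or zero input instead of failing silently.

```go
package main

import (
	"errors"
	"fmt"
	"math/big"
)

// twoTo96 is the Q64.96 fixed-point scale used by sqrtPriceX96.
var twoTo96 = new(big.Int).Lsh(big.NewInt(1), 96)

// errNoPrice is returned instead of a silent nil when pool state is missing.
var errNoPrice = errors.New("pool has no sqrtPriceX96 (nil or zero)")

// rawPrice converts sqrtPriceX96 into the raw token1/token0 price,
// (sqrtPriceX96 / 2^96)^2. Token decimals are not applied here.
func rawPrice(sqrtPriceX96 *big.Int) (*big.Float, error) {
	if sqrtPriceX96 == nil || sqrtPriceX96.Sign() <= 0 {
		return nil, errNoPrice
	}
	ratio := new(big.Float).Quo(
		new(big.Float).SetInt(sqrtPriceX96),
		new(big.Float).SetInt(twoTo96),
	)
	return new(big.Float).Mul(ratio, ratio), nil
}

func main() {
	inputs := []*big.Int{twoTo96, new(big.Int).Lsh(twoTo96, 1), nil}
	for _, v := range inputs {
		p, err := rawPrice(v)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("raw price:", p.Text('g', 10))
	}
}
```

A pool whose state was never fetched would take the error branch here, which is the case the DEBUG logging in Fix 1 is meant to surface.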
🔴 ISSUE 2: Security Manager Disabled
Status: CRITICAL - Running without transaction validation
Location: cmd/mev-bot/main.go:141
Fix: Uncomment security manager initialization
🔴 ISSUE 3: Rate Limiting (2,699 errors)
Root Cause: Single RPC endpoint being overwhelmed
Fix: Enable multi-provider failover from providers_runtime.yaml
🔴 ISSUE 4: Port Binding Conflicts (53 errors)
Root Cause: Multiple instances or improper cleanup
Fix: Add SO_REUSEADDR and pre-flight port checks
🔴 ISSUE 5: Context Cancellation (71 errors)
Root Cause: Improper shutdown handling
Fix: Add graceful shutdown with proper context handling
Fix Implementation Plan
Fix 1: Multi-Hop Scanner - Add Real Pool Data Fetching
File: pkg/arbitrage/multihop.go
Changes:
- Add DEBUG logging to createArbitragePath to show why paths fail
- Fetch real pool data (sqrtPriceX96, liquidity) from RPC in updateTokenGraph
- Add fallback: if RPC fetch fails, use DataFetcher or skip pool
- Add metrics to track: paths_found, paths_validated, paths_rejected
Code Addition:
// In createArbitragePath, add before each return nil
// (reason is a short string describing which check failed):
mhs.logger.Debug(fmt.Sprintf("❌ Path validation failed: tokens=%d pools=%d reason=%s",
	len(tokens), len(pools), reason))
// In updateTokenGraph, fetch real data:
for _, pool := range pools {
	// Fetch real pool state from RPC
	slot0, err := mhs.fetchPoolSlot0(ctx, pool.Address)
	if err != nil {
		mhs.logger.Warn(fmt.Sprintf("Failed to fetch pool state for %s: %v", pool.Address, err))
		continue // Skip this pool
	}
	pool.SqrtPriceX96 = slot0.SqrtPriceX96
	pool.Liquidity = slot0.Liquidity
	mhs.addPoolToGraph(pool)
}
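The metrics listed under Changes (paths_found, paths_validated, paths_rejected) have no code in this plan yet. One possible shape is sketched below. pathStats and its methods are hypothetical names, and the rejection reasons are example strings rather than values taken from the scanner.

```go
package main

import (
	"fmt"
	"sync"
)

// pathStats is a hypothetical counter set for the three metrics above.
type pathStats struct {
	mu        sync.Mutex
	found     int            // paths_found: every candidate the DFS produced
	validated int            // paths_validated: candidates that passed all checks
	rejected  map[string]int // paths_rejected, broken down by reason
}

func newPathStats() *pathStats {
	return &pathStats{rejected: map[string]int{}}
}

// record counts one candidate path. An empty reason means it was validated.
func (s *pathStats) record(reason string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.found++
	if reason == "" {
		s.validated++
		return
	}
	s.rejected[reason]++
}

// summary renders the counters as a single log-friendly line.
func (s *pathStats) summary() string {
	s.mu.Lock()
	defer s.mu.Unlock()
	return fmt.Sprintf("paths_found=%d paths_validated=%d paths_rejected=%v",
		s.found, s.validated, s.rejected)
}

func main() {
	stats := newPathStats()
	stats.record("")                  // a path that passed validation
	stats.record("swap_output_error") // example rejection reasons
	stats.record("swap_output_error")
	stats.record("length_mismatch")
	fmt.Println(stats.summary())
}
```

Keeping the reason as the map key means the same string can be reused in the DEBUG log line, so the counters and the logs stay in step.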
Fix 2: Security Manager
File: cmd/mev-bot/main.go
Change: Uncomment lines 143-180 to re-enable security manager
Fix 3: Multi-Provider RPC
File: cmd/mev-bot/main.go or provider initialization
Change: Enable provider rotation with fallback
// Add after line 132
if providerConfigPath := os.Getenv("PROVIDER_CONFIG_PATH"); providerConfigPath != "" {
	log.Info(fmt.Sprintf("Loading multi-provider configuration from: %s", providerConfigPath))
	// Enable provider manager with failover
}
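The "provider manager with failover" placeholder above, combined with the exponential backoff listed in Phase 3, could look roughly like the sketch below. It is not the bot's provider manager: callFn, callWithFailover, and errRateLimited are hypothetical names, and the endpoints are plain strings instead of entries loaded from providers_runtime.yaml.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errRateLimited stands in for an HTTP 429 from an RPC provider.
var errRateLimited = errors.New("rate limited")

// callFn is a hypothetical stand-in for one RPC request to one endpoint.
type callFn func(endpoint string) error

// callWithFailover tries every endpoint in order. If a whole round fails it
// sleeps, doubles the delay, and tries again, for at most rounds rounds.
func callWithFailover(endpoints []string, rounds int, base time.Duration, call callFn) error {
	if len(endpoints) == 0 || rounds < 1 {
		return errors.New("need at least one endpoint and one round")
	}
	var lastErr error
	delay := base
	for r := 0; r < rounds; r++ {
		if r > 0 {
			time.Sleep(delay) // back off before retrying the full list
			delay *= 2
		}
		for _, ep := range endpoints {
			if lastErr = call(ep); lastErr == nil {
				return nil
			}
		}
	}
	return fmt.Errorf("all %d endpoints failed after %d rounds: %w",
		len(endpoints), rounds, lastErr)
}

func main() {
	calls := 0
	err := callWithFailover([]string{"primary", "backup"}, 3, 10*time.Millisecond,
		func(ep string) error {
			calls++
			if ep == "primary" {
				return errRateLimited // primary is always throttled in this demo
			}
			return nil
		})
	fmt.Println("err:", err, "calls:", calls)
}
```

A production version would likely accept a context.Context so the backoff sleep can be interrupted during shutdown. It is left out here to keep the sketch short.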
Fix 4: Port Binding
File: pkg/metrics/server.go (or equivalent)
Change:
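listener, err := net.Listen("tcp", fmt.Sprintf(":%d", port))
// Change to (Unix-only; requires the "syscall" import):
// Note: on Unix, Go's net package already sets SO_REUSEADDR on listening
// sockets by default, so this mainly makes the intent explicit.
lc := net.ListenConfig{
	Control: func(network, address string, c syscall.RawConn) error {
		var sockErr error
		// Propagate the setsockopt error instead of discarding it
		if err := c.Control(func(fd uintptr) {
			sockErr = syscall.SetsockoptInt(int(fd), syscall.SOL_SOCKET, syscall.SO_REUSEADDR, 1)
		}); err != nil {
			return err
		}
		return sockErr
	},
}
listener, err := lc.Listen(ctx, "tcp", fmt.Sprintf(":%d", port))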
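Issue 4 also calls for pre-flight port checks, which the snippet above does not cover. A minimal sketch of such a check follows. checkPortFree is a hypothetical helper, and the loopback address is used only for the demonstration.

```go
package main

import (
	"fmt"
	"net"
)

// checkPortFree reports whether addr can be bound right now by binding it
// briefly and releasing it. A nil result is only a hint: another process
// could still take the port before the real listener starts.
func checkPortFree(addr string) error {
	l, err := net.Listen("tcp", addr)
	if err != nil {
		return fmt.Errorf("pre-flight check: %s is not available: %w", addr, err)
	}
	return l.Close()
}

func main() {
	// Hold a port on purpose to show what the check reports.
	held, err := net.Listen("tcp", "127.0.0.1:0") // port 0 lets the OS pick
	if err != nil {
		fmt.Println("could not open demo listener:", err)
		return
	}
	addr := held.Addr().String()

	fmt.Println("while held:", checkPortFree(addr))
	held.Close()
	fmt.Println("after close:", checkPortFree(addr))
}
```

Running the check at startup, before any subsystem is launched, turns a late bind failure into an early and readable error. It does not replace handling the error from the real Listen call.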
Fix 5: Graceful Shutdown
File: cmd/mev-bot/main.go
Change: Add to shutdown handler (after line 400+):
// Create shutdown context with timeout
shutdownCtx, shutdownCancel := context.WithTimeout(context.Background(), 30*time.Second)
defer shutdownCancel()
// Cancel main context
cancel()
// Wait for goroutines to finish with timeout
done := make(chan struct{})
go func() {
	// Wait for all subsystems
	wg.Wait()
	close(done)
}()
select {
case <-done:
	log.Info("Graceful shutdown completed")
case <-shutdownCtx.Done():
	log.Warn("Shutdown timeout exceeded, forcing exit")
}
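The shutdown handler above only completes in time if every worker returns once the main context is cancelled. The sketch below shows the worker side of that contract under simplified assumptions. supervisor, start, and stop are hypothetical names, and the ticker stands in for real work.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// supervisor is a hypothetical wrapper that owns the root context and the
// WaitGroup shared by all background workers.
type supervisor struct {
	ctx    context.Context
	cancel context.CancelFunc
	wg     sync.WaitGroup
}

func newSupervisor() *supervisor {
	ctx, cancel := context.WithCancel(context.Background())
	return &supervisor{ctx: ctx, cancel: cancel}
}

// start launches n workers that tick until the root context is cancelled.
func (s *supervisor) start(n int) {
	for i := 0; i < n; i++ {
		s.wg.Add(1)
		go func() {
			defer s.wg.Done()
			ticker := time.NewTicker(10 * time.Millisecond)
			defer ticker.Stop()
			for {
				select {
				case <-s.ctx.Done():
					return // normal shutdown, not an error
				case <-ticker.C:
					// one unit of work would go here
				}
			}
		}()
	}
}

// stop cancels the workers and reports whether they all exited in time.
func (s *supervisor) stop(timeout time.Duration) bool {
	s.cancel()
	done := make(chan struct{})
	go func() {
		s.wg.Wait()
		close(done)
	}()
	select {
	case <-done:
		return true
	case <-time.After(timeout):
		return false
	}
}

func main() {
	s := newSupervisor()
	s.start(3)
	time.Sleep(25 * time.Millisecond)
	fmt.Println("graceful:", s.stop(time.Second))
}
```

Workers that treat ctx.Done() as a normal exit, rather than logging context.Canceled as an error, may also account for part of the 71 context cancellation entries. That is an assumption to verify against the logs.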
Implementation Priority
Phase 1: Critical Security (30 minutes)
- ✅ Re-enable security manager
- ✅ Add port reuse socket option
- ✅ Add graceful shutdown
Phase 2: Multi-Hop Scanner Fix (1-2 hours)
- ✅ Add detailed DEBUG logging to identify failure point
- ✅ Implement real pool data fetching in updateTokenGraph
- ✅ Add reserve cache integration
- ✅ Test with live data
Phase 3: RPC Optimization (1 hour)
- ✅ Enable multi-provider rotation
- ✅ Add exponential backoff
- ✅ Re-enable DataFetcher for batching
Phase 4: Testing & Validation (1 hour)
- ✅ Run bot for 10 minutes
- ✅ Verify no rate limiting errors
- ✅ Verify multi-hop scanner finds paths
- ✅ Verify opportunities are executed
- ✅ Check all metrics
Expected Outcomes
Before Fixes:
- ❌ 0 profitable paths found
- ❌ 2,699 rate limit errors
- ❌ Security disabled
- ❌ 53 port conflicts
- ❌ 71 context cancellations
After Fixes:
- ✅ 5-20 profitable paths per opportunity
- ✅ < 10 rate limit errors (99.6% reduction)
- ✅ Security enabled
- ✅ 0 port conflicts
- ✅ 0 context cancellations
- ✅ Actual arbitrage executions!
Testing Commands
# Step 1: Build with fixes
make clean && make build
# Step 2: Test startup (should see no errors)
timeout 30 ./mev-bot start 2>&1 | tee test_output.log
# Step 3: Check for critical errors
grep -E "ERROR|FATAL|panic" test_output.log | wc -l # Should be 0
# Step 4: Check multi-hop scanner
grep "profitable paths" test_output.log | tail -5 # Should show > 0 paths
# Step 5: Full run (2 minutes)
timeout 120 ./mev-bot start 2>&1 | tee full_test.log
# Step 6: Analyze results
./scripts/log-manager.sh analyze
Rollback Plan
If fixes cause issues:
git stash # Stash changes
git checkout 0b1c7bb # Return to last known good commit
make build && ./mev-bot start
Success Criteria
- Security manager enabled
- Multi-hop scanner finds > 0 paths
- Rate limit errors < 1% of previous
- No port binding errors
- No context cancellation errors
- At least 1 arbitrage execution attempt per minute
- Health score > 95/100
Next Step: Implement Phase 1 fixes (security critical)