# CRITICAL FIX PLAN: Zero Address Corruption

**Date:** October 23, 2025
**Priority:** P0 - BLOCKS ALL PROFIT
**Estimated Time:** 3-4 hours
**Status:** 🔴 Ready to Implement

---

## 🎯 Problem Summary

**100% of DEX transactions are rejected** due to zero address corruption in token extraction.

**Root Cause:** The "enhanced parser" integration is incomplete. The L2 parser's `extractTokensFromMulticallData()` method **still calls the broken** `calldata.ExtractTokensFromMulticallWithContext()` from multicall.go, which returns zero addresses.

---

## 🔍 The Chain of Failure

### Current (Broken) Flow

```
1. DEX Transaction Detected ✅
   ↓
2. Event Parser calls tokenExtractor.ExtractTokensFromMulticallData() ✅
   ↓
3. L2 Parser's extractTokensFromMulticallData() is called ✅
   ↓
4. ❌ L2 Parser calls calldata.ExtractTokensFromMulticallWithContext()
   ↓
5. ❌ multicall.go's heuristic extraction returns empty addresses
   ↓
6. ❌ Event has Token0=0x000..., Token1=0x000..., PoolAddress=0x000...
   ↓
7. ❌ Event REJECTED (100% rejection rate)
```

### The Smoking Gun

**File:** `pkg/arbitrum/l2_parser.go:1408-1414`

```go
func (p *ArbitrumL2Parser) extractTokensFromMulticallData(params []byte) (token0, token1 string) {
	tokens, err := calldata.ExtractTokensFromMulticallWithContext(params, &calldata.MulticallContext{
		Stage:    "arbitrum.l2_parser.extractTokensFromMulticallData",
		Protocol: "unknown",
	})
	// ^^^ THIS IS THE PROBLEM! Still using broken multicall.go
	// ...
}
```

**The Irony:** The L2 parser has perfectly good extraction methods for specific function signatures:

- ✅ `extractTokensFromSwapExactTokensForTokens()` - WORKS
- ✅ `extractTokensFromExactInputSingle()` - WORKS
- ✅ `extractTokensFromSwapExactETHForTokens()` - WORKS

But it's not using them! Instead, it calls the broken multicall.go code.
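
To make the contrast with heuristic scanning concrete, here is a minimal, self-contained sketch of signature-based routing: the first 4 bytes of the calldata pick a decoder, and anything unknown is reported as such instead of being guessed at. The `knownSelectors` table and the `routeBySelector` helper are hypothetical names introduced for illustration, not code from `l2_parser.go`; the selectors are the commonly published 4-byte IDs for these Uniswap-style functions and should be checked against the ABIs the bot actually targets.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// knownSelectors maps 4-byte function selectors (hex, no 0x prefix) to the
// extractor that would handle them. The extractor names mirror the methods
// listed above; the table itself is an illustrative assumption.
var knownSelectors = map[string]string{
	"38ed1739": "extractTokensFromSwapExactTokensForTokens",
	"414bf389": "extractTokensFromExactInputSingle",
	"7ff36ab5": "extractTokensFromSwapExactETHForTokens",
}

// routeBySelector returns the extractor name for the given calldata, or
// false when the calldata is too short or the selector is unknown.
func routeBySelector(calldata []byte) (string, bool) {
	if len(calldata) < 4 {
		return "", false
	}
	name, ok := knownSelectors[hex.EncodeToString(calldata[:4])]
	return name, ok
}

func main() {
	// Selector for swapExactTokensForTokens followed by one dummy byte.
	calldata := []byte{0x38, 0xed, 0x17, 0x39, 0x00}
	if name, ok := routeBySelector(calldata); ok {
		fmt.Println("route to", name)
	} else {
		fmt.Println("unknown selector, skip instead of guessing")
	}
}
```

The point of the design is that an unknown selector yields an explicit "no decoder" result rather than a zero address that propagates into the event.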
---

## ✅ The Solution

### Strategy: Bypass Broken Multicall.go Entirely

Instead of trying to fix the complex heuristic extraction in multicall.go, we'll make the L2 parser's `extractTokensFromMulticallData()` decode the multicall structure and route to its own working extraction methods.

### Implementation

**File:** `pkg/arbitrum/l2_parser.go`

**Current Broken Method (lines 1408-1438):**

```go
func (p *ArbitrumL2Parser) extractTokensFromMulticallData(params []byte) (token0, token1 string) {
	tokens, err := calldata.ExtractTokensFromMulticallWithContext(params, &calldata.MulticallContext{
		Stage:    "arbitrum.l2_parser.extractTokensFromMulticallData",
		Protocol: "unknown",
	})
	// ...
}
```

**New Working Method:**

```go
func (p *ArbitrumL2Parser) extractTokensFromMulticallData(params []byte) (token0, token1 string) {
	// CRITICAL FIX: Decode multicall structure and route to working extraction methods
	// instead of calling broken multicall.go heuristics
	if len(params) < 32 {
		return "", ""
	}
	paramsLen := uint64(len(params))

	// Multicall format: offset (32 bytes) -> length (32 bytes) + element offsets + element data
	offset := new(big.Int).SetBytes(params[0:32]).Uint64()
	if offset > paramsLen || paramsLen-offset < 32 {
		// The length word itself must fit, otherwise the slice below would panic
		return "", ""
	}

	// Read array length
	arrayLength := new(big.Int).SetBytes(params[offset : offset+32]).Uint64()
	if arrayLength == 0 {
		return "", ""
	}

	// Per the ABI encoding of bytes[], element offsets are relative to the
	// start of the array body (the first word after the length), not to params[0]
	arrayBase := offset + 32

	// Process each call in the multicall
	currentOffset := arrayBase
	for i := uint64(0); i < arrayLength && i < 10; i++ { // Limit to first 10 calls
		if currentOffset+32 > paramsLen {
			break
		}

		// Read call data offset (relative) and convert it to an absolute position
		relOffset := new(big.Int).SetBytes(params[currentOffset : currentOffset+32]).Uint64()
		currentOffset += 32
		if relOffset > paramsLen {
			continue
		}
		callOffset := arrayBase + relOffset
		if callOffset+32 > paramsLen {
			continue
		}

		// Read call data length
		callLength := new(big.Int).SetBytes(params[callOffset : callOffset+32]).Uint64()
		callStart := callOffset + 32
		if callLength > paramsLen-callStart { // overflow-safe bounds check
			continue
		}
		callEnd := callStart + callLength

		// Extract the actual call data
		callData := params[callStart:callEnd]
		if len(callData) < 4 {
			continue
		}

		// Try to extract tokens using our WORKING signature-based methods
		t0, t1, err := p.ExtractTokensFromCalldata(callData)
		if err == nil && t0 != (common.Address{}) && t1 != (common.Address{}) {
			return t0.Hex(), t1.Hex()
		}
	}

	return "", ""
}
```

---

## 📋 Step-by-Step Implementation

### Phase 1: Replace Broken Multicall Extraction (1-2 hours)

1. **Update `pkg/arbitrum/l2_parser.go:extractTokensFromMulticallData()`**
   - Replace the `calldata.ExtractTokensFromMulticallWithContext()` call
   - Implement proper multicall decoding
   - Route to existing working extraction methods
   - Add detailed logging for debugging

2. **Add Enhanced Logging**

   ```go
   p.logger.Debug("Multicall extraction attempt",
       "array_length", arrayLength,
       "call_index", i,
       "function_sig", hex.EncodeToString(callData[:4]))
   ```

3. **Add Universal Router Support**
   - UniversalRouter uses a different multicall format
   - Add separate handling for function signature `0x3593564c` (execute)
   - Decode V3_SWAP_EXACT_IN, V2_SWAP_EXACT_IN commands

### Phase 2: Test & Validate (30 minutes)

1. **Unit Test**

   ```bash
   # Test with real multicall data from logs
   go test -v ./pkg/arbitrum -run TestExtractTokensFromMulticall
   ```

2. **Integration Test** (1-minute run)

   ```bash
   make build
   timeout 60 ./bin/mev-bot start
   # Expected: >50% success rate (not 0%)
   ```

3. **Validation Metrics**
   - Success rate > 70%
   - Zero address rejections < 30%
   - Valid Token0/Token1/PoolAddress in logs

### Phase 3: Add UniversalRouter Support (1 hour)

UniversalRouter is the most common protocol (~60% of transactions) and uses a unique command-based format.
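
As a rough illustration of that command-based format, the sketch below walks the `commands` byte string and finds the first swap command, whose position is also the index of its entry in `inputs`. The mask and command IDs follow Uniswap's published `Commands.sol` as I understand it and should be treated as assumptions to verify against the router version deployed on Arbitrum; `firstSwapCommand` and `commandNames` are hypothetical names, not part of the existing parser.

```go
package main

import "fmt"

// Assumed values from Uniswap's Commands.sol: the top bit marks a command
// as allowed to revert, and the low bits select the command type.
const (
	flagAllowRevert = 0x80
	commandTypeMask = 0x3f
)

// commandNames covers only the commands this plan cares about.
var commandNames = map[byte]string{
	0x00: "V3_SWAP_EXACT_IN",
	0x01: "V3_SWAP_EXACT_OUT",
	0x08: "V2_SWAP_EXACT_IN",
	0x09: "V2_SWAP_EXACT_OUT",
	0x0b: "WRAP_ETH",
	0x0c: "UNWRAP_WETH",
}

// firstSwapCommand scans the commands and returns the index and masked type
// of the first swap command, so that inputs[index] can be decoded next.
func firstSwapCommand(commands []byte) (index int, cmd byte, ok bool) {
	for i, raw := range commands {
		c := raw & commandTypeMask // strip flag bits such as flagAllowRevert
		switch c {
		case 0x00, 0x01, 0x08, 0x09:
			return i, c, true
		}
	}
	return 0, 0, false
}

func main() {
	// WRAP_ETH followed by V3_SWAP_EXACT_IN with the allow-revert flag set.
	commands := []byte{0x0b, 0x00 | flagAllowRevert}
	if i, c, ok := firstSwapCommand(commands); ok {
		fmt.Printf("decode inputs[%d] as %s\n", i, commandNames[c])
	}
}
```

Scanning for the first swap rather than reading only `commands[0]` matters when a transaction starts with a non-swap command such as a wrap step; whether that case is common in the bot's traffic is something the logs would need to confirm.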
**File:** `pkg/arbitrum/l2_parser.go`

**Add Method:**

```go
// extractTokensFromUniversalRouter decodes UniversalRouter execute() commands
func (p *ArbitrumL2Parser) extractTokensFromUniversalRouter(params []byte) (token0, token1 common.Address, err error) {
	// UniversalRouter execute format:
	// bytes commands, bytes[] inputs, uint256 deadline
	if len(params) < 96 {
		return common.Address{}, common.Address{}, fmt.Errorf("params too short for universal router")
	}
	paramsLen := uint64(len(params))

	// Parse commands offset (first 32 bytes)
	commandsOffset := new(big.Int).SetBytes(params[0:32]).Uint64()

	// Parse inputs offset (second 32 bytes)
	inputsOffset := new(big.Int).SetBytes(params[32:64]).Uint64()

	// Each offset must leave room for the 32-byte length word that follows it
	if commandsOffset > paramsLen-32 || inputsOffset > paramsLen-32 {
		return common.Address{}, common.Address{}, fmt.Errorf("invalid offsets")
	}

	// Read commands length
	commandsLength := new(big.Int).SetBytes(params[commandsOffset : commandsOffset+32]).Uint64()
	commandsStart := commandsOffset + 32

	// Read first command (V3_SWAP_EXACT_IN = 0x00, V2_SWAP_EXACT_IN = 0x08)
	if commandsStart >= paramsLen || commandsLength == 0 {
		return common.Address{}, common.Address{}, fmt.Errorf("no commands")
	}
	firstCommand := params[commandsStart]

	// Read inputs array
	inputsLength := new(big.Int).SetBytes(params[inputsOffset : inputsOffset+32]).Uint64()
	if inputsLength == 0 {
		return common.Address{}, common.Address{}, fmt.Errorf("no inputs")
	}

	// Read first input offset and data. Per the ABI encoding of bytes[],
	// element offsets are relative to the word right after the array length.
	firstInputOffset := inputsOffset + 32
	if firstInputOffset+32 > paramsLen {
		return common.Address{}, common.Address{}, fmt.Errorf("inputs array out of bounds")
	}
	relInputOffset := new(big.Int).SetBytes(params[firstInputOffset : firstInputOffset+32]).Uint64()
	if relInputOffset > paramsLen || firstInputOffset+relInputOffset+32 > paramsLen {
		return common.Address{}, common.Address{}, fmt.Errorf("invalid input offset")
	}
	inputDataOffset := firstInputOffset + relInputOffset

	inputDataLength := new(big.Int).SetBytes(params[inputDataOffset : inputDataOffset+32]).Uint64()
	inputDataStart := inputDataOffset + 32
	inputDataEnd := inputDataStart + inputDataLength
	if inputDataLength > paramsLen-inputDataStart { // overflow-safe bounds check
		return common.Address{}, common.Address{},
			fmt.Errorf("input data out of bounds")
	}

	inputData := params[inputDataStart:inputDataEnd]
	inputLen := uint64(len(inputData))

	// Decode based on command type
	switch firstCommand {
	case 0x00: // V3_SWAP_EXACT_IN
		// Format: recipient(addr), amountIn(uint256), amountOutMin(uint256), path(bytes), payerIsUser(bool)
		if inputLen >= 160 {
			// The 4th head slot (bytes 96-128) holds the offset of the path
			pathOffset := new(big.Int).SetBytes(inputData[96:128]).Uint64()
			if pathOffset <= inputLen-32 {
				pathLength := new(big.Int).SetBytes(inputData[pathOffset : pathOffset+32]).Uint64()
				pathStart := pathOffset + 32

				// V3 path format: token0(20 bytes) + fee(3 bytes) + token1(20 bytes)
				// Only the first hop is decoded; multi-hop paths continue with more fee+token segments
				if pathLength >= 43 && pathStart+43 <= inputLen {
					token0 = common.BytesToAddress(inputData[pathStart : pathStart+20])
					token1 = common.BytesToAddress(inputData[pathStart+23 : pathStart+43])
					return token0, token1, nil
				}
			}
		}

	case 0x08: // V2_SWAP_EXACT_IN
		// Format: recipient(addr), amountIn(uint256), amountOutMin(uint256), path(addr[]), payerIsUser(bool)
		if inputLen >= 128 {
			// Path array offset is at position 96 (4th parameter)
			pathOffset := new(big.Int).SetBytes(inputData[96:128]).Uint64()
			if pathOffset <= inputLen-32 {
				pathArrayLength := new(big.Int).SetBytes(inputData[pathOffset : pathOffset+32]).Uint64()

				// Require at least two tokens and that every 32-byte element fits in inputData
				if pathArrayLength >= 2 && pathArrayLength <= (inputLen-pathOffset-32)/32 {
					// First token
					token0 = common.BytesToAddress(inputData[pathOffset+32 : pathOffset+64])

					// Last token
					lastTokenOffset := pathOffset + 32 + (pathArrayLength-1)*32
					token1 = common.BytesToAddress(inputData[lastTokenOffset : lastTokenOffset+32])
					return token0, token1, nil
				}
			}
		}
	}

	return common.Address{}, common.Address{}, fmt.Errorf("unsupported universal router command: 0x%02x", firstCommand)
}
```

**Update ExtractTokensFromCalldata to support UniversalRouter:**

```go
func (p *ArbitrumL2Parser) ExtractTokensFromCalldata(calldata []byte) (token0, token1 common.Address, err error) {
	if len(calldata) < 4 {
		return common.Address{}, common.Address{},
			fmt.Errorf("calldata too short")
	}

	functionSignature := hex.EncodeToString(calldata[:4])

	switch functionSignature {
	case "3593564c": // execute (UniversalRouter)
		return p.extractTokensFromUniversalRouter(calldata[4:])
	case "38ed1739": // swapExactTokensForTokens
		return p.extractTokensFromSwapExactTokensForTokens(calldata[4:])
	// ... rest of cases
	}
}
```

### Phase 4: Comprehensive Testing (30 minutes)

1. **5-Minute Production Run**

   ```bash
   make build
   timeout 300 ./bin/mev-bot start
   ```

2. **Expected Results**
   - Success rate: 80-90% (up from 0%)
   - Valid events: ~120-150 per minute
   - Arbitrage opportunities: 1-5 per minute
   - Zero-address rejections: < 20%

3. **Log Analysis**

   ```bash
   # Count successes
   successes=$(grep -c "Enhanced parsing success" logs/mev_bot.log)

   # Count rejections
   rejections=$(grep -c "REJECTED: Event with zero PoolAddress" logs/mev_bot.log)

   # Calculate success rate, taken here as successes / (successes + rejections)
   # Should be > 80%
   awk -v s="$successes" -v r="$rejections" \
     'BEGIN { if (s + r > 0) printf "%.1f%%\n", 100 * s / (s + r) }'
   ```

---

## 🔧 Additional Fixes Needed

### 1. Add Pool Address Discovery

Currently, even with correct token extraction, PoolAddress is still zero because we're not querying the actual pool contracts. Because the rejection log line is `REJECTED: Event with zero PoolAddress`, token extraction alone is unlikely to bring the rejection rate down; this lookup appears to be required to hit the success-rate targets.

**Solution:** Add pool address lookup after token extraction:

```go
// In event parser after successful token extraction
if token0 != (common.Address{}) && token1 != (common.Address{}) {
	// Query factory to get pool address
	poolAddr := p.getPoolAddress(token0, token1, protocol)
	event.PoolAddress = poolAddr
}
```

### 2. Fix Event Creation Flow

**File:** `pkg/events/parser.go`

The event creation needs to properly use extracted tokens:

```go
event := &Event{
	Type:            Swap,
	Protocol:        protocol,
	PoolAddress:     poolAddress, // ← Need to populate this
	Token0:          token0,      // ← These come from extraction
	Token1:          token1,      // ← These come from extraction
	TransactionHash: txHash,
	BlockNumber:     blockNumber,
	Timestamp:       timestamp,
}
```

---

## 📊 Success Metrics

### Before Fix

- ❌ Success Rate: 0.00%
- ❌ Valid Events: 0/minute
- ❌ Opportunities: 0/minute
- ❌ Revenue: $0/day

### After Fix (Expected)

- ✅ Success Rate: 80-90%
- ✅ Valid Events: 120-150/minute
- ✅ Opportunities: 1-5/minute
- ✅ Revenue: $100-1000/day (with execution)

---

## ⚠️ Risks & Mitigation

### Risk 1: Complex Multicall Formats

**Impact:** Some complex multicalls may still fail
**Mitigation:** Add fallback to heuristic for unknown formats
**Acceptable:** 10-20% failure rate for edge cases

### Risk 2: UniversalRouter Command Variants

**Impact:** Some UniversalRouter commands not supported
**Mitigation:** Add logging for unsupported commands, implement incrementally
**Acceptable:** Cover 80%+ of commands (V3_SWAP, V2_SWAP, WRAP_ETH)

### Risk 3: Protocol-Specific Differences

**Impact:** Each DEX may have slight format variations
**Mitigation:** Test against real transactions from logs
**Acceptable:** 90%+ coverage of major DEXs (Uniswap, SushiSwap, TraderJoe, Camelot)

---

## 🚀 Deployment Plan

### Step 1: Implement Core Fix (2 hours)

- Replace multicall extraction in L2 parser
- Add comprehensive logging
- Build and initial test

### Step 2: Add UniversalRouter Support (1 hour)

- Implement execute() decoder
- Handle V3_SWAP_EXACT_IN and V2_SWAP_EXACT_IN
- Test with real UniversalRouter transactions

### Step 3: Validate (30 minutes)

- Run 5-minute production test
- Analyze success rate (target: >80%)
- Check for any new error patterns

### Step 4: Commit & Document (30 minutes)

- Commit changes with detailed message
- Update TODO_AUDIT_FIX.md
- Document any remaining issues

---

## 📝 Files to Modify

1. **`pkg/arbitrum/l2_parser.go`** (PRIMARY)
   - Replace extractTokensFromMulticallData() implementation
   - Add extractTokensFromUniversalRouter() method
   - Update ExtractTokensFromCalldata() with UniversalRouter case
   - Estimated changes: ~150 lines

2. **`pkg/events/parser.go`** (SECONDARY - if needed)
   - Verify token extractor is being called correctly
   - Add pool address lookup after extraction
   - Estimated changes: ~20 lines

3. **`pkg/arbitrum/l2_parser_test.go`** (NEW)
   - Add unit tests for multicall extraction
   - Test UniversalRouter decoding
   - Test with real transaction data from logs
   - Estimated: ~200 lines of tests

---

## ✅ Definition of Done

- [ ] extractTokensFromMulticallData() no longer calls broken multicall.go
- [ ] UniversalRouter execute() transactions are decoded correctly
- [ ] Success rate > 80% in 5-minute production run
- [ ] Zero address rejections < 20%
- [ ] At least 1 arbitrage opportunity detected per minute
- [ ] All changes committed with comprehensive message
- [ ] Documentation updated with findings

---

**Next Steps:** Begin implementation of Phase 1
**Estimated Total Time:** 3-4 hours
**Priority:** P0 - Must fix before any profit can be generated
**Status:** Ready to implement