# MEV Bot Codebase Architecture Analysis

## Executive Summary

The MEV (Maximal Extractable Value) bot is a sophisticated arbitrage detection and execution system built in Go for the Arbitrum blockchain. It implements a modular, multi-layered architecture with clear separation of concerns across transaction monitoring, opportunity detection, validation, and execution components.

**Current Status**: The system is operational with comprehensive MEV detection capabilities, supporting multiple DEX protocols with advanced mathematical optimization and real-time market monitoring.

---

## 1. CORE WORKFLOW - Transaction to Execution Flow

### High-Level Execution Path

```
┌─────────────────────────────────────────────────────────────────┐
│ 1. ARBITRUM SEQUENCER MONITORING (pkg/monitor/concurrent.go)    │
│    └─ Real-time block polling (3-second intervals)              │
│    └─ L2 transaction parsing via ArbitrumL2Parser               │
│    └─ DEX transaction detection (function signature matching)   │
└────────────────────────┬────────────────────────────────────────┘
                         │
┌────────────────────────▼────────────────────────────────────────┐
│ 2. EVENT EXTRACTION & ENRICHMENT                                │
│    └─ Transaction receipt parsing                               │
│    └─ DEX event signature detection (Swap, Mint, Burn)          │
│    └─ Token address extraction from calldata                    │
│    └─ Event validation (zero-address check)                     │
└────────────────────────┬────────────────────────────────────────┘
                         │
┌────────────────────────▼────────────────────────────────────────┐
│ 3. SCANNER SUBMISSION & ROUTING (pkg/scanner/concurrent.go)     │
│    └─ Worker pool distribution (10 concurrent workers)          │
│    └─ Pool address validation                                   │
│    └─ Duplicate address detection                               │
│    └─ Suspicious address filtering                              │
└────────────────────────┬────────────────────────────────────────┘
                         │
┌────────────────────────▼────────────────────────────────────────┐
│ 4. MARKET ANALYSIS (pkg/scanner/market/scanner.go)              │
│    └─ Swap event analysis                                       │
│    └─ Liquidity event analysis                                  │
│    └─ Pool state tracking                                       │
│    └─ Reserve cache invalidation (event-driven)                 │
└────────────────────────┬────────────────────────────────────────┘
                         │
┌────────────────────────▼────────────────────────────────────────┐
│ 5. ARBITRAGE DETECTION (pkg/arbitrage/service.go)               │
│    └─ Multi-hop path scanning                                   │
│    └─ Opportunity ranking and filtering                         │
│    └─ Profit calculation and ROI assessment                     │
│    └─ Confidence scoring                                        │
└────────────────────────┬────────────────────────────────────────┘
                         │
┌────────────────────────▼────────────────────────────────────────┐
│ 6. VALIDATION LAYER (pkg/validation/*)                          │
│    └─ Price impact validation                                   │
│    └─ Gas cost estimation                                       │
│    └─ Slippage tolerance checking                               │
│    └─ Profitability verification                                │
└────────────────────────┬────────────────────────────────────────┘
                         │
┌────────────────────────▼────────────────────────────────────────┐
│ 7. EXECUTION ENGINE (pkg/execution/executor.go)                 │
│    └─ Flash loan coordination (Aave/Uniswap/Balancer)           │
│    └─ Transaction simulation (if enabled)                       │
│    └─ On-chain execution                                        │
│    └─ Result tracking and metrics                               │
└─────────────────────────────────────────────────────────────────┘
```

---

## 2. KEY COMPONENTS & RESPONSIBILITIES

### 2.1 Transaction Monitoring Layer

**Component**: `ArbitrumMonitor` (pkg/monitor/concurrent.go)

**Responsibilities**:
- Establishes persistent connection to Arbitrum sequencer
- Polls latest blocks every 3 seconds
- Parses DEX transactions using ArbitrumL2Parser
- Extracts transaction calldata and function signatures
- Subscribes to DEX contract events in real-time

**Critical Features**:
- Connection health checker (30-second intervals)
- Automatic RPC failover via ConnectionManager
- Rate limiting (configurable RPS/burst)
- Transaction receipt parsing for swap event extraction

**Data Structure**:
```go
type DEXTransaction struct {
    Hash         string
    From         string
    To           string
    ContractName string
    FunctionName string
    FunctionSig  string
    Protocol     string
    InputData    []byte
    BlockNumber  uint64
}
```

---

### 2.2 Event Parsing & Extraction

**Component**: `EventParser` (pkg/events/event_parser.go)

**Responsibilities**:
- Parses transaction receipts into typed events
- Extracts token addresses from transaction calldata
- Validates event data integrity
- Creates enriched event objects with token metadata

**Critical Methods**:
- `ParseTransactionReceipt()` - Extracts events from receipt logs
- `ParseTransactionReceiptWithTx()` - Enhanced parsing with calldata analysis
- Token extraction via contract calls (token0/token1 functions)

**Supported Events**:
- Swap (Uniswap V2/V3, SushiSwap, Curve, etc.)
- AddLiquidity (mint events)
- RemoveLiquidity (burn events)
- NewPool

---

### 2.3 Market Scanner & Event Processing

**Component**: `Scanner` (pkg/scanner/concurrent.go)

**Responsibilities**:
- Manages worker pool for concurrent event processing
- Validates event data before processing
- Routes events to appropriate analyzers
- Implements cache invalidation strategy

**Architecture**:
```
Worker Pool (10 workers)
  ├─ Worker 0: [Job Channel] ──→ Process Event
  ├─ Worker 1: [Job Channel] ──→ Process Event
  ├─ Worker 2: [Job Channel] ──→ Process Event
  └─ ...Worker N
```

**Validation Gates**:
1. **Pool Address Check**: Rejects zero addresses
2. **Duplicate Address Detection**: Pool ≠ Token0 ≠ Token1
3. **Zero-Padding Detection**: Rejects suspicious addresses (0x0000...xxxx)
4. **Event Type Routing**: Directs to swap/liquidity analyzers

**Event-Driven Cache Invalidation**:
- Swap events invalidate reserve caches for affected pools
- Liquidity changes trigger full pool state refresh
- Ensures profit calculations use fresh data

---

### 2.4 Arbitrage Detection Engine

**Component**: `ArbitrageDetectionEngine` (pkg/arbitrage/detection_engine.go)

**Responsibilities**:
- Continuously scans for profitable paths
- Implements multi-hop path analysis (up to 3 hops)
- Filters opportunities by profit threshold and ROI
- Provides real-time opportunity feed to execution

**Detection Algorithm**:
```
1. Input: Token pair + Swap event
2. For each token in pair:
   ├─ Determine scan amount (10% of swap size)
   ├─ Use MultiHopScanner to find all paths
   ├─ Calculate profit for each path
   └─ Filter by: MinProfit, MinROI, MaxSlippage
3. Rank opportunities by: ROI, Profit, Urgency
4. Return top N opportunities
```

**Configuration**:
```go
type DetectionConfig struct {
    ScanInterval       time.Duration     // 5 seconds default
    MaxConcurrentScans int               // 5 parallel scans
    MaxConcurrentPaths int               // 10 parallel path checks
    MinProfitThreshold *UniversalDecimal // 0.001 ETH default
    MaxPriceImpact     *UniversalDecimal // 2% default
    MaxHops            int               // 3 default
    CacheTTL           time.Duration     // 5 minutes default
}
```

---

### 2.5 Arbitrage Service - Multi-Hop Scanning

**Component**: `ArbitrageService` (pkg/arbitrage/service.go)

**Responsibilities**:
- Main orchestration layer combining all components
- Executes multi-hop scanner for path discovery
- Manages opportunity execution pipeline
- Tracks statistics and metrics

**Key Methods**:
- `Start()` - Initializes background monitoring goroutines
- `ProcessSwapEvent()` - Entry point for swap event processing
- `ExecuteOpportunity()` - Dispatches opportunity to executor
- `GetStats()` - Returns real-time statistics

**Multi-Hop Scanner Behavior**:
```
Input: Significant Swap Event (e.g., WETH/USDC)
  └─ Triggers path scanning for both WETH and USDC

For each token:
  ├─ Scan 1: Token → X → Token (2-hop round trip across two pools)
  ├─ Scan 2: Token → X → Y → Token (3-hop, triangular arbitrage)
  └─ Scan 3: Token → X → Y → Z → Token (4-hop if enabled)

Output: List of profitable paths ranked by ROI
```

---

### 2.6 Execution Layer

**Component**: `ArbitrageExecutor` (pkg/arbitrage/executor.go)

**Responsibilities**:
- Validates opportunities before execution
- Manages gas estimation and cost calculation
- Coordinates flash loan providers
- Submits transactions on-chain
- Tracks execution results

**Execution Modes**:
```go
const (
    SimulationMode = iota // Test on fork
    DryRunMode            // Validate without sending
    LiveMode              // Real on-chain execution
)
```

**Flash Loan Support**:
- Aave Flash Loans
- Uniswap V3 Flash Swaps
- Balancer Flash Loans

---

## 3. DATA FLOW ANALYSIS

### 3.1 Block to Opportunity Flow

```
Block N
  │
  ├─→ ArbitrumMonitor.processBlock()
  │   ├─ Fetch block via L2Parser
  │   ├─ Parse DEX transactions (filter by function sigs)
  │   └─ Extract 5-20 DEX transactions per block
  │
  ├─→ For each DEX transaction:
  │   ├─ Fetch transaction receipt
  │   ├─ EventParser.ParseTransactionReceipt()
  │   │  ├─ Detect swap event signatures
  │   │  ├─ Extract token addresses
  │   │  └─ Create Event objects
  │   │
  │   └─ Scanner.SubmitEvent() → Event Validation
  │      ├─ Check zero addresses
  │      ├─ Check duplicate addresses
  │      ├─ Check zero-padding
  │      └─ Enqueue to worker pool
  │
  ├─→ EventWorker.Process() [10 concurrent]
  │   ├─ Invalidate reserve cache
  │   └─ Route to SwapAnalyzer
  │
  ├─→ SwapAnalyzer.AnalyzeSwapEvent()
  │   ├─ Extract swap amounts and tokens
  │   └─ Trigger arbitrage detection
  │
  └─→ ArbitrageService.DetectArbitrageOpportunities()
      ├─ ScanForArbitrage(token, amount)
      ├─ Find profitable paths
      └─ Execute top opportunities
```

### 3.2 Configuration Dependency Flow

```
cmd/mev-bot/main.go
  │
  ├─→ Load Configuration (YAML + Environment Variables)
  │   ├─ config/arbitrum_production.yaml
  │   ├─ .env.production
  │   └─ Environment variable overrides
  │
  ├─→ Validate RPC Endpoints
  │
  ├─→ Initialize Components:
  │   ├─ Provider Manager (unified RPC management)
  │   ├─ Security Manager (key management, rate limiting)
  │   ├─ Key Manager (transaction signing)
  │   ├─ Pool Discovery (cache-based pool metadata)
  │   └─ Token Metadata Cache (token information)
  │
  ├─→ Create ArbitrageService
  │   ├─ Initialize MultiHopScanner
  │   ├─ Initialize ArbitrageExecutor
  │   ├─ Initialize DetectionEngine
  │   ├─ Initialize FlashSwapExecutor
  │   └─ Initialize LiveExecutionFramework
  │
  ├─→ Start Monitoring Components:
  │   ├─ ArbitrumMonitor (blockchain reading)
  │   ├─ Stats Updater (30s metrics)
  │   ├─ Market Data Syncer (10s sync)
  │   └─ Dashboard Server (port 8080)
  │
  └─→ Main Event Loop (Wait for signals)
```

---

## 4. CRITICAL EXECUTION PATHS

### 4.1 Path 1: Transaction Detection → Execution

**Trigger**: New block with DEX transaction
**Duration**: 3-30 seconds (block to execution)

```
1. ArbitrumMonitor polls new block (3s interval)
2. L2Parser extracts DEX txs (100ms per block)
3. Receipt parsing and event extraction (50-200ms)
4. Event validation and worker pool submission (10-50ms)
5. SwapAnalyzer processes event (50-200ms)
6. MultiHopScanner scans for paths (500ms-2s)
7. Opportunity ranking and filtering (100-200ms)
8. Executor validates and executes (1-5 seconds)
   ├─ Gas estimation
   ├─ Flash loan setup
   ├─ Simulation (if enabled)
   └─ Transaction broadcast
9. Confirmation monitoring
```

### 4.2 Path 2: Event-Driven Cache Invalidation

**Trigger**: Swap event on monitored pool
**Impact**: Ensures fresh reserve data for profit calculations

```
1. SwapAnalyzer detects swap event
2. Scanner.reserveCache.Invalidate(pool_address)
3. Next profit calculation fetches fresh reserves
4. Reduces stale data errors
```

### 4.3 Path 3: Multi-Hop Path Discovery

**Trigger**: Significant swap detected (>1% of pool liquidity)
**Searches**: All connected token pairs

```
1. Identify tokens from swap: WETH/USDC
2. For WETH:
   ├─ Find all WETH pairs: WETH/DAI, WETH/ARB, ...
   ├─ For each pair:
   │  ├─ Check 2-hop round trips: WETH → X → WETH
   │  └─ Check 3-hop (triangular) paths: WETH → X → Y → WETH
   └─ Calculate profit for each path
3. For USDC:
   ├─ Same process as WETH
   └─ Find USDC round-trip and multi-hop paths
4. Rank all paths by ROI
5. Execute top 3-5 most profitable paths
```

---

## 5. ARCHITECTURAL PATTERNS

### 5.1 Worker Pool Pattern

**Used in**: Scanner (concurrent event processing)

```go
workerPool chan chan events.Event // 10 workers
workers    []*EventWorker         // Worker array

// Each worker:
// 1. Registers in workerPool
// 2. Waits for job on JobChannel
// 3. Processes job synchronously
// 4. Returns to pool
```

**Benefit**: Bounded concurrency (prevents resource exhaustion)

### 5.2 Pipeline Pattern

**Used in**: Transaction monitoring → Event processing → Arbitrage detection

Each stage:
- Receives input from previous stage
- Performs specific transformation
- Sends output to next stage
- Can fail gracefully without blocking others

**Benefit**: Clear separation of concerns, easy to add/modify stages

### 5.3 Event-Driven Architecture

**Used in**: Cache invalidation, opportunity notification

```go
// Scanner notifies on state change
if event.Type == Swap {
    reserveCache.Invalidate(pool)
}

// Detection engine notifies on opportunity, which is forwarded to the Executor
opportunityHandler(opportunity)
```

**Benefit**: Decoupled components, reactive updates

### 5.4 Strategy Pattern

**Used in**: Flash loan providers

```go
// Method parameters and return values are omitted in this summary.
type FlashLoanProvider interface {
    ExecuteFlashLoan()
    GetMaxLoanAmount()
    GetFee()
    SupportsToken()
}

// Implementations: Aave, Uniswap, Balancer
```

**Benefit**: Easy to switch providers or add new ones

---

## 6. POTENTIAL ARCHITECTURAL CONCERNS

### 6.1 CRITICAL - Startup Hang Risk

**Issue**: Pool discovery loop makes 190 RPC calls (O(n²) complexity)
**Affected Function**: `startBot()` lines 324-416
**Status**: Currently DISABLED (commented out)

**Evidence**:
```
Pool discovery loop: 20 tokens → 20 × 19 / 2 = 190 unique pairs (190 calls)
Known hang: WETH/GRT pair (0-9) consistently times out
Impact: 5+ minute startup delay
Current: Pool cache loaded with 314 existing pools
Alternative: Pool discovery runs as background task post-startup
```

**Risk**: If re-enabled without fixes, will block startup indefinitely

### 6.2 Connection Stability Issues

**Component**: RPC endpoint management

**Risk Areas**:
- Timeout handling for slow providers
- Cascading failures when primary endpoint fails
- Rate limit exhaustion

**Mitigations in Place**:
- ConnectionManager with retry logic (3 attempts)
- Health check every 30 seconds
- Automatic reconnection
- Rate limiter with configurable RPS

### 6.3 Zero Address Data Corruption

**Pattern Found**: Multiple validation points added

```go
// Level 1: Scanner submission
if event.PoolAddress == (common.Address{}) {
    return // Reject
}

// Level 2: Additional validation
if event.PoolAddress == event.Token0 {
    return // Reject duplicates
}

// Level 3: Zero-padding detection (requires the "strings" import)
if strings.HasPrefix(poolHex, "0x0000000000000000") {
    return // Reject suspicious
}
```

**Implication**: Zero addresses have occurred in production
**Root Cause**: Incomplete calldata parsing or event log corruption
**Status**: Multiple defensive filters in place

### 6.4 Concurrency Issues

**Potential Race Conditions**:

1. **Scanner WaitGroup**: Fixed by processing synchronously in worker (lines 129-131)
   ```go
   // FIXED: Process synchronously instead of spawning goroutine
   defer w.scanner.wg.Done()
   ```

2. **Stat Updates**: Protected with atomics for int64, mutex for big.Int
   ```go
   atomic.AddInt64(&sas.stats.TotalOpportunitiesDetected, 1)
   sas.statsMutex.Lock()
   sas.stats.TotalProfitRealized.Add(...)
   sas.statsMutex.Unlock()
   ```

3. **Token Cache**: Protected with RWMutex
   ```go
   sas.tokenCacheMutex.RLock()
   if cached, exists := sas.tokenCache[poolAddress] { ... }
   ```

### 6.5 Memory Leaks & Resource Management

**Areas of Concern**:

1. **Large Channel Buffers**:
   - `transactionChannel` (50,000 buffer) - Bounded, but can hold a large backlog if processing is slower than input
   - `opportunityChan` (1,000 buffer) - Configurable, but fixed once the channel is created

2. **Opportunity Path Cache**:
   ```go
   opportunityPathCache map[string]*ArbitragePath
   // Grows unbounded - only cleared via deleteOpportunityPath()
   // Should have TTL or max size
   ```

3. **Token Cache**:
   ```go
   tokenCache map[common.Address]TokenPair
   // Grows with each unique pool discovered
   // No eviction policy documented
   ```

**Recommended**: Implement cache eviction with LRU or TTL

### 6.6 Configuration Complexity

**Issue**: Multiple overlapping configuration sources

```
Priority hierarchy:
1. Environment variables (highest)
2. YAML config values
3. Hardcoded defaults
4. Legacy mode handling

Sources:
- .env.production
- config/arbitrum_production.yaml
- Environment variable overrides
- Endpoint configuration defaults
```

**Risk**: Configuration mistakes are hard to diagnose

### 6.7 Limited Error Recovery

**Patterns Found**:

1. **Fallback to Polling**:
   ```go
   // ArbitrumMonitor creation fails → switch to fallback block polling
   monitor, err := sas.createArbitrumMonitor()
   if err != nil {
       sas.logger.Error("Failed to create monitor")
       sas.fallbackBlockPolling()
       return
   }
   ```

2. **Incomplete Flash Loan Fallback**:
   - If flash loan fails, opportunity is lost
   - No fallback to standard swaps

3. **L2Parser Errors**:
   - Some DEX transactions silently fail to parse
   - No comprehensive error logging

---

## 7. PERFORMANCE CHARACTERISTICS

### 7.1 Throughput

**Block Processing**:
- Input: ~250-300 transactions per block
- DEX transactions identified: ~5-20 per block
- Processing time: 100-500ms per block
- Effective TPS handled: 250-300 tx/block ÷ 3s = 83-100 tx/s monitored

**Event Processing**:
- Worker pool: 10 concurrent workers
- Average event processing: 50-200ms
- Throughput: ~50 events/sec at the 200ms worst case (10 workers ÷ 0.2s), up to ~200 events/sec at 50ms
- Queue depth: up to 50,000 transactions buffered

**Opportunity Detection**:
- Scan time: 500ms-2s per token pair
- Max concurrent scans: 5
- Detection latency: 3-5 seconds from swap to detection

**Execution**:
- Validation: 100-200ms
- Simulation (optional): 500ms-2s
- Broadcasting: 100-500ms
- Total: 1-5 seconds per opportunity

### 7.2 Resource Usage

**Memory**:
- Token cache: ~100 pools × 100 bytes = 10KB
- Opportunity cache: ~100 opportunities × 1KB = 100KB
- Reserve cache: Configurable, ~1-10MB typical

**Network**:
- Block polling: 1 RPC call per 3 seconds
- Event receipts: ~5-20 calls per block = ~2-7 calls/second
- Pool state queries: Variable, ~5-50 calls/second
- Total: ~50-100 RPC calls/second in active operation

---

## 8. DEPLOYMENT & CONFIGURATION

### 8.1 Environment Setup

**Required Files**:
```
.env.production                  # Environment variables
config/arbitrum_production.yaml  # Main configuration
config/providers.yaml            # RPC endpoint configuration
keystore/                        # Private keys (encrypted)
```

**Required Variables**:
```bash
GO_ENV=production
ARBITRUM_RPC_ENDPOINT=https://...
ARBITRUM_WS_ENDPOINT=wss://...
MEV_BOT_ENCRYPTION_KEY=...
ETHEREUM_PRIVATE_KEY=...
CONTRACT_ARBITRAGE_EXECUTOR=0x...
CONTRACT_FLASH_SWAPPER=0x...
```

### 8.2 Startup Sequence

```
1. Load environment variables (.env.production)
2. Load YAML configuration
3. Validate RPC endpoints (security check)
4. Initialize logger
5. Initialize provider manager (50,000 TPS capacity)
6. Initialize security manager (optional)
7. Load pool discovery cache (314 pools)
8. ⚠️ SKIP pool discovery loop (prevents startup hang)
9. Initialize arbitrage service
10. Initialize dashboard server (port 8080)
11. Start monitoring goroutines
12. Enter main event loop
```

---

## 9. RECOMMENDATIONS FOR IMPROVEMENT

### 9.1 Immediate Actions

1. **Add Cache Eviction** to token cache and opportunity cache:
   ```go
   // Implement TTL-based eviction
   type CachedToken struct {
       Data      TokenPair
       ExpiresAt time.Time
   }
   ```

2. **Document Pool Discovery**: If re-enabling, add:
   - Progress reporting (1/190 pairs complete)
   - Timeout per token pair (5 seconds)
   - Configurable maximum pairs to discover

3. **Improve Error Recovery**:
   - Retry flash loan failures with different providers
   - Log all DEX parsing failures to separate file
   - Implement circuit breaker for consistently failing pools

### 9.2 Medium-Term Improvements

1. **Refactor Configuration**:
   - Single source of truth for defaults
   - Configuration validation schema
   - Clear inheritance hierarchy

2. **Enhance Monitoring**:
   - Add Prometheus metrics export
   - Real-time dashboard with WebSocket updates
   - Alert thresholds for error rates

3. **Optimize Memory**:
   - Implement generic cache with configurable eviction
   - Add memory usage monitoring
   - Profile long-running operations

### 9.3 Long-Term Enhancements

1. **MEV Protection**:
   - Private mempool support
   - Flashbots integration
   - MEV-resistant path selection

2. **Advanced Scanning**:
   - Machine learning for opportunity prediction
   - Real-time liquidity tracking
   - Cross-chain opportunity detection

3. **Scalability**:
   - Horizontal scaling with state synchronization
   - Database abstraction for persistence
   - Distributed task processing

---

## 10. CONCLUSION

The MEV bot implements a **well-structured, modular architecture** with clear separation between:
- **Detection** (monitoring, event parsing, opportunity scanning)
- **Execution** (validation, flash loans, transaction submission)
- **Infrastructure** (configuration, logging, metrics)

**Strengths**:
- Comprehensive arbitrage detection with multi-hop support
- Error handling with fallback mechanisms (RPC failover, fallback block polling)
- Event-driven cache invalidation
- Multiple validation layers against data corruption

**Vulnerabilities**:
- Startup hang if the currently disabled pool discovery loop is re-enabled without fixes
- Unbounded cache growth
- Complex configuration hierarchy
- Limited flash loan fallback strategies

**Overall Assessment**: **Production-ready** with the pool discovery loop behind the startup hang documented and disabled, suitable for active MEV extraction on Arbitrum with ongoing monitoring for edge cases and optimizations.
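
As a concrete illustration of the cache-eviction recommendation in section 9.1 (and the unbounded-growth concern in section 6.5), the following is a minimal sketch of a bounded TTL cache, not code from the repository. The names `TTLCache`, `NewTTLCache`, and `FakeClock` are hypothetical, string keys stand in for `common.Address` to keep the example self-contained, and the policy of evicting the entry closest to expiry when the cache is full is an assumption chosen for simplicity rather than a documented design decision.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// entry pairs a cached value with the time at which it stops being valid.
type entry[V any] struct {
	value     V
	expiresAt time.Time
}

// TTLCache is a hypothetical bounded cache. Entries expire after ttl, and an
// insert into a full cache evicts the entry closest to expiry.
type TTLCache[K comparable, V any] struct {
	mu         sync.Mutex
	ttl        time.Duration
	maxEntries int
	now        func() time.Time // injectable clock so expiry can be tested
	items      map[K]entry[V]
}

// NewTTLCache builds a cache; a nil clock falls back to time.Now.
func NewTTLCache[K comparable, V any](ttl time.Duration, maxEntries int, now func() time.Time) *TTLCache[K, V] {
	if now == nil {
		now = time.Now
	}
	return &TTLCache[K, V]{ttl: ttl, maxEntries: maxEntries, now: now, items: make(map[K]entry[V])}
}

// Set stores value under key, purging expired entries first and, if the
// cache is still full, evicting the entry with the earliest expiry.
func (c *TTLCache[K, V]) Set(key K, value V) {
	c.mu.Lock()
	defer c.mu.Unlock()
	now := c.now()
	for k, e := range c.items {
		if !now.Before(e.expiresAt) {
			delete(c.items, k)
		}
	}
	if _, exists := c.items[key]; !exists && len(c.items) >= c.maxEntries {
		var victim K
		var earliest time.Time
		found := false
		for k, e := range c.items {
			if !found || e.expiresAt.Before(earliest) {
				victim, earliest, found = k, e.expiresAt, true
			}
		}
		if found {
			delete(c.items, victim)
		}
	}
	c.items[key] = entry[V]{value: value, expiresAt: now.Add(c.ttl)}
}

// Get returns the value for key if it is present and not expired.
func (c *TTLCache[K, V]) Get(key K) (V, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	e, ok := c.items[key]
	if !ok || !c.now().Before(e.expiresAt) {
		delete(c.items, key) // no-op when the key is absent
		var zero V
		return zero, false
	}
	return e.value, true
}

// Len reports the number of stored entries, including any not yet purged.
func (c *TTLCache[K, V]) Len() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return len(c.items)
}

// FakeClock is a manually advanced clock used for the demo below.
type FakeClock struct{ t time.Time }

func (f *FakeClock) Now() time.Time          { return f.t }
func (f *FakeClock) Advance(d time.Duration) { f.t = f.t.Add(d) }

func main() {
	clock := &FakeClock{}
	cache := NewTTLCache[string, string](5*time.Minute, 2, clock.Now)
	cache.Set("0xPoolA", "WETH/USDC")
	clock.Advance(time.Minute)
	cache.Set("0xPoolB", "WETH/DAI")
	clock.Advance(time.Minute)
	cache.Set("0xPoolC", "WETH/ARB") // cache is full, so 0xPoolA is evicted
	_, okA := cache.Get("0xPoolA")
	_, okB := cache.Get("0xPoolB")
	fmt.Println("A cached:", okA, "B cached:", okB, "entries:", cache.Len())
}
```

A mutex-guarded map keeps the sketch short; each `Set` scans the whole map, which is acceptable at the cache sizes quoted in section 7.2 but would likely need a heap or linked list at much larger sizes. Passing the clock in as a function is a testing convenience and could be dropped in favor of `time.Now` in production use.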