MEV Bot Codebase Architecture Analysis
Executive Summary
The MEV (Maximal Extractable Value) bot is a sophisticated arbitrage detection and execution system built in Go for the Arbitrum blockchain. It implements a modular, multi-layered architecture with clear separation of concerns across transaction monitoring, opportunity detection, validation, and execution components.
Current Status: The system is operational with comprehensive MEV detection capabilities, supporting multiple DEX protocols with advanced mathematical optimization and real-time market monitoring.
1. CORE WORKFLOW - Transaction to Execution Flow
High-Level Execution Path
┌─────────────────────────────────────────────────────────────────┐
│ 1. ARBITRUM SEQUENCER MONITORING (pkg/monitor/concurrent.go) │
│ └─ Real-time block polling (3-second intervals) │
│ └─ L2 transaction parsing via ArbitrumL2Parser │
│ └─ DEX transaction detection (function signature matching) │
└────────────────────────┬────────────────────────────────────────┘
│
┌────────────────────────▼────────────────────────────────────────┐
│ 2. EVENT EXTRACTION & ENRICHMENT │
│ └─ Transaction receipt parsing │
│ └─ DEX event signature detection (Swap, Mint, Burn) │
│ └─ Token address extraction from calldata │
│ └─ Event validation (zero-address check) │
└────────────────────────┬────────────────────────────────────────┘
│
┌────────────────────────▼────────────────────────────────────────┐
│ 3. SCANNER SUBMISSION & ROUTING (pkg/scanner/concurrent.go) │
│ └─ Worker pool distribution (10 concurrent workers) │
│ └─ Pool address validation │
│ └─ Duplicate address detection │
│ └─ Suspicious address filtering │
└────────────────────────┬────────────────────────────────────────┘
│
┌────────────────────────▼────────────────────────────────────────┐
│ 4. MARKET ANALYSIS (pkg/scanner/market/scanner.go) │
│ └─ Swap event analysis │
│ └─ Liquidity event analysis │
│ └─ Pool state tracking │
│ └─ Reserve cache invalidation (event-driven) │
└────────────────────────┬────────────────────────────────────────┘
│
┌────────────────────────▼────────────────────────────────────────┐
│ 5. ARBITRAGE DETECTION (pkg/arbitrage/service.go) │
│ └─ Multi-hop path scanning │
│ └─ Opportunity ranking and filtering │
│ └─ Profit calculation and ROI assessment │
│ └─ Confidence scoring │
└────────────────────────┬────────────────────────────────────────┘
│
┌────────────────────────▼────────────────────────────────────────┐
│ 6. VALIDATION LAYER (pkg/validation/*) │
│ └─ Price impact validation │
│ └─ Gas cost estimation │
│ └─ Slippage tolerance checking │
│ └─ Profitability verification │
└────────────────────────┬────────────────────────────────────────┘
│
┌────────────────────────▼────────────────────────────────────────┐
│ 7. EXECUTION ENGINE (pkg/execution/executor.go) │
│ └─ Flash loan coordination (Aave/Uniswap/Balancer) │
│ └─ Transaction simulation (if enabled) │
│ └─ On-chain execution │
│ └─ Result tracking and metrics │
└─────────────────────────────────────────────────────────────────┘
2. KEY COMPONENTS & RESPONSIBILITIES
2.1 Transaction Monitoring Layer
Component: ArbitrumMonitor (pkg/monitor/concurrent.go)
Responsibilities:
- Establishes persistent connection to Arbitrum sequencer
- Polls latest blocks every 3 seconds
- Parses DEX transactions using ArbitrumL2Parser
- Extracts transaction calldata and function signatures
- Subscribes to DEX contract events in real-time
Critical Features:
- Connection health checker (30-second intervals)
- Automatic RPC failover via ConnectionManager
- Rate limiting (configurable RPS/burst)
- Transaction receipt parsing for swap event extraction
Data Structure:
type DEXTransaction struct {
Hash string
From string
To string
ContractName string
FunctionName string
FunctionSig string
Protocol string
InputData []byte
BlockNumber uint64
}
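Function signature matching usually means comparing the first four bytes of `InputData` against a table of known selectors. The following is a minimal sketch of that idea, not the bot's actual parser: `knownSelectors` and `classifyCalldata` are hypothetical names, and the table holds only two entries (the standard selectors for Uniswap V2 `swapExactTokensForTokens` and Uniswap V3 `exactInputSingle`).

```go
package main

import "encoding/hex"

// selectorInfo names the protocol and function behind a 4-byte selector.
type selectorInfo struct {
	Protocol     string
	FunctionName string
}

// knownSelectors is a hypothetical lookup table; a real bot would carry
// many more entries.
var knownSelectors = map[string]selectorInfo{
	"38ed1739": {"UniswapV2", "swapExactTokensForTokens"},
	"414bf389": {"UniswapV3", "exactInputSingle"},
}

// classifyCalldata returns the matching entry for the calldata's selector,
// or false when the input is too short or the selector is unknown.
func classifyCalldata(input []byte) (selectorInfo, bool) {
	if len(input) < 4 {
		return selectorInfo{}, false
	}
	info, ok := knownSelectors[hex.EncodeToString(input[:4])]
	return info, ok
}
```

A caller would pass `DEXTransaction.InputData` and copy the result into `Protocol`, `FunctionName`, and `FunctionSig`.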
2.2 Event Parsing & Extraction
Component: EventParser (pkg/events/event_parser.go)
Responsibilities:
- Parses transaction receipts into typed events
- Extracts token addresses from transaction calldata
- Validates event data integrity
- Creates enriched event objects with token metadata
Critical Methods:
- ParseTransactionReceipt() - Extracts events from receipt logs
- ParseTransactionReceiptWithTx() - Enhanced parsing with calldata analysis
- Token extraction via contract calls (token0/token1 functions)
Supported Events:
- Swap (Uniswap V2/V3, SushiSwap, Curve, etc.)
- AddLiquidity (mint events)
- RemoveLiquidity (burn events)
- NewPool
2.3 Market Scanner & Event Processing
Component: Scanner (pkg/scanner/concurrent.go)
Responsibilities:
- Manages worker pool for concurrent event processing
- Validates event data before processing
- Routes events to appropriate analyzers
- Implements cache invalidation strategy
Architecture:
Worker Pool (10 workers)
├─ Worker 0: [Job Channel] ──→ Process Event
├─ Worker 1: [Job Channel] ──→ Process Event
├─ Worker 2: [Job Channel] ──→ Process Event
└─ ...Worker N
Validation Gates:
- Pool Address Check: Rejects zero addresses
- Duplicate Address Detection: Pool ≠ Token0 ≠ Token1
- Zero-Padding Detection: Rejects suspicious addresses (0x0000...xxxx)
- Event Type Routing: Directs to swap/liquidity analyzers
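The first three gates can be sketched as a single validation function. This is an illustration under stated assumptions rather than the scanner's code: `Address` stands in for go-ethereum's `common.Address`, the `Event` type is reduced to three fields, the zero-address check is applied to the tokens as well as the pool, and "zero-padded" is taken to mean eight leading zero bytes (16 hex digits).

```go
package main

import "errors"

// Address is a stand-in for go-ethereum's common.Address (a 20-byte array).
type Address [20]byte

// Event carries only the fields the gates need; the real event type has more.
type Event struct {
	Pool, Token0, Token1 Address
}

var (
	errZeroAddress = errors.New("zero address")
	errDuplicate   = errors.New("pool and token addresses must all differ")
	errZeroPadded  = errors.New("suspicious zero-padded pool address")
)

// hasZeroPrefix reports whether the first n bytes of a are all zero.
func hasZeroPrefix(a Address, n int) bool {
	for _, b := range a[:n] {
		if b != 0 {
			return false
		}
	}
	return true
}

// validateEvent applies the gates in order and returns the first failure.
func validateEvent(e Event) error {
	zero := Address{}
	if e.Pool == zero || e.Token0 == zero || e.Token1 == zero {
		return errZeroAddress
	}
	if e.Pool == e.Token0 || e.Pool == e.Token1 || e.Token0 == e.Token1 {
		return errDuplicate
	}
	// Assumption: eight leading zero bytes mark a padded, non-address value.
	if hasZeroPrefix(e.Pool, 8) {
		return errZeroPadded
	}
	return nil
}
```

Returning a distinct error per gate makes it possible to count rejections by cause, which helps when tracing where malformed addresses originate.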
Event-Driven Cache Invalidation:
- Swap events invalidate reserve caches for affected pools
- Liquidity changes trigger full pool state refresh
- Ensures profit calculations use fresh data
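A reserve cache with event-driven invalidation could look roughly like the sketch below. The type and method names are hypothetical, reserves are simplified to `uint64` (on-chain values need `big.Int`), and `fetch` stands for whatever RPC call loads fresh reserves.

```go
package main

import "sync"

// Reserves holds the two token balances of a pool (simplified to uint64).
type Reserves struct {
	Reserve0, Reserve1 uint64
}

// ReserveCache memoizes reserves per pool until a swap invalidates them.
type ReserveCache struct {
	mu      sync.RWMutex
	entries map[string]Reserves
	fetch   func(pool string) Reserves // hypothetical loader, e.g. an RPC call
}

// NewReserveCache builds an empty cache around the given loader.
func NewReserveCache(fetch func(string) Reserves) *ReserveCache {
	return &ReserveCache{entries: make(map[string]Reserves), fetch: fetch}
}

// Get returns cached reserves, loading them on a miss.
func (c *ReserveCache) Get(pool string) Reserves {
	c.mu.RLock()
	r, ok := c.entries[pool]
	c.mu.RUnlock()
	if ok {
		return r
	}
	r = c.fetch(pool)
	c.mu.Lock()
	c.entries[pool] = r
	c.mu.Unlock()
	return r
}

// Invalidate drops the entry so the next Get refetches.
func (c *ReserveCache) Invalidate(pool string) {
	c.mu.Lock()
	delete(c.entries, pool)
	c.mu.Unlock()
}
```

Two goroutines that miss at the same time may both call `fetch`; the sketch accepts that duplicate work in exchange for not holding the lock during the load.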
2.4 Arbitrage Detection Engine
Component: ArbitrageDetectionEngine (pkg/arbitrage/detection_engine.go)
Responsibilities:
- Continuously scans for profitable paths
- Implements multi-hop path analysis (up to 3 hops)
- Filters opportunities by profit threshold and ROI
- Provides real-time opportunity feed to execution
Detection Algorithm:
1. Input: Token pair + Swap event
2. For each token in pair:
├─ Determine scan amount (10% of swap size)
├─ Use MultiHopScanner to find all paths
├─ Calculate profit for each path
└─ Filter by: MinProfit, MinROI, MaxSlippage
3. Rank opportunities by: ROI, Profit, Urgency
4. Return top N opportunities
Configuration:
type DetectionConfig struct {
ScanInterval time.Duration // 5 seconds default
MaxConcurrentScans int // 5 parallel scans
MaxConcurrentPaths int // 10 parallel path checks
MinProfitThreshold *UniversalDecimal // 0.001 ETH default
MaxPriceImpact *UniversalDecimal // 2% default
MaxHops int // 3 default
CacheTTL time.Duration // 5 minutes default
}
2.5 Arbitrage Service - Multi-Hop Scanning
Component: ArbitrageService (pkg/arbitrage/service.go)
Responsibilities:
- Main orchestration layer combining all components
- Executes multi-hop scanner for path discovery
- Manages opportunity execution pipeline
- Tracks statistics and metrics
Key Methods:
- Start() - Initializes background monitoring goroutines
- ProcessSwapEvent() - Entry point for swap event processing
- ExecuteOpportunity() - Dispatches opportunity to executor
- GetStats() - Returns real-time statistics
Multi-Hop Scanner Behavior:
Input: Significant Swap Event (e.g., WETH/USDC)
└─ Triggers path scanning for both WETH and USDC
For each token:
├─ Scan 1: Token → X → Token (2-hop round trip through one intermediate token)
├─ Scan 2: Token → X → Y → Token (3-hop, triangular arbitrage)
└─ Scan 3: Token → X → Y → Z → Token (4-hop if enabled)
Output: List of profitable paths ranked by ROI
2.6 Execution Layer
Component: ArbitrageExecutor (pkg/arbitrage/executor.go)
Responsibilities:
- Validates opportunities before execution
- Manages gas estimation and cost calculation
- Coordinates flash loan providers
- Submits transactions on-chain
- Tracks execution results
Execution Modes:
const (
SimulationMode = iota // Test on fork
DryRunMode // Validate without sending
LiveMode // Real on-chain execution
)
Flash Loan Support:
- Aave Flash Loans
- Uniswap V3 Flash Swaps
- Balancer Flash Loans
3. DATA FLOW ANALYSIS
3.1 Block to Opportunity Flow
Block N
│
├─→ ArbitrumMonitor.processBlock()
│ ├─ Fetch block via L2Parser
│ ├─ Parse DEX transactions (filter by function sigs)
│ └─ Extract 5-20 DEX transactions per block
│
├─→ For each DEX transaction:
│ ├─ Fetch transaction receipt
│ ├─ EventParser.ParseTransactionReceipt()
│ │ ├─ Detect swap event signatures
│ │ ├─ Extract token addresses
│ │ └─ Create Event objects
│ │
│ └─ Scanner.SubmitEvent() → Event Validation
│ ├─ Check zero addresses
│ ├─ Check duplicate addresses
│ ├─ Check zero-padding
│ └─ Enqueue to worker pool
│
├─→ EventWorker.Process() [10 concurrent]
│ ├─ Invalidate reserve cache
│ └─ Route to SwapAnalyzer
│
├─→ SwapAnalyzer.AnalyzeSwapEvent()
│ ├─ Extract swap amounts and tokens
│ └─ Trigger arbitrage detection
│
└─→ ArbitrageService.DetectArbitrageOpportunities()
├─ ScanForArbitrage(token, amount)
├─ Find profitable paths
└─ Execute top opportunities
3.2 Configuration Dependency Flow
cmd/mev-bot/main.go
│
├─→ Load Configuration (YAML + Environment Variables)
│ ├─ config/arbitrum_production.yaml
│ ├─ .env.production
│ └─ Environment variable overrides
│
├─→ Validate RPC Endpoints
│
├─→ Initialize Components:
│ ├─ Provider Manager (unified RPC management)
│ ├─ Security Manager (key management, rate limiting)
│ ├─ Key Manager (transaction signing)
│ ├─ Pool Discovery (cache-based pool metadata)
│ └─ Token Metadata Cache (token information)
│
├─→ Create ArbitrageService
│ ├─ Initialize MultiHopScanner
│ ├─ Initialize ArbitrageExecutor
│ ├─ Initialize DetectionEngine
│ ├─ Initialize FlashSwapExecutor
│ └─ Initialize LiveExecutionFramework
│
├─→ Start Monitoring Components:
│ ├─ ArbitrumMonitor (blockchain reading)
│ ├─ Stats Updater (30s metrics)
│ ├─ Market Data Syncer (10s sync)
│ └─ Dashboard Server (port 8080)
│
└─→ Main Event Loop (Wait for signals)
4. CRITICAL EXECUTION PATHS
4.1 Path 1: Transaction Detection → Execution
Trigger: New block with DEX transaction
Duration: 3-30 seconds (block to execution)
1. ArbitrumMonitor polls new block (3s interval)
2. L2Parser extracts DEX txs (100ms per block)
3. Receipt parsing and event extraction (50-200ms)
4. Event validation and worker pool submission (10-50ms)
5. SwapAnalyzer processes event (50-200ms)
6. MultiHopScanner scans for paths (500ms-2s)
7. Opportunity ranking and filtering (100-200ms)
8. Executor validates and executes (1-5 seconds)
├─ Gas estimation
├─ Flash loan setup
├─ Simulation (if enabled)
└─ Transaction broadcast
9. Confirmation monitoring
4.2 Path 2: Event-Driven Cache Invalidation
Trigger: Swap event on monitored pool
Impact: Ensures fresh reserve data for profit calculations
1. SwapAnalyzer detects swap event
2. Scanner.reserveCache.Invalidate(pool_address)
3. Next profit calculation fetches fresh reserves
4. Reduces stale data errors
4.3 Path 3: Multi-Hop Path Discovery
Trigger: Significant swap detected (>1% of pool liquidity)
Searches: All connected token pairs
1. Identify tokens from swap: WETH/USDC
2. For WETH:
├─ Find all WETH pairs: WETH/DAI, WETH/ARB, ...
├─ For each pair:
│ ├─ Check 2-hop round trips: WETH → X → WETH
│ └─ Check 3-hop paths: WETH → X → Y → WETH
└─ Calculate profit for each path
3. For USDC:
├─ Same process as WETH
└─ Find USDC triangular and multi-hop paths
4. Rank all paths by ROI
5. Execute top 3-5 most profitable paths
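The path search in steps 2 and 3 amounts to enumerating bounded-length cycles in a token graph. A minimal depth-first sketch is shown below. It is not the MultiHopScanner itself: `findCycles` and the adjacency map are hypothetical, and the graph ignores the fact that one token pair can have several pools with different prices, which is where a 2-hop round trip actually earns a profit.

```go
package main

// findCycles enumerates token cycles that start and end at start and use
// between 2 and maxHops swaps. pairs maps each token to the tokens it shares
// a pool with. Intermediate tokens are not repeated within a cycle.
func findCycles(pairs map[string][]string, start string, maxHops int) [][]string {
	var cycles [][]string
	visited := map[string]bool{start: true}
	var walk func(path []string)
	walk = func(path []string) {
		last := path[len(path)-1]
		for _, next := range pairs[last] {
			if next == start {
				if len(path) >= 2 { // at least one intermediate token
					cycle := append(append([]string{}, path...), start)
					cycles = append(cycles, cycle)
				}
				continue
			}
			// A cycle closed after stepping to next would use len(path)+1 swaps.
			if visited[next] || len(path)+1 > maxHops {
				continue
			}
			visited[next] = true
			walk(append(path, next))
			visited[next] = false
		}
	}
	walk([]string{start})
	return cycles
}
```

With WETH, USDC, and DAI fully connected and `maxHops` set to 3, the search from WETH yields two 2-hop cycles and two 3-hop cycles. The number of cycles grows quickly with the number of connected tokens, which is why the hop limit matters.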
5. ARCHITECTURAL PATTERNS
5.1 Worker Pool Pattern
Used in: Scanner (concurrent event processing)
workerPool chan chan events.Event // 10 workers
workers []*EventWorker // Worker array
// Each worker:
// 1. Registers in workerPool
// 2. Waits for job on JobChannel
// 3. Processes job synchronously
// 4. Returns to pool
Benefit: Bounded concurrency (prevents resource exhaustion)
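The four steps in the comment above map onto a small amount of Go. The sketch below follows the `chan chan` layout shown in the field declarations but is otherwise an assumption: `Event` is a placeholder for `events.Event`, the handler is injected, and worker shutdown is left out.

```go
package main

import "sync"

// Event is a placeholder for events.Event.
type Event struct{ ID int }

// Scanner dispatches events to a fixed set of workers.
type Scanner struct {
	workerPool chan chan Event // idle workers park their job channel here
	wg         sync.WaitGroup  // counts events still in flight
}

// NewScanner starts n workers that call handle for every event they receive.
// Workers run until the process exits; shutdown is omitted from this sketch.
func NewScanner(n int, handle func(Event)) *Scanner {
	s := &Scanner{workerPool: make(chan chan Event, n)}
	for i := 0; i < n; i++ {
		jobs := make(chan Event)
		go func() {
			for {
				s.workerPool <- jobs // 1. register as idle
				ev := <-jobs         // 2. wait for a job
				handle(ev)           // 3. process synchronously
				s.wg.Done()          // 4. mark done, then return to the pool
			}
		}()
	}
	return s
}

// Submit blocks until a worker is idle, then hands it the event.
func (s *Scanner) Submit(ev Event) {
	s.wg.Add(1)
	jobs := <-s.workerPool
	jobs <- ev
}

// Wait blocks until every submitted event has been processed.
func (s *Scanner) Wait() { s.wg.Wait() }
```

Because `Submit` blocks when all workers are busy, back-pressure reaches the producer instead of piling up in an ever-growing queue.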
5.2 Pipeline Pattern
Used in: Transaction monitoring → Event processing → Arbitrage detection
Each stage:
- Receives input from previous stage
- Performs specific transformation
- Sends output to next stage
- Can fail gracefully without blocking others
Benefit: Clear separation of concerns, easy to add/modify stages
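In Go, a pipeline stage of this kind is commonly a goroutine between two channels. The generic helper below is a sketch under that assumption (it requires Go 1.18 or later, and `stage` is a hypothetical name): each stage transforms its input, may drop an item by returning false, and closes its output when the input closes.

```go
package main

// stage starts a goroutine that applies fn to every input and forwards the
// results that fn accepts. The output channel closes when the input does.
func stage[In, Out any](in <-chan In, fn func(In) (Out, bool)) <-chan Out {
	out := make(chan Out)
	go func() {
		defer close(out)
		for v := range in {
			if r, ok := fn(v); ok {
				out <- r
			}
		}
	}()
	return out
}
```

Stages compose by passing one stage's output to the next, for example `stage[Tx, Event](txs, parse)` followed by `stage[Event, Opportunity](events, detect)`, where the type and function names are placeholders. An item that fails to parse is dropped by returning false and does not stop the items behind it.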
5.3 Event-Driven Architecture
Used in: Cache invalidation, opportunity notification
// Scanner notifies on state change
event.Type == Swap → reserveCache.Invalidate(pool)
// Detection engine notifies on opportunity
opportunityHandler(opportunity) → Executor
Benefit: Decoupled components, reactive updates
5.4 Strategy Pattern
Used in: Flash loan providers
// Method parameters and return values are omitted here
type FlashLoanProvider interface {
ExecuteFlashLoan()
GetMaxLoanAmount()
GetFee()
SupportsToken()
}
// Implementations: Aave, Uniswap, Balancer
Benefit: Easy to switch providers or add new ones
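One way the strategy pattern pays off is provider selection at run time. The sketch below picks the cheapest provider that supports a token. The method signatures are assumptions (the source does not give them), `Name` is added for illustration, and `staticProvider` is a hypothetical test double rather than a real Aave, Uniswap, or Balancer client.

```go
package main

import "errors"

// FlashLoanProvider is a trimmed version of the interface above; the
// parameter and return types are assumptions.
type FlashLoanProvider interface {
	Name() string
	GetFee() float64 // fee as a fraction of the loan amount
	SupportsToken(token string) bool
}

// staticProvider is a hypothetical in-memory implementation.
type staticProvider struct {
	name   string
	fee    float64
	tokens map[string]bool
}

func (p staticProvider) Name() string                { return p.name }
func (p staticProvider) GetFee() float64             { return p.fee }
func (p staticProvider) SupportsToken(t string) bool { return p.tokens[t] }

var errNoProvider = errors.New("no provider supports token")

// cheapestProvider picks the lowest-fee provider that supports the token.
func cheapestProvider(ps []FlashLoanProvider, token string) (FlashLoanProvider, error) {
	var best FlashLoanProvider
	for _, p := range ps {
		if !p.SupportsToken(token) {
			continue
		}
		if best == nil || p.GetFee() < best.GetFee() {
			best = p
		}
	}
	if best == nil {
		return nil, errNoProvider
	}
	return best, nil
}
```

The same loop could also serve as a fallback order: if the cheapest provider fails, the caller can retry with the next one instead of dropping the opportunity.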
6. POTENTIAL ARCHITECTURAL CONCERNS
6.1 CRITICAL - Startup Hang Risk
Issue: Pool discovery loop makes 190 RPC calls (O(n²) complexity)
Affected Function: startBot() lines 324-416
Status: Currently DISABLED (commented out)
Evidence:
Pool discovery loop: 20 tokens, every unique pair = 20 × 19 / 2 = 190 calls
Known hang: WETH/GRT pair (0-9) consistently times out
Impact: 5+ minute startup delay
Current: Pool cache loaded with 314 existing pools
Alternative: Pool discovery runs as background task post-startup
Risk: If re-enabled without fixes, will block startup indefinitely
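If the loop is re-enabled, a per-pair deadline keeps one slow pair from stalling the rest. The sketch below illustrates the idea and is not the code in startBot(): `discoverPairs` and the `probe` callback are hypothetical, and the probe is expected to stop as soon as its `done` channel closes.

```go
package main

import (
	"context"
	"time"
)

// discoverPairs probes every unique token pair once, n*(n-1)/2 probes in
// total, giving each probe its own deadline so one slow pair cannot stall
// the rest. probe must return promptly once done is closed.
func discoverPairs(tokens []string, perPair time.Duration,
	probe func(a, b string, done <-chan struct{}) bool) (found, skipped int) {
	for i := 0; i < len(tokens); i++ {
		for j := i + 1; j < len(tokens); j++ {
			ctx, cancel := context.WithTimeout(context.Background(), perPair)
			ok := probe(tokens[i], tokens[j], ctx.Done())
			cancel() // release the timer before the next pair
			if ok {
				found++
			} else {
				skipped++
			}
		}
	}
	return found, skipped
}
```

For 20 tokens the loop runs 190 probes, so with a 5-second deadline the worst case is bounded at roughly 16 minutes rather than unbounded. Running it in a background goroutine after startup keeps even that delay off the critical path.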
6.2 Connection Stability Issues
Component: RPC endpoint management
Risk Areas:
- Timeout handling for slow providers
- Cascading failures when primary endpoint fails
- Rate limit exhaustion
Mitigations in Place:
- ConnectionManager with retry logic (3 attempts)
- Health check every 30 seconds
- Automatic reconnection
- Rate limiter with configurable RPS
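Retry with failover can be sketched as two nested loops: a bounded number of attempts per endpoint, then a move to the next endpoint. This is an illustration only and does not reproduce ConnectionManager. The function name and callback shape are assumptions, backoff between attempts is left out, and `errors.Join` requires Go 1.20 or later.

```go
package main

import (
	"errors"
	"fmt"
)

var errNoEndpoints = errors.New("no endpoints configured")

// callWithFailover tries each endpoint in order, up to attempts times per
// endpoint, and returns the first successful result. When every attempt
// fails, the returned error wraps all individual failures.
func callWithFailover(endpoints []string, attempts int,
	call func(endpoint string) (string, error)) (string, error) {
	var errs []error
	for _, ep := range endpoints {
		for i := 1; i <= attempts; i++ {
			res, err := call(ep)
			if err == nil {
				return res, nil
			}
			errs = append(errs, fmt.Errorf("%s attempt %d: %w", ep, i, err))
		}
	}
	if len(errs) == 0 {
		return "", errNoEndpoints
	}
	return "", errors.Join(errs...)
}
```

Keeping every failure in the joined error makes cascading outages easier to diagnose than reporting only the last one.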
6.3 Zero Address Data Corruption
Pattern Found: Multiple validation points added
// Level 1: Scanner submission
if event.PoolAddress == (common.Address{}) { return } // Reject
// Level 2: Additional validation
if event.PoolAddress == event.Token0 { return } // Reject duplicates
// Level 3: Zero-padding detection
if strings.HasPrefix(poolHex, "0x0000000000000000") { return } // Reject suspicious
Implication: Zero addresses have occurred in production
Root Cause: Incomplete calldata parsing or event log corruption
Status: Multiple defensive filters in place
6.4 Concurrency Issues
Potential Race Conditions:
- Scanner WaitGroup: Fixed by processing synchronously in worker (line 129-131)
  // FIXED: Process synchronously instead of spawning goroutine
  defer w.scanner.wg.Done()
- Stat Updates: Protected with atomics for int64, mutex for big.Int
  atomic.AddInt64(&sas.stats.TotalOpportunitiesDetected, 1)
  sas.statsMutex.Lock()
  sas.stats.TotalProfitRealized.Add(...)
  sas.statsMutex.Unlock()
- Token Cache: Protected with RWMutex
  sas.tokenCacheMutex.RLock()
  if cached, exists := sas.tokenCache[poolAddress]; exists { ... }
6.5 Memory Leaks & Resource Management
Areas of Concern:
- Large Channel Buffers:
  - transactionChannel (50,000 buffer) - Could accumulate if processing is slower than input
  - opportunityChan (1,000 buffer) - Size is configurable, but fixed once the channel is created
- Opportunity Path Cache:
  opportunityPathCache map[string]*ArbitragePath
  // Grows unbounded - only cleared via deleteOpportunityPath()
  // Should have TTL or max size
- Token Cache:
  tokenCache map[common.Address]TokenPair
  // Grows with each unique pool discovered
  // No eviction policy documented
Recommended: Implement cache eviction with LRU or TTL
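As one possible shape for the max-size option, here is a small LRU cache built on the standard library's `container/list`. It is a sketch: keys and values are plain strings for brevity, and the type names are hypothetical. The bot's caches would key on `common.Address` or a path ID instead.

```go
package main

import (
	"container/list"
	"sync"
)

// entry is the value stored in each list element.
type entry struct {
	key   string
	value string
}

// LRUCache keeps at most capacity entries and evicts the least recently used.
type LRUCache struct {
	mu       sync.Mutex
	capacity int
	order    *list.List               // front = most recently used
	items    map[string]*list.Element // key -> element in order
}

// NewLRUCache returns an empty cache holding at most capacity entries.
func NewLRUCache(capacity int) *LRUCache {
	return &LRUCache{capacity: capacity, order: list.New(), items: make(map[string]*list.Element)}
}

// Get returns the value for key and marks it as recently used.
func (c *LRUCache) Get(key string) (string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	el, ok := c.items[key]
	if !ok {
		return "", false
	}
	c.order.MoveToFront(el)
	return el.Value.(entry).value, true
}

// Put inserts or updates key, evicting the oldest entry when over capacity.
func (c *LRUCache) Put(key, value string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if el, ok := c.items[key]; ok {
		el.Value = entry{key, value}
		c.order.MoveToFront(el)
		return
	}
	c.items[key] = c.order.PushFront(entry{key, value})
	if c.order.Len() > c.capacity {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(entry).key)
	}
}
```

A size bound caps memory regardless of how many pools are discovered. A TTL bounds staleness instead, and the two can be combined.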
6.6 Configuration Complexity
Issue: Multiple overlapping configuration sources
Priority hierarchy:
1. Environment variables (highest)
2. YAML config values
3. Hardcoded defaults
4. Legacy mode handling
Sources:
- .env.production
- config/arbitrum_production.yaml
- Environment variable overrides
- Endpoint configuration defaults
Risk: Configuration mistakes hard to diagnose
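One way to make the priority order easier to diagnose is to resolve every setting through a single function that also reports which source supplied the value. The sketch below covers the first three levels of the hierarchy. The function name and the flat string map standing in for parsed YAML are assumptions, and in real use `env` would be `os.Getenv`.

```go
package main

// resolve returns the first non-empty value in priority order:
// environment variable, YAML value, hardcoded default. It also reports
// which source won, which helps when diagnosing misconfiguration.
func resolve(key string, env func(string) string, yaml map[string]string, def string) (value, source string) {
	if v := env(key); v != "" {
		return v, "env"
	}
	if v, ok := yaml[key]; ok && v != "" {
		return v, "yaml"
	}
	return def, "default"
}
```

Logging the returned source for each key at startup shows at a glance whether a value came from the environment, the YAML file, or a default.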
6.7 Limited Error Recovery
Patterns Found:
- Fallback to Polling:
  // ArbitrumMonitor creation fails → Switch to fallback block polling
  monitor, err := sas.createArbitrumMonitor()
  if err != nil {
  sas.logger.Error("Failed to create monitor")
  sas.fallbackBlockPolling()
  return
  }
- Incomplete Flash Loan Fallback:
  - If flash loan fails, opportunity is lost
  - No fallback to standard swaps
- L2Parser Errors:
  - Some DEX transactions silently fail to parse
  - No comprehensive error logging
7. PERFORMANCE CHARACTERISTICS
7.1 Throughput
Block Processing:
- Input: ~250-300 transactions per block
- DEX transactions identified: ~5-20 per block
- Processing time: 100-500ms per block
- Effective TPS handled: 250-300 tx/block ÷ 3s = 83-100 tx/s monitored
Event Processing:
- Worker pool: 10 concurrent workers
- Average event processing: 50-200ms
- Throughput: ~50 events/sec
- Queue depth: 50,000 transactions buffered
Opportunity Detection:
- Scan time: 500ms-2s per token pair
- Max concurrent scans: 5
- Detection latency: 3-5 seconds from swap to detection
Execution:
- Validation: 100-200ms
- Simulation (optional): 500ms-2s
- Broadcasting: 100-500ms
- Total: 1-5 seconds per opportunity
7.2 Resource Usage
Memory:
- Token cache: ~100 pools × 100 bytes = 10KB
- Opportunity cache: ~100 opportunities × 1KB = 100KB
- Reserve cache: Configurable, ~1-10MB typical
Network:
- Block polling: 1 RPC call per 3 seconds
- Event receipts: ~5-20 calls per block = ~1-7 calls/second
- Pool state queries: Variable, ~5-50 calls/second
- Total: ~50-100 RPC calls/second in active operation
8. DEPLOYMENT & CONFIGURATION
8.1 Environment Setup
Required Files:
.env.production # Environment variables
config/arbitrum_production.yaml # Main configuration
config/providers.yaml # RPC endpoint configuration
keystore/ # Private keys (encrypted)
Required Variables:
GO_ENV=production
ARBITRUM_RPC_ENDPOINT=https://...
ARBITRUM_WS_ENDPOINT=wss://...
MEV_BOT_ENCRYPTION_KEY=...
ETHEREUM_PRIVATE_KEY=...
CONTRACT_ARBITRAGE_EXECUTOR=0x...
CONTRACT_FLASH_SWAPPER=0x...
8.2 Startup Sequence
1. Load environment variables (.env.production)
2. Load YAML configuration
3. Validate RPC endpoints (security check)
4. Initialize logger
5. Initialize provider manager (50,000 TPS capacity)
6. Initialize security manager (optional)
7. Load pool discovery cache (314 pools)
8. ⚠️ SKIP pool discovery loop (prevents startup hang)
9. Initialize arbitrage service
10. Initialize dashboard server (port 8080)
11. Start monitoring goroutines
12. Enter main event loop
9. RECOMMENDATIONS FOR IMPROVEMENT
9.1 Immediate Actions
- Add Cache Eviction to token cache and opportunity cache:
  // Implement TTL-based eviction
  type CachedToken struct {
  Data TokenPair
  ExpiresAt time.Time
  }
- Document Pool Discovery: If re-enabling, add:
  - Progress reporting (1/190 pairs complete)
  - Timeout per token pair (5 seconds)
  - Configurable maximum pairs to discover
- Improve Error Recovery:
  - Retry flash loan failures with different providers
  - Log all DEX parsing failures to separate file
  - Implement circuit breaker for consistently failing pools
9.2 Medium-Term Improvements
- Refactor Configuration:
  - Single source of truth for defaults
  - Configuration validation schema
  - Clear inheritance hierarchy
- Enhance Monitoring:
  - Add Prometheus metrics export
  - Real-time dashboard with WebSocket updates
  - Alert thresholds for error rates
- Optimize Memory:
  - Implement generic cache with configurable eviction
  - Add memory usage monitoring
  - Profile long-running operations
9.3 Long-Term Enhancements
- MEV Protection:
  - Private mempool support
  - Flashbots integration
  - MEV-resistant path selection
- Advanced Scanning:
  - Machine learning for opportunity prediction
  - Real-time liquidity tracking
  - Cross-chain opportunity detection
- Scalability:
  - Horizontal scaling with state synchronization
  - Database abstraction for persistence
  - Distributed task processing
10. CONCLUSION
The MEV bot implements a well-structured, modular architecture with clear separation between:
- Detection (monitoring, event parsing, opportunity scanning)
- Execution (validation, flash loans, transaction submission)
- Infrastructure (configuration, logging, metrics)
Strengths:
- Comprehensive arbitrage detection with multi-hop support
- Robust error handling with fallback mechanisms
- Event-driven cache invalidation
- Multiple validation layers against data corruption
Vulnerabilities:
- Startup hang if the disabled pool discovery loop is re-enabled without fixes
- Unbounded cache growth
- Complex configuration hierarchy
- Limited flash loan fallback strategies
Overall Assessment: Production-ready as long as the pool discovery loop that causes the startup hang stays disabled (the issue is documented in 6.1). The bot is suitable for active MEV extraction on Arbitrum, with ongoing monitoring for edge cases and optimizations.