# MEV Bot Project Specification

**Version:** 2.0 (November 2025 - Current State)
**Language:** Go 1.24+
**Target Chain:** Arbitrum (Layer 2)
**Module:** github.com/fraktal/mev-beta
**Codebase:** 362 Go files, 100,000+ lines of code

---

## 🎯 Project Overview

The MEV Bot is a **production-grade arbitrage detection and analysis system** for the Arbitrum network. It monitors decentralized exchanges (DEXs) in real time using an event-driven architecture to identify profitable arbitrage opportunities across multiple protocols.

### Core Capabilities

- **Real-time Arbitrum Monitoring** with sub-second latency via event-driven processing
- **Multi-Protocol Support** for Uniswap V2/V3, SushiSwap, Curve, Balancer, and more
- **Advanced Transaction Parsing** with ABI decoding for complex multicalls
- **Three-Pool RPC Architecture** separating read-only, execution, and testing workloads
- **Worker Pool Processing** for concurrent event analysis (100+ events/sec capacity)
- **Secure Key Management** with AES-256-GCM encryption and hardware wallet support
- **Production Logging System** with health scoring, analytics, and automated archival

---

## 🏗️ System Architecture

### Layered Architecture (5 Layers)

```
┌─────────────────────────────────────────────────────────────┐
│  Layer 1: Smart Contract Layer                              │
│  - Arbitrage executor contracts (bindings/)                 │
│  - Flash swap executors                                     │
│  - Token and pool interfaces                                │
└─────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────┐
│  Layer 2: Execution Layer                                   │
│  - ArbitrageExecutor (pkg/arbitrage/executor.go - 1,641 LOC)│
│  - FlashSwapExecutor (pkg/arbitrage/flash_executor.go)      │
│  - LiveExecutionFramework (real-time execution)             │
└─────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────┐
│  Layer 3: Detection & Analysis Layer                        │
│  - ArbitrageDetectionEngine (opportunity discovery)         │
│  - MultiHopScanner (multi-hop path finding)                 │
│  - Scanner with worker pools (event processing)             │
│  - DEX protocol implementations (6 protocols)               │
└─────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────┐
│  Layer 4: Event Collection & Parsing Layer                  │
│  - ArbitrumMonitor (sequencer monitoring - 1,351 LOC)       │
│  - L2Parser (transaction parsing - 1,985 LOC)               │
│  - AbiDecoder (multicall decoding - 1,116 LOC)              │
│  - EventParser (log parsing - 1,806 LOC)                    │
└─────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────┐
│  Layer 5: Infrastructure Layer                              │
│  - UnifiedProviderManager (3-pool RPC architecture)         │
│  - PoolDiscovery (cache-based pool management)              │
│  - KeyManager (secure signing - 1,841 LOC)                  │
│  - RateLimiter (per-endpoint limiting)                      │
└─────────────────────────────────────────────────────────────┘
```

### Three-Pool RPC Architecture

The system uses **three separate RPC endpoint pools** so that each workload has its own rate limit and failover path:

```
UnifiedProviderManager
├─ ReadOnlyPool (50 RPS max)
│  └─ Used for: getBalance, call, getLogs, getCode
│  └─ High throughput for read-heavy operations
│
├─ ExecutionPool (20 RPS max)
│  └─ Used for: sendTransaction
│  └─ Reliable endpoints with lower limits
│
└─ TestingPool (10 RPS max)
   └─ Used for: simulation, callStatic
   └─ Isolated from production workload
```

**Benefits:**
- Execution transactions are never rate-limited by read operations
- Independent failover per pool
- Rate limits tuned to each endpoint's capability
- Health checks and automatic endpoint rotation

---

## 📊 Core Components

### 1. Arbitrage Service (`pkg/arbitrage/` - 17 files, 7,000+ LOC)

**Primary Components:**
- **ArbitrageService** (service.go - 1,995 LOC) - Main orchestration service
- **ArbitrageExecutor** (executor.go - 1,641 LOC) - Transaction execution
- **FlashSwapExecutor** (flash_executor.go - 1,462 LOC) - Flash swap logic
- **MultiHopScanner** (multihop.go - 892 LOC) - Multi-hop path detection
- **DetectionEngine** (detection_engine.go - 953 LOC) - Opportunity discovery
- **LiveExecutionFramework** (1,005 LOC) - Real-time execution
- **NonceManager** (3,843 LOC) - Transaction nonce management
- **Database** (13,129 LOC) - Opportunity persistence

**Key Features:**
- Event-driven arbitrage detection
- Multi-hop route optimization
- Gas-aware profit calculation
- Confidence scoring and risk assessment
- Real-time opportunity ranking

### 2. Arbitrum Integration (`pkg/arbitrum/` - 34 files, 8,000+ LOC)

**Primary Components:**
- **L2Parser** (l2_parser.go - 1,985 LOC) - Advanced transaction parsing
- **AbiDecoder** (abi_decoder.go - 1,116 LOC) - Multicall decoding
- **Parser** (parser.go - 967 LOC) - Basic transaction parsing
- **ConnectionManager** (connection.go - 266 LOC) - RPC management
- **SwapPipeline** (swap_pipeline.go - 844 LOC) - Swap processing
- **EventMonitor** (event_monitor.go - 658 LOC) - Event monitoring

**Capabilities:**
- Handles complex multicall transactions
- Supports 10+ DEX router patterns
- Extracts token addresses and swap amounts
- ~90% parsing success rate on production data
- Graceful fallback for unknown patterns

### 3. Market Monitoring (`pkg/monitor/` - 1,351 LOC)

**ArbitrumMonitor:**
- WebSocket subscription to the Arbitrum sequencer
- High-throughput transaction processing (50,000 buffer)
- Automatic RPC failover and health monitoring
- Rate limiting and connection management
- Feeds parsed transactions to the scanner

**Performance:**
- Processing: ~3-4 blocks/second sustained
- Latency: Sub-second block processing
- Uptime: 27+ minutes continuous (validated)

### 4. Scanner System (`pkg/scanner/` - 5 subdirectories)

**Architecture:**

```
Scanner (concurrent.go)
├─ Worker Pool Pattern
│  ├─ Configurable worker count (4-8 default)
│  ├─ Non-blocking channel communication
│  └─ Graceful shutdown with WaitGroup
│
├─ MarketScanner (market/)
│  └─ Token pair and pool analysis
│
├─ SwapAnalyzer (swap/)
│  └─ Swap event detection and analysis
│
└─ LiquidityAnalyzer (analysis/)
   └─ Liquidity change calculations
```

**Performance:**
- Throughput: 100+ events/second with 4-8 workers
- Latency: ~10-50ms per event analysis
- Concurrency: Independent worker processing

### 5. DEX Protocol Support (`pkg/dex/` - 11 files)

| Protocol | Implementation | Fee Structure | Math Type |
|----------|---------------|---------------|-----------|
| Uniswap V3 | uniswap_v3.go | 0.05%-1% | Concentrated liquidity, tick-based |
| Uniswap V2 | dex/ | 0.3% | Constant product (x×y=k) |
| SushiSwap | sushiswap.go | 0.3% | V2-compatible |
| Curve | curve.go | 0.04% | StableSwap invariant |
| Balancer | balancer.go | 0.3% | Weighted pool formula |
| 1inch | (referenced) | Variable | Aggregator support |

**Protocol-Specific Features:**
- V3: Tick-based price ranges with sqrt price math
- V2: Classic AMM formula with fee deduction
- Curve: Low-slippage stablecoin swaps
- Balancer: Multi-token weighted pools

### 6. Security & Key Management (`pkg/security/` - 11 files, 5,000+ LOC)

**Components:**
- **KeyManager** (keymanager.go - 1,841 LOC) - Secure key generation, storage, signing
- **RateLimiter** (rate_limiter.go - 1,411 LOC) - DoS protection
- **AuditAnalyzer** (audit_analyzer.go - 1,646 LOC) - Audit logging
- **PerformanceProfiler** (1,316 LOC) - Performance metrics
- **AnomalyDetector** (1,069 LOC) - Suspicious activity detection

**Security Features:**
- AES-256-GCM encryption for private keys
- Hardware wallet support
- Automatic key rotation
- Comprehensive audit logging
- Rate limiting at multiple levels

---

## 🔄 Data Flow & Processing Pipeline

### Complete Processing Flow

```
1. Arbitrum Block Stream (WebSocket)
   ↓
2. ArbitrumMonitor.Start()
   - Subscribes to new blocks
   - Fetches block transactions
   ↓
3. L2Parser.ParseTransaction()
   - Decodes multicall with AbiDecoder
   - Extracts function calls
   - Identifies swap operations
   ↓
4. EventParser.ParseEvents()
   - Decodes transaction receipt logs
   - Extracts swap/liquidity events
   - Parses pool state changes
   ↓
5. Scanner.ProcessEvents()
   - Dispatches to worker pool
   - MarketScanner analyzes token pairs
   - SwapAnalyzer detects arbitrage patterns
   - LiquidityAnalyzer calculates impacts
   ↓
6. ArbitrageService monitors results
   - MultiHopScanner finds optimal paths
   - DetectionEngine ranks opportunities
   - Filters by confidence and profitability
   ↓
7. ArbitrageExecutor.ExecuteArbitrage()
   - Simulates transaction
   - Estimates gas costs
   - Validates profitability
   - Signs with KeyManager
   - Submits to Arbitrum
   ↓
8. Results logged and persisted
```

### Performance Characteristics

**Latency Breakdown (Block → Detection):**

```
1. Receive block:              ~1ms
2. Fetch transaction:          ~50-100ms (RPC)
3. Fetch receipt:              ~50-100ms (RPC)
4. Parse transaction (ABI):    ~10-50ms (CPU)
5. Parse events:               ~5-20ms (CPU)
6. Analyze events (scanner):   ~10-50ms (CPU)
7. Detect arbitrage:           ~20-100ms (CPU + RPC)
─────────────────────────────────────────────
Total: ~150-450ms from block to detection
```

**Observation:** RPC calls dominate latency, not CPU processing.

---

## ⚙️ Configuration Management

### Configuration Hierarchy

```
1. YAML Configuration Files (Base)
   ├─ config/arbitrum_production.yaml (tokens, DEX configs)
   ├─ config/providers.yaml (RPC endpoint pools)
   └─ config/providers_runtime.yaml (runtime overrides)

2. Environment Variables (Override)
   ├─ GO_ENV (development|staging|production)
   ├─ MEV_BOT_ENCRYPTION_KEY (required)
   ├─ ARBITRUM_RPC_ENDPOINT
   ├─ ARBITRUM_WS_ENDPOINT
   └─ LOG_LEVEL, DEBUG, METRICS_ENABLED

3. Runtime Configuration (Programmatic)
   ├─ Per-endpoint overrides
   └─ Dynamic endpoint switching
```

### Production Configuration Example

**config/arbitrum_production.yaml:**

```yaml
tokens:
  weth:
    address: "0x82aF49447D8a07e3bd95BD0d56f35241523fBab1"
    decimals: 18
    coingecko_id: "weth"
  usdc:
    address: "0xaf88d065e77c8cC2239327C5EDb3A432268e5831"
    decimals: 6
    is_stable: true
  # 20+ major tokens defined

dex_configs:
  uniswap_v3:
    factory: "0x1F98431c8aD98523631AE4a59f267346ea31F984"
    router: "0xE592427A0AEce92De3Edee1F18E0157C05861564"
    fee_tiers: [500, 3000, 10000]

arbitrage:
  min_profit_threshold: "0.001"  # 0.1%
  max_slippage: "0.005"          # 0.5%
  max_gas_price: "50000000000"   # 50 gwei
  max_position_size: "100000000000000000000"  # 100 ETH
```

**config/providers.yaml:**

```yaml
read_only_pool:
  endpoints:
    - url: "https://arbitrum-mainnet.core.chainstack.com/..."
      name: "chainstack-primary"
      priority: 1
      max_rps: 50
      timeout: "10s"
    - url: "https://arb1.arbitrum.io/rpc"
      name: "arbitrum-public"
      priority: 2
      max_rps: 30

execution_pool:
  endpoints:
    - url: "https://arbitrum-mainnet.core.chainstack.com/..."
      priority: 1
      max_rps: 20

testing_pool:
  endpoints:
    - url: "https://arbitrum-mainnet.core.chainstack.com/..."
      priority: 1
      max_rps: 10
```

---

## 📈 Production Status & Performance

### Current Implementation Status

**✅ Production Ready:**
- Real-time transaction parsing (~90% success rate)
- Event processing (100+ events/sec)
- Multi-protocol support (6 DEX protocols)
- Rate limiting and failover
- Secure key management
- Production logging with health scoring

**⚠️ Partially Disabled (Workarounds Active):**
- Pool discovery background task (cache-only; 314 pools loaded)
- Security manager (KeyManager works independently)

**❌ Not Implemented:**
- MEV protection (Flashbots, MEV-Share)
- Multi-chain support (Arbitrum only)
- Persistent opportunity database
- Machine learning-based detection

### Performance Metrics (Validated)

| Metric | Value | Source |
|--------|-------|--------|
| Startup Time | ~30 seconds | With pool cache |
| Event Processing | 100+ events/sec | Worker pool capacity |
| Detection Latency | 150-450ms | Block to opportunity |
| Memory Baseline | ~200MB | Pool cache + state |
| Memory Peak | ~500MB | Full operation |
| Health Score | 97.97/100 | Log analytics system |
| Error Rate | 2.03% | Log analysis |
| Parsing Success | ~90% | Transaction decoding |
| Uptime | 27+ minutes | Validated continuous |

### System Requirements

**Minimum:**
- CPU: 2+ cores for concurrent processing
- RAM: 4GB+ for transaction buffering
- Network: Stable WebSocket connection
- Storage: 10GB+ for logs

**Recommended:**
- CPU: 4+ cores for optimal worker pools
- RAM: 8GB+ for larger pool cache
- Network: Multiple RPC providers for redundancy
- Storage: 50GB+ for long-term logging

---

## 🔬 Testing Infrastructure

### Test Organization

```
tests/
├── integration/
│   ├── fork_test.go           # Arbitrum fork testing
│   └── [other tests]
├── cache/                     # Cache-related tests
├── contracts/                 # Contract interaction tests
└── scenarios/                 # Test scenarios

pkg/**/..._test.go             # Unit tests colocated with source
```

### Test Coverage

**Unit Tests:**
- arbitrage/: flash_executor_test.go, multihop_test.go
- arbitrum/: connection_test.go, parser_test.go, abi_fuzz_test.go
- scanner/: concurrent_test.go
- security/: keymanager_test.go
- validation/: pool_validator_test.go (1,155 lines)

**Integration Tests:**
- End-to-end transaction processing
- Multi-protocol detection accuracy
- Cross-protocol arbitrage detection

**Build & Test Commands:**

```bash
make build              # Compile binary
make test               # Run all tests
make test-coverage      # Generate coverage report
make test-integration   # Integration tests only
make lint               # Run golangci-lint
make security-scan      # Security analysis (gosec)
```

---

## 🚀 Deployment Guide

### Prerequisites

```bash
# 1. Go 1.24 or later
go version

# 2. Create encryption key (16 random bytes, hex-encoded to a 32-character string)
openssl rand -hex 16 > .env.encryption_key

# 3. Setup keystore
mkdir -p keystore
chmod 700 keystore

# 4. Configure environment
export GO_ENV=production
export MEV_BOT_ENCRYPTION_KEY=$(cat .env.encryption_key)
export ARBITRUM_RPC_ENDPOINT="wss://your-endpoint"
```

### Quick Start

```bash
# Build the binary
make build

# Run the bot
./bin/mev-bot start

# Or with explicit config
GO_ENV=production ./bin/mev-bot start
```

### Production Deployment

**1. Configuration:**

```bash
# Copy and customize production configs
cp config/arbitrum_production.yaml config/arbitrum_production.yaml.local
cp config/providers.yaml config/providers.yaml.local

# Edit with actual RPC endpoints and API keys
vim config/arbitrum_production.yaml.local
vim config/providers.yaml.local
```

**2. Environment Setup:**

```bash
# Create .env.production file
cat > .env.production <<EOF
MEV_BOT_KEYSTORE_PATH=keystore
ARBITRUM_RPC_ENDPOINT=wss://...
ARBITRUM_WS_ENDPOINT=wss://...
LOG_LEVEL=info
METRICS_ENABLED=true
EOF
```

**3. 
Start Service:**

```bash
# Load environment and start.
# set -a exports every variable defined while sourcing, so the bot process inherits them.
set -a
source .env.production
set +a
./bin/mev-bot start
```

### Monitoring & Health Checks

**Production Logging System:**

```
logs/
├── mev_bot.log                 # Main application log
├── mev_bot_errors.log          # Error-specific log
├── mev_bot_performance.log     # Performance metrics
├── analytics/                  # Real-time analysis
│   ├── analysis_*.json         # Comprehensive metrics
│   └── dashboard_*.html        # Operations dashboard
├── health/                     # Health monitoring
│   └── health_*.json           # Health reports (97.97/100)
├── archives/                   # Compressed rotated logs
└── rotated/                    # Rotated log files
```

**Health Check Commands:**

```bash
# Real-time analysis with health scoring
./scripts/log-manager.sh analyze

# Check system health
./scripts/log-manager.sh health

# Full management cycle
./scripts/log-manager.sh full

# Start background monitoring daemon
./scripts/log-manager.sh start-daemon
```

**Alert Thresholds:**
- Error rate > 10% = Critical
- Health score < 80 = Warning
- Zero opportunities detected for >1 hour = Investigation needed
- Memory usage > 750MB = Pool pruning required

---

## 🔒 Security Considerations

### Production Security

**Key Management:**
- AES-256-GCM encryption for all private keys
- Secure key derivation from master password
- Automatic key rotation support
- Hardware wallet integration ready

**Input Validation:**
- All external data validated before processing
- Token address validation (checksum, zero-address checks)
- Amount bounds checking (overflow protection)
- Gas price limits (max 50 gwei default)

**Rate Limiting:**
- Per-endpoint rate limits (configurable)
- Global transaction rate limiting
- Burst allowances for spike handling
- Automatic backoff on 429 responses
- Circuit breakers on repeated failures

### Risk Management

**Execution Safeguards:**
- Configurable slippage protection (0.5% default max)
- Maximum transaction value limits (100 ETH default)
- Profit validation after gas costs
- Simulation before actual execution
- Confidence scoring 
(0.0-1.0 scale)

**Error Handling:**
- Comprehensive error handling at all layers
- Automatic retry with exponential backoff
- Fallback RPC providers
- Graceful degradation on failures

---

## 📝 Known Limitations & Future Enhancements

### Current Limitations

**1. Pool Discovery:**
- Background discovery disabled (prevents startup hang)
- Relies on cached pool data (314 pools)
- No automatic new pool detection
- **Workaround:** Manual cache updates or restart

**2. Security Manager:**
- Comprehensive security manager disabled for debugging
- KeyManager works independently
- Missing some advanced security features

**3. MEV Protection:**
- No Flashbots integration
- No MEV-Share participation
- Transactions visible on public mempool
- Vulnerable to sandwich attacks

**4. Single Chain:**
- Arbitrum only (no Ethereum, Optimism, Base, etc.)
- No cross-chain arbitrage
- No bridge monitoring

**5. In-Memory State:**
- No persistent opportunity database
- Restarts lose historical context
- Limited long-term analytics

### Planned Enhancements

**High Priority:**
- [ ] Re-enable pool discovery (fix hang issue)
- [ ] Re-enable security manager (identify and fix cause)
- [ ] Add persistent PostgreSQL database
- [ ] Implement MEV protection (Flashbots)
- [ ] Add Prometheus metrics export

**Medium Priority:**
- [ ] Multi-chain support (Optimism, Base)
- [ ] Flash loan integration (capital-free arbitrage)
- [ ] Machine learning opportunity prediction
- [ ] Advanced gas optimization
- [ ] WebSocket dashboard

**Low Priority:**
- [ ] MEV-Share integration
- [ ] Cross-chain bridge monitoring
- [ ] Collaborative MEV strategies
- [ ] Historical replay capability

---

## 📚 Documentation

### Documentation Structure

```
docs/
├── CODEBASE_EXPLORATION_COMPLETE.md   # Complete codebase analysis
├── IMPLEMENTATION_INSIGHTS.md         # What code actually does
├── CODEBASE_QUICK_REFERENCE.md        # Quick reference guide
├── CODEBASE_EXPLORATION_INDEX.md      # Navigation index
├── DEVELOPER_DOCS.md                  # Developer documentation
├── MONITORING_GUIDE.md                # Monitoring and operations
├── QUICK_START.md                     # Quick start guide
└── [100+ additional docs]             # Historical and specialized docs
```

### Key Documentation Files

**For New Developers:**
1. CODEBASE_QUICK_REFERENCE.md - Start here
2. CODEBASE_EXPLORATION_COMPLETE.md - Deep dive
3. DEVELOPER_DOCS.md - Development guidelines

**For Operations:**
1. MONITORING_GUIDE.md - Production monitoring
2. Log manager scripts (scripts/log-manager.sh)
3. Health check procedures

**For Understanding Architecture:**
1. IMPLEMENTATION_INSIGHTS.md - Reality vs documentation
2. CODEBASE_EXPLORATION_INDEX.md - Component navigation
3. This specification (PROJECT_SPECIFICATION.md)

---

## 🎯 Getting Started

### For Developers

**1. Understand the codebase:**

```bash
# Read these in order:
cat docs/CODEBASE_QUICK_REFERENCE.md
cat docs/IMPLEMENTATION_INSIGHTS.md
cat docs/CODEBASE_EXPLORATION_COMPLETE.md
```

**2. Build and test:**

```bash
make build
make test
```

**3. Run in development:**

```bash
export GO_ENV=development
export MEV_BOT_ENCRYPTION_KEY=$(openssl rand -hex 16)
./bin/mev-bot start
```

### For Operations

**1. Deploy to production:**

```bash
# Follow deployment guide above.
# set -a exports every variable defined while sourcing, so the bot process inherits them.
set -a
source .env.production
set +a
./bin/mev-bot start
```

**2. Monitor health:**

```bash
# Check health score (target: >95)
./scripts/log-manager.sh health

# Real-time monitoring
./scripts/log-manager.sh start-daemon
```

**3. 
Troubleshoot issues:**

```bash
# Analyze logs
./scripts/log-manager.sh analyze

# View latest errors
tail -100 logs/mev_bot_errors.log

# Check the debug checkpoints emitted by main.go (20 total)
grep "CHECKPOINT" logs/mev_bot.log
```

---

## 📊 Performance Expectations

### MEV Profit Expectations (Arbitrum Realistic)

**Based on current market conditions:**
- **Arbitrage Frequency:** 5-20 opportunities per day (market dependent)
- **Profit per Trade:** 0.1-0.5% typical ($1-$5 on $1,000 capital)
- **Daily Target:** $10-$200 with moderate capital and optimal conditions
- **Time to First Detection:** ~30 seconds from startup
- **Time to First Opportunity:** 30-60 minutes (market dependent)

**Note:** These are detection rates. Actual execution profits depend on:
- Gas costs (50-150k gas per execution)
- Slippage during execution
- Competition from other MEV bots
- Market volatility

---

## 🔗 External Dependencies

### Go Module Dependencies

**Primary:**
- github.com/ethereum/go-ethereum v1.16.3 (Ethereum client library)
- github.com/gorilla/websocket v1.5.3 (WebSocket support)
- github.com/holiman/uint256 v1.3.2 (256-bit integers)
- github.com/urfave/cli/v2 v2.27.5 (CLI framework)
- gopkg.in/yaml.v3 (YAML parsing)

**Database:**
- github.com/lib/pq v1.10.9 (PostgreSQL - optional)
- github.com/mattn/go-sqlite3 v1.14.32 (SQLite - optional)

**Security:**
- golang.org/x/crypto v0.42.0 (Cryptography)
- golang.org/x/time v0.10.0 (Rate limiting)

**Testing:**
- github.com/stretchr/testify v1.11.1 (Test assertions)

### Smart Contract Dependencies

**Generated Bindings (bindings/):**
- Arbitrage Executor contract
- Flash Swap contracts (Uniswap V2/V3)
- ERC20 token interface
- Uniswap V3 Pool interface
- Balancer Vault interface

---

## 📌 Summary

The MEV Bot is a **sophisticated, production-grade system** with:

**✓ Strengths:**
- Modular, testable architecture (5 layers, 47 packages)
- Production-ready security infrastructure
- Multi-protocol DEX support (6 protocols)
- Intelligent 
rate limiting and failover
- Robust error handling and recovery
- Real-time health monitoring (97.97/100 score)
- Comprehensive logging and analytics

**⚠️ Pragmatic Limitations:**
- Pool discovery disabled (uses cache: 314 pools)
- Security manager disabled (KeyManager works)
- No MEV protection (public mempool)
- Single-chain only (Arbitrum)
- In-memory state (no persistence)

**Status:** **Ready for production** with current architecture (cache-based pools, independent KeyManager). Some advanced features disabled pending fixes (pool discovery, security manager).

**Recommended Use:** Detection and analysis system. Execution capability exists but needs careful testing before live trading.

---

**Last Updated:** November 2025
**Documentation Version:** 2.0 (reflects actual codebase state)
**Codebase Version:** See git commit history for changes