MEV Bot Project Specification
Version: 2.0 (November 2025 - Current State)
Language: Go 1.24+
Target Chain: Arbitrum (Layer 2)
Module: github.com/fraktal/mev-beta
Codebase: 362 Go files, ~100,000+ lines of code
🎯 Project Overview
The MEV Bot is a production-grade arbitrage detection and analysis system for the Arbitrum network. It monitors decentralized exchanges (DEXs) in real-time using an event-driven architecture to identify profitable arbitrage opportunities across multiple protocols.
Core Capabilities
- Real-time Arbitrum Monitoring with sub-second latency via event-driven processing
- Multi-Protocol Support for Uniswap V2/V3, SushiSwap, Curve, Balancer, and more
- Advanced Transaction Parsing with sophisticated ABI decoding for complex multicalls
- Three-Pool RPC Architecture separating read-only, execution, and testing workloads
- Worker Pool Processing for concurrent event analysis (100+ events/sec capacity)
- Secure Key Management with AES-256-GCM encryption and hardware wallet support
- Production Logging System with health scoring, analytics, and automated archival
🏗️ System Architecture
Layered Architecture (5 Layers)
┌─────────────────────────────────────────────────────────────┐
│ Layer 1: Smart Contract Layer │
│ - Arbitrage executor contracts (bindings/) │
│ - Flash swap executors │
│ - Token and pool interfaces │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ Layer 2: Execution Layer │
│ - ArbitrageExecutor (pkg/arbitrage/executor.go - 1,641 LOC)│
│ - FlashSwapExecutor (pkg/arbitrage/flash_executor.go) │
│ - LiveExecutionFramework (real-time execution) │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ Layer 3: Detection & Analysis Layer │
│ - ArbitrageDetectionEngine (opportunity discovery) │
│ - MultiHopScanner (multi-hop path finding) │
│ - Scanner with worker pools (event processing) │
│ - DEX protocol implementations (6 protocols) │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ Layer 4: Event Collection & Parsing Layer │
│ - ArbitrumMonitor (sequencer monitoring - 1,351 LOC) │
│ - L2Parser (transaction parsing - 1,985 LOC) │
│ - AbiDecoder (multicall decoding - 1,116 LOC) │
│ - EventParser (log parsing - 1,806 LOC) │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ Layer 5: Infrastructure Layer │
│ - UnifiedProviderManager (3-pool RPC architecture) │
│ - PoolDiscovery (cache-based pool management) │
│ - KeyManager (secure signing - 1,841 LOC) │
│ - RateLimiter (per-endpoint limiting) │
└─────────────────────────────────────────────────────────────┘
Three-Pool RPC Architecture
The system uses three separate RPC endpoint pools so that each workload gets its own rate limit and failover path:
UnifiedProviderManager
├─ ReadOnlyPool (50 RPS max)
│ └─ Used for: getBalance, call, getLogs, getCode
│ └─ High throughput for read-heavy operations
│
├─ ExecutionPool (20 RPS max)
│ └─ Used for: sendTransaction
│ └─ Reliable endpoints with lower limits
│
└─ TestingPool (10 RPS max)
└─ Used for: simulation, callStatic
└─ Isolated from production workload
Benefits:
- Execution transactions never rate-limited by read operations
- Independent failover per pool
- Optimized rate limits per endpoint capability
- Health checks and automatic endpoint rotation
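The routing idea behind the three pools can be sketched as follows. This is a minimal illustration, not the actual UnifiedProviderManager code: the `Pool` type, the `poolForMethod` function, and the `simulate` flag are hypothetical names, and the mapping of JSON-RPC methods to pools is an assumption based on the diagram above.

```go
package main

import "fmt"

// Pool identifies one of the three endpoint pools (hypothetical type).
type Pool string

const (
	ReadOnlyPool  Pool = "read_only"
	ExecutionPool Pool = "execution"
	TestingPool   Pool = "testing"
)

// poolForMethod picks a pool for a JSON-RPC method. Simulations always go to
// the testing pool; transaction submission goes to the execution pool;
// everything else is treated as a read. The mapping is an assumption.
func poolForMethod(method string, simulate bool) Pool {
	if simulate {
		return TestingPool
	}
	switch method {
	case "eth_sendRawTransaction", "eth_sendTransaction":
		return ExecutionPool
	default:
		return ReadOnlyPool
	}
}

func main() {
	fmt.Println(poolForMethod("eth_getLogs", false))
	fmt.Println(poolForMethod("eth_sendRawTransaction", false))
	fmt.Println(poolForMethod("eth_call", true))
}
```

Because the pool is chosen before any rate limiter is consulted, a burst of reads can only exhaust the read-only budget and never delays a transaction submission.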
📊 Core Components
1. Arbitrage Service (pkg/arbitrage/ - 17 files, 7,000+ LOC)
Primary Components:
- ArbitrageService (service.go - 1,995 LOC) - Main orchestration service
- ArbitrageExecutor (executor.go - 1,641 LOC) - Transaction execution
- FlashSwapExecutor (flash_executor.go - 1,462 LOC) - Flash swap logic
- MultiHopScanner (multihop.go - 892 LOC) - Multi-hop path detection
- DetectionEngine (detection_engine.go - 953 LOC) - Opportunity discovery
- LiveExecutionFramework (1,005 LOC) - Real-time execution
- NonceManager (3,843 LOC) - Transaction nonce management
- Database (13,129 LOC) - Opportunity persistence
Key Features:
- Event-driven arbitrage detection
- Multi-hop route optimization
- Gas-aware profit calculation
- Confidence scoring and risk assessment
- Real-time opportunity ranking
2. Arbitrum Integration (pkg/arbitrum/ - 34 files, 8,000+ LOC)
Primary Components:
- L2Parser (l2_parser.go - 1,985 LOC) - Advanced transaction parsing
- AbiDecoder (abi_decoder.go - 1,116 LOC) - Multicall decoding
- Parser (parser.go - 967 LOC) - Basic transaction parsing
- ConnectionManager (connection.go - 266 LOC) - RPC management
- SwapPipeline (swap_pipeline.go - 844 LOC) - Swap processing
- EventMonitor (event_monitor.go - 658 LOC) - Event monitoring
Capabilities:
- Handles complex multicall transactions
- Supports 10+ DEX router patterns
- Extracts token addresses and swap amounts
- ~90% parsing success rate on production data
- Graceful fallback for unknown patterns
3. Market Monitoring (pkg/monitor/ - 1,351 LOC)
ArbitrumMonitor:
- WebSocket subscription to Arbitrum sequencer
- High-throughput transaction processing (50,000 buffer)
- Automatic RPC failover and health monitoring
- Rate limiting and connection management
- Feeds parsed transactions to scanner
Performance:
- Processing: ~3-4 blocks/second sustained
- Latency: Sub-second block processing
- Uptime: 27+ minutes continuous (validated)
4. Scanner System (pkg/scanner/ - 5 subdirectories)
Architecture:
Scanner (concurrent.go)
├─ Worker Pool Pattern
│ ├─ Configurable worker count (4-8 default)
│ ├─ Non-blocking channel communication
│ └─ Graceful shutdown with WaitGroup
│
├─ MarketScanner (market/)
│ └─ Token pair and pool analysis
│
├─ SwapAnalyzer (swap/)
│ └─ Swap event detection and analysis
│
└─ LiquidityAnalyzer (analysis/)
└─ Liquidity change calculations
Performance:
- Throughput: 100+ events/second with 4-8 workers
- Latency: ~10-50ms per event analysis
- Concurrency: Independent worker processing
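The worker pool pattern described above can be sketched in a few lines of Go. This is an illustrative sketch only: `Event`, `processEvents`, and the `analyze` callback are hypothetical names, and the real scanner in concurrent.go may be structured differently.

```go
package main

import (
	"fmt"
	"sync"
)

// Event is a hypothetical stand-in for a parsed DEX event.
type Event struct {
	ID int
}

// processEvents fans events out to a fixed number of workers and returns how
// many events were handled. The jobs channel is buffered to len(events) so
// the producer never blocks; the WaitGroup provides the graceful shutdown.
func processEvents(events []Event, workers int, analyze func(Event)) int {
	jobs := make(chan Event, len(events))
	var wg sync.WaitGroup
	var mu sync.Mutex
	handled := 0

	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for ev := range jobs {
				analyze(ev)
				mu.Lock()
				handled++
				mu.Unlock()
			}
		}()
	}

	for _, ev := range events {
		jobs <- ev
	}
	close(jobs) // signals workers to exit once the queue is drained
	wg.Wait()
	return handled
}

func main() {
	events := make([]Event, 100)
	for i := range events {
		events[i] = Event{ID: i}
	}
	n := processEvents(events, 4, func(Event) {})
	fmt.Println("handled:", n)
}
```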
5. DEX Protocol Support (pkg/dex/ - 11 files)
| Protocol | Implementation | Fee Structure | Math Type |
|---|---|---|---|
| Uniswap V3 | uniswap_v3.go | 0.05%-1% | Concentrated liquidity, tick-based |
| Uniswap V2 | dex/ | 0.3% | Constant product (x×y=k) |
| SushiSwap | sushiswap.go | 0.3% | V2-compatible |
| Curve | curve.go | 0.04% | StableSwap invariant |
| Balancer | balancer.go | 0.3% | Weighted pool formula |
| 1inch | (referenced) | Variable | Aggregator support |
Protocol-Specific Features:
- V3: Tick-based price ranges with sqrt price math
- V2: Classic AMM formula with fee deduction
- Curve: Low-slippage stablecoin swaps
- Balancer: Multi-token weighted pools
6. Security & Key Management (pkg/security/ - 11 files, 5,000+ LOC)
Components:
- KeyManager (keymanager.go - 1,841 LOC) - Secure key generation, storage, signing
- RateLimiter (rate_limiter.go - 1,411 LOC) - DoS protection
- AuditAnalyzer (audit_analyzer.go - 1,646 LOC) - Audit logging
- PerformanceProfiler (1,316 LOC) - Performance metrics
- AnomalyDetector (1,069 LOC) - Suspicious activity detection
Security Features:
- AES-256-GCM encryption for private keys
- Hardware wallet support
- Automatic key rotation
- Comprehensive audit logging
- Rate limiting at multiple levels
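For readers unfamiliar with AES-256-GCM, the following sketch shows how a private key could be sealed and opened using only the Go standard library. The `encrypt`, `decrypt`, and `newGCM` helpers are hypothetical and do not describe how KeyManager actually stores keys; in particular, key derivation from a password is out of scope here.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// newGCM builds an AES-256-GCM AEAD; the key must be exactly 32 bytes.
func newGCM(key []byte) (cipher.AEAD, error) {
	if len(key) != 32 {
		return nil, errors.New("AES-256 requires a 32-byte key")
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// encrypt returns nonce || ciphertext || tag, using a fresh random nonce.
func encrypt(key, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// decrypt splits off the nonce and authenticates before returning plaintext.
func decrypt(key, sealed []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(sealed) < gcm.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	nonce, ct := sealed[:gcm.NonceSize()], sealed[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ct, nil)
}

func main() {
	key := make([]byte, 32)
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	sealed, err := encrypt(key, []byte("example-private-key"))
	if err != nil {
		panic(err)
	}
	plain, err := decrypt(key, sealed)
	fmt.Println(string(plain), err)
}
```

GCM authenticates the ciphertext, so a modified keystore entry fails to decrypt instead of yielding a corrupted key.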
🔄 Data Flow & Processing Pipeline
Complete Processing Flow
1. Arbitrum Block Stream (WebSocket)
↓
2. ArbitrumMonitor.Start()
- Subscribes to new blocks
- Fetches block transactions
↓
3. L2Parser.ParseTransaction()
- Decodes multicall with AbiDecoder
- Extracts function calls
- Identifies swap operations
↓
4. EventParser.ParseEvents()
- Decodes transaction receipt logs
- Extracts swap/liquidity events
- Parses pool state changes
↓
5. Scanner.ProcessEvents()
- Dispatches to worker pool
- MarketScanner analyzes token pairs
- SwapAnalyzer detects arbitrage patterns
- LiquidityAnalyzer calculates impacts
↓
6. ArbitrageService monitors results
- MultiHopScanner finds optimal paths
- DetectionEngine ranks opportunities
- Filters by confidence and profitability
↓
7. ArbitrageExecutor.ExecuteArbitrage()
- Simulates transaction
- Estimates gas costs
- Validates profitability
- Signs with KeyManager
- Submits to Arbitrum
↓
8. Results logged and persisted
Performance Characteristics
Latency Breakdown (Block → Detection):
1. Receive block: ~1ms
2. Fetch transaction: ~50-100ms (RPC)
3. Fetch receipt: ~50-100ms (RPC)
4. Parse transaction (ABI): ~10-50ms (CPU)
5. Parse events: ~5-20ms (CPU)
6. Analyze events (scanner): ~10-50ms (CPU)
7. Detect arbitrage: ~20-100ms (CPU + RPC)
─────────────────────────────────────────────
Total: ~150-450ms from block to detection
Observation: RPC calls dominate latency, not CPU processing.
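The total can be checked by summing the per-stage ranges. The sketch below does that and also isolates the two pure RPC fetches; the `stage` type and the classification of stages as RPC or not are assumptions made for this illustration (stage 7 mixes CPU and RPC and is counted as non-RPC here).

```go
package main

import "fmt"

// stage is one row of the latency breakdown, in milliseconds.
type stage struct {
	name         string
	minMs, maxMs int
	rpc          bool // true for the two pure RPC round trips
}

var stages = []stage{
	{"receive block", 1, 1, false},
	{"fetch transaction", 50, 100, true},
	{"fetch receipt", 50, 100, true},
	{"parse transaction", 10, 50, false},
	{"parse events", 5, 20, false},
	{"analyze events", 10, 50, false},
	{"detect arbitrage", 20, 100, false},
}

// sum adds up the min and max latency, optionally for RPC stages only.
func sum(rpcOnly bool) (lo, hi int) {
	for _, s := range stages {
		if rpcOnly && !s.rpc {
			continue
		}
		lo += s.minMs
		hi += s.maxMs
	}
	return lo, hi
}

func main() {
	lo, hi := sum(false)
	rlo, rhi := sum(true)
	fmt.Printf("total: %d-%d ms, RPC fetches: %d-%d ms\n", lo, hi, rlo, rhi)
}
```

The listed ranges sum to 146-421 ms, which the document rounds to roughly 150-450 ms. The two fetches alone account for 100-200 ms, which is why reducing round trips matters more than optimizing parsing.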
⚙️ Configuration Management
Configuration Hierarchy
1. YAML Configuration Files (Base)
├─ config/arbitrum_production.yaml (tokens, DEX configs)
├─ config/providers.yaml (RPC endpoint pools)
└─ config/providers_runtime.yaml (runtime overrides)
2. Environment Variables (Override)
├─ GO_ENV (development|staging|production)
├─ MEV_BOT_ENCRYPTION_KEY (required)
├─ ARBITRUM_RPC_ENDPOINT
├─ ARBITRUM_WS_ENDPOINT
└─ LOG_LEVEL, DEBUG, METRICS_ENABLED
3. Runtime Configuration (Programmatic)
├─ Per-endpoint overrides
└─ Dynamic endpoint switching
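A minimal sketch of the second level of this hierarchy, environment variables overriding YAML-derived values, is shown below. The `Config` struct and `applyEnv` function are hypothetical; only the environment variable names come from the list above.

```go
package main

import (
	"fmt"
	"os"
)

// Config holds a few values that would normally be loaded from YAML
// (hypothetical struct for illustration).
type Config struct {
	RPCEndpoint string
	WSEndpoint  string
	LogLevel    string
}

// applyEnv overrides base values with non-empty environment variables.
// The lookup function is injected so the logic can be exercised without
// touching the process environment; pass os.LookupEnv in real use.
func applyEnv(base Config, lookup func(string) (string, bool)) Config {
	if v, ok := lookup("ARBITRUM_RPC_ENDPOINT"); ok && v != "" {
		base.RPCEndpoint = v
	}
	if v, ok := lookup("ARBITRUM_WS_ENDPOINT"); ok && v != "" {
		base.WSEndpoint = v
	}
	if v, ok := lookup("LOG_LEVEL"); ok && v != "" {
		base.LogLevel = v
	}
	return base
}

func main() {
	base := Config{LogLevel: "info"} // stands in for values parsed from YAML
	cfg := applyEnv(base, os.LookupEnv)
	fmt.Printf("%+v\n", cfg)
}
```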
Production Configuration Example
config/arbitrum_production.yaml:
tokens:
weth:
address: "0x82aF49447D8a07e3bd95BD0d56f35241523fBab1"
decimals: 18
coingecko_id: "weth"
usdc:
address: "0xaf88d065e77c8cC2239327C5EDb3A432268e5831"
decimals: 6
is_stable: true
# 20+ major tokens defined
dex_configs:
uniswap_v3:
factory: "0x1F98431c8aD98523631AE4a59f267346ea31F984"
router: "0xE592427A0AEce92De3Edee1F18E0157C05861564"
fee_tiers: [500, 3000, 10000]
arbitrage:
min_profit_threshold: "0.001" # 0.1%
max_slippage: "0.005" # 0.5%
max_gas_price: "50000000000" # 50 gwei
max_position_size: "100000000000000000000" # 100 ETH
config/providers.yaml:
read_only_pool:
endpoints:
- url: "https://arbitrum-mainnet.core.chainstack.com/..."
name: "chainstack-primary"
priority: 1
max_rps: 50
timeout: "10s"
- url: "https://arb1.arbitrum.io/rpc"
name: "arbitrum-public"
priority: 2
max_rps: 30
execution_pool:
endpoints:
- url: "https://arbitrum-mainnet.core.chainstack.com/..."
priority: 1
max_rps: 20
testing_pool:
endpoints:
- url: "https://arbitrum-mainnet.core.chainstack.com/..."
priority: 1
max_rps: 10
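Given a pool definition like the ones above, endpoint selection with failover might look like the following sketch. The `Endpoint` struct and `pickEndpoint` function are hypothetical, and the rule that a lower `priority` number is preferred is an assumption inferred from "chainstack-primary" having priority 1.

```go
package main

import "fmt"

// Endpoint mirrors the fields of one providers.yaml entry plus a health flag
// (hypothetical struct for illustration).
type Endpoint struct {
	Name     string
	Priority int
	MaxRPS   int
	Healthy  bool
}

// pickEndpoint returns the healthy endpoint with the lowest priority number.
// The boolean is false when no endpoint in the pool is healthy.
func pickEndpoint(eps []Endpoint) (Endpoint, bool) {
	var best Endpoint
	found := false
	for _, ep := range eps {
		if !ep.Healthy {
			continue
		}
		if !found || ep.Priority < best.Priority {
			best, found = ep, true
		}
	}
	return best, found
}

func main() {
	pool := []Endpoint{
		{Name: "chainstack-primary", Priority: 1, MaxRPS: 50, Healthy: false},
		{Name: "arbitrum-public", Priority: 2, MaxRPS: 30, Healthy: true},
	}
	ep, ok := pickEndpoint(pool)
	fmt.Println(ep.Name, ok) // falls back to the public endpoint
}
```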
📈 Production Status & Performance
Current Implementation Status
✅ Production Ready:
- Real-time transaction parsing (~90% success rate)
- Event processing (100+ events/sec)
- Multi-protocol support (6 DEX protocols)
- Rate limiting and failover
- Secure key management
- Production logging with health scoring
⚠️ Partially Disabled (Workarounds Active):
- Pool discovery background task (uses cache-only, 314 pools loaded)
- Security manager (KeyManager works independently)
❌ Not Implemented:
- MEV protection (Flashbots, MEV-Share)
- Multi-chain support (Arbitrum only)
- Persistent opportunity database
- Machine learning-based detection
Performance Metrics (Validated)
| Metric | Value | Source |
|---|---|---|
| Startup Time | ~30 seconds | With pool cache |
| Event Processing | 100+ events/sec | Worker pool capacity |
| Detection Latency | 150-450ms | Block to opportunity |
| Memory Baseline | ~200MB | Pool cache + state |
| Memory Peak | ~500MB | Full operation |
| Health Score | 97.97/100 | Log analytics system |
| Error Rate | 2.03% | Log analysis |
| Parsing Success | ~90% | Transaction decoding |
| Uptime | 27+ minutes | Validated continuous |
System Requirements
Minimum:
- CPU: 2+ cores for concurrent processing
- RAM: 4GB+ for transaction buffering
- Network: Stable WebSocket connection
- Storage: 10GB+ for logs
Recommended:
- CPU: 4+ cores for optimal worker pools
- RAM: 8GB+ for larger pool cache
- Network: Multiple RPC providers for redundancy
- Storage: 50GB+ for long-term logging
🔬 Testing Infrastructure
Test Organization
tests/
├── integration/
│ ├── fork_test.go # Arbitrum fork testing
│ └── [other tests]
├── cache/ # Cache-related tests
├── contracts/ # Contract interaction tests
└── scenarios/ # Test scenarios
pkg/**/..._test.go # Unit tests colocated with source
Test Coverage
Unit Tests:
- arbitrage/: flash_executor_test.go, multihop_test.go
- arbitrum/: connection_test.go, parser_test.go, abi_fuzz_test.go
- scanner/: concurrent_test.go
- security/: keymanager_test.go
- validation/: pool_validator_test.go (1,155 lines)
Integration Tests:
- End-to-end transaction processing
- Multi-protocol detection accuracy
- Cross-protocol arbitrage detection
Build & Test Commands:
make build # Compile binary
make test # Run all tests
make test-coverage # Generate coverage report
make test-integration # Integration tests only
make lint # Run golangci-lint
make security-scan # Security analysis (gosec)
🚀 Deployment Guide
Prerequisites
# 1. Go 1.24 or later
go version
# 2. Create encryption key (32 random bytes, written as 64 hex characters)
openssl rand -hex 32 > .env.encryption_key
# 3. Setup keystore
mkdir -p keystore
chmod 700 keystore
# 4. Configure environment
export GO_ENV=production
export MEV_BOT_ENCRYPTION_KEY=$(cat .env.encryption_key)
export ARBITRUM_RPC_ENDPOINT="wss://your-endpoint"
Quick Start
# Build the binary
make build
# Run the bot
./bin/mev-bot start
# Or with explicit config
GO_ENV=production ./bin/mev-bot start
Production Deployment
1. Configuration:
# Copy and customize production configs
cp config/arbitrum_production.yaml config/arbitrum_production.yaml.local
cp config/providers.yaml config/providers.yaml.local
# Edit with actual RPC endpoints and API keys
vim config/arbitrum_production.yaml.local
vim config/providers.yaml.local
2. Environment Setup:
# Create .env.production file
cat > .env.production <<EOF
GO_ENV=production
MEV_BOT_ENCRYPTION_KEY=<your-32-byte-hex-key>
MEV_BOT_KEYSTORE_PATH=keystore
ARBITRUM_RPC_ENDPOINT=wss://...
ARBITRUM_WS_ENDPOINT=wss://...
LOG_LEVEL=info
METRICS_ENABLED=true
EOF
3. Start Service:
# Load environment and start (set -a exports the variables so the bot process can read them)
set -a; source .env.production; set +a
./bin/mev-bot start
Monitoring & Health Checks
Production Logging System:
logs/
├── mev_bot.log # Main application log
├── mev_bot_errors.log # Error-specific log
├── mev_bot_performance.log # Performance metrics
├── analytics/ # Real-time analysis
│ ├── analysis_*.json # Comprehensive metrics
│ └── dashboard_*.html # Operations dashboard
├── health/ # Health monitoring
│ └── health_*.json # Health reports (97.97/100)
├── archives/ # Compressed rotated logs
└── rotated/ # Rotated log files
Health Check Commands:
# Real-time analysis with health scoring
./scripts/log-manager.sh analyze
# Check system health
./scripts/log-manager.sh health
# Full management cycle
./scripts/log-manager.sh full
# Start background monitoring daemon
./scripts/log-manager.sh start-daemon
Alert Thresholds:
- Error rate > 10% = Critical
- Health score < 80 = Warning
- Zero opportunities detected for >1 hour = Investigation needed
- Memory usage > 750MB = Pool pruning required
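These thresholds translate directly into a small check. The sketch below is illustrative; the `Metrics` struct and `evaluate` function are hypothetical and are not part of scripts/log-manager.sh.

```go
package main

import (
	"fmt"
	"time"
)

// Metrics is a hypothetical snapshot of the values the thresholds refer to.
type Metrics struct {
	ErrorRatePct         float64
	HealthScore          float64
	SinceLastOpportunity time.Duration
	MemoryMB             float64
}

// evaluate returns one message per threshold that is exceeded.
func evaluate(m Metrics) []string {
	var alerts []string
	if m.ErrorRatePct > 10 {
		alerts = append(alerts, "critical: error rate above 10%")
	}
	if m.HealthScore < 80 {
		alerts = append(alerts, "warning: health score below 80")
	}
	if m.SinceLastOpportunity > time.Hour {
		alerts = append(alerts, "investigate: no opportunities for over 1 hour")
	}
	if m.MemoryMB > 750 {
		alerts = append(alerts, "action: memory above 750MB, prune pool cache")
	}
	return alerts
}

func main() {
	// Values taken from the validated metrics table: no alert is expected.
	fmt.Println(evaluate(Metrics{ErrorRatePct: 2.03, HealthScore: 97.97, MemoryMB: 500}))
	fmt.Println(evaluate(Metrics{ErrorRatePct: 12, HealthScore: 75, MemoryMB: 800}))
}
```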
🔒 Security Considerations
Production Security
Key Management:
- AES-256-GCM encryption for all private keys
- Secure key derivation from master password
- Automatic key rotation support
- Hardware wallet integration ready
Input Validation:
- All external data validated before processing
- Token address validation (checksum, zero-address checks)
- Amount bounds checking (overflow protection)
- Gas price limits (max 50 gwei default)
Rate Limiting:
- Per-endpoint rate limits (configurable)
- Global transaction rate limiting
- Burst allowances for spike handling
- Automatic backoff on 429 responses
- Circuit breakers on repeated failures
Risk Management
Execution Safeguards:
- Configurable slippage protection (0.5% default max)
- Maximum transaction value limits (100 ETH default)
- Profit validation after gas costs
- Simulation before actual execution
- Confidence scoring (0.0-1.0 scale)
Error Handling:
- Comprehensive error handling at all layers
- Automatic retry with exponential backoff
- Fallback RPC providers
- Graceful degradation on failures
📝 Known Limitations & Future Enhancements
Current Limitations
1. Pool Discovery:
- Background discovery disabled (prevents startup hang)
- Relies on cached pool data (314 pools)
- No automatic new pool detection
- Workaround: Manual cache updates or restart
2. Security Manager:
- Comprehensive security manager disabled for debugging
- KeyManager works independently
- Missing some advanced security features
3. MEV Protection:
- No Flashbots integration
- No MEV-Share participation
- Transactions visible on public mempool
- Vulnerable to sandwich attacks
4. Single Chain:
- Arbitrum only (no Ethereum, Optimism, Base, etc.)
- No cross-chain arbitrage
- No bridge monitoring
5. In-Memory State:
- No persistent opportunity database
- Restarts lose historical context
- Limited long-term analytics
Planned Enhancements
High Priority:
- Re-enable pool discovery (fix hang issue)
- Re-enable security manager (identify and fix cause)
- Add persistent PostgreSQL database
- Implement MEV protection (Flashbots)
- Add Prometheus metrics export
Medium Priority:
- Multi-chain support (Optimism, Base)
- Flash loan integration (capital-free arbitrage)
- Machine learning opportunity prediction
- Advanced gas optimization
- WebSocket dashboard
Low Priority:
- MEV-Share integration
- Cross-chain bridge monitoring
- Collaborative MEV strategies
- Historical replay capability
📚 Documentation
Documentation Structure
docs/
├── CODEBASE_EXPLORATION_COMPLETE.md # Complete codebase analysis
├── IMPLEMENTATION_INSIGHTS.md # What code actually does
├── CODEBASE_QUICK_REFERENCE.md # Quick reference guide
├── CODEBASE_EXPLORATION_INDEX.md # Navigation index
├── DEVELOPER_DOCS.md # Developer documentation
├── MONITORING_GUIDE.md # Monitoring and operations
├── QUICK_START.md # Quick start guide
└── [100+ additional docs] # Historical and specialized docs
Key Documentation Files
For New Developers:
- CODEBASE_QUICK_REFERENCE.md - Start here
- CODEBASE_EXPLORATION_COMPLETE.md - Deep dive
- DEVELOPER_DOCS.md - Development guidelines
For Operations:
- MONITORING_GUIDE.md - Production monitoring
- Log manager scripts (scripts/log-manager.sh)
- Health check procedures
For Understanding Architecture:
- IMPLEMENTATION_INSIGHTS.md - Reality vs documentation
- CODEBASE_EXPLORATION_INDEX.md - Component navigation
- This specification (PROJECT_SPECIFICATION.md)
🎯 Getting Started
For Developers
1. Understand the codebase:
# Read these in order:
cat docs/CODEBASE_QUICK_REFERENCE.md
cat docs/IMPLEMENTATION_INSIGHTS.md
cat docs/CODEBASE_EXPLORATION_COMPLETE.md
2. Build and test:
make build
make test
3. Run in development:
export GO_ENV=development
export MEV_BOT_ENCRYPTION_KEY=$(openssl rand -hex 32)
./bin/mev-bot start
For Operations
1. Deploy to production:
# Follow deployment guide above
set -a; source .env.production; set +a
./bin/mev-bot start
2. Monitor health:
# Check health score (target: >95)
./scripts/log-manager.sh health
# Real-time monitoring
./scripts/log-manager.sh start-daemon
3. Troubleshoot issues:
# Analyze logs
./scripts/log-manager.sh analyze
# View latest errors
tail -100 logs/mev_bot_errors.log
# Trace startup progress through the debug checkpoints logged by main.go (20 total)
grep "CHECKPOINT" logs/mev_bot.log
📊 Performance Expectations
MEV Profit Expectations (Arbitrum Realistic)
Based on current market conditions:
- Arbitrage Frequency: 5-20 opportunities per day (market dependent)
- Profit per Trade: 0.1-0.5% typical ($1-$5 on $1,000 capital)
- Daily Target: $10-$200 with moderate capital and optimal conditions
- Time to First Detection: ~30 seconds from startup
- Time to First Opportunity: 30-60 minutes (market dependent)
Note: These are detection rates. Actual execution profits depend on:
- Gas costs (50-150k gas per execution)
- Slippage during execution
- Competition from other MEV bots
- Market volatility
🔗 External Dependencies
Go Module Dependencies
Primary:
- github.com/ethereum/go-ethereum v1.16.3 (Ethereum client library)
- github.com/gorilla/websocket v1.5.3 (WebSocket support)
- github.com/holiman/uint256 v1.3.2 (256-bit integers)
- github.com/urfave/cli/v2 v2.27.5 (CLI framework)
- gopkg.in/yaml.v3 (YAML parsing)
Database:
- github.com/lib/pq v1.10.9 (PostgreSQL - optional)
- github.com/mattn/go-sqlite3 v1.14.32 (SQLite - optional)
Security:
- golang.org/x/crypto v0.42.0 (Cryptography)
- golang.org/x/time v0.10.0 (Rate limiting)
Testing:
- github.com/stretchr/testify v1.11.1 (Test assertions)
Smart Contract Dependencies
Generated Bindings (bindings/):
- Arbitrage Executor contract
- Flash Swap contracts (Uniswap V2/V3)
- ERC20 token interface
- Uniswap V3 Pool interface
- Balancer Vault interface
📌 Summary
The MEV Bot is a sophisticated, production-grade system with:
✓ Strengths:
- Modular, testable architecture (5 layers, 47 packages)
- Production-ready security infrastructure
- Multi-protocol DEX support (6 protocols)
- Intelligent rate limiting and failover
- Robust error handling and recovery
- Real-time health monitoring (97.97/100 score)
- Comprehensive logging and analytics
⚠️ Pragmatic Limitations:
- Pool discovery disabled (uses cache: 314 pools)
- Security manager disabled (KeyManager works)
- No MEV protection (public mempool)
- Single-chain only (Arbitrum)
- In-memory state (no persistence)
Status: Ready for production with current architecture (cache-based pools, independent KeyManager). Some advanced features disabled pending fixes (pool discovery, security manager).
Recommended Use: Detection and analysis system. Execution capability exists but needs careful testing before live trading.
Last Updated: November 2025
Documentation Version: 2.0 (reflects actual codebase state)
Codebase Version: See git commit history for changes