# MEV Bot Technical Specification

## Project Overview

High-performance MEV bot for Arbitrum focused on real-time swap detection and arbitrage opportunities from the Arbitrum sequencer feed.

## Core Architecture Principles

### 1. Channel-Based Concurrency

**ALL processing, parsing, and logging MUST use Go channels for optimal performance**

- Non-blocking message passing between components
- Worker pools for parallel processing
- Buffered channels to prevent backpressure
- No synchronous blocking operations in hot paths

### 2. Sequencer-First Architecture

**The Arbitrum sequencer feed is the PRIMARY data source**

- WebSocket connection to: `wss://arb1.arbitrum.io/feed`
- Real-time transaction broadcast before inclusion in blocks
- NO reliance on HTTP RPC endpoints except for historical data
- Sequencer MUST be isolated in its own channel

### 3. Official Contract Sources

**ALL contract ABIs MUST be derived from official contract sources**

- Store official DEX contracts in `contracts/lib/` via Foundry
- Build contracts using Foundry (`forge build`)
- Extract ABIs from build artifacts in `contracts/out/`
- Generate Go bindings using `abigen` from extracted ABIs
- ALL contracts in `contracts/src/` MUST have bindings
- NO manually written ABI JSON files
- NO hardcoded function selectors

## Sequencer Processing Pipeline

### Stage 1: Message Reception

```
Arbitrum Sequencer Feed
    ↓
[Raw WebSocket Messages]
    ↓
Message Channel
```

### Stage 2: Swap Filtering

```
Message Channel
    ↓
[Swap Filter Workers] ← Pool Cache (read-only)
    ↓
Swap Event Channel
```

**Swap Filter Responsibilities:**

- Identify swap transactions from supported DEXes
- Extract pool addresses from transactions
- Discover new pools not in cache
- Emit SwapEvent to downstream channel

**Supported DEXes:**

- Uniswap V2/V3/V4
- Camelot V2/V3/V4
- Balancer (all versions)
- Kyber (all versions)
- Curve (all versions)
- SushiSwap
- Other UniswapV2-compatible exchanges

### Stage 3: Pool Discovery

```
Swap Event Channel
    ↓
[Pool Discovery]
    ↓
Pool Cache ← Auto-save to disk
    ↓
Pool Mapping (address → info)
```

**Pool Cache Behavior:**

- Thread-safe concurrent access (RWMutex)
- Automatic persistence to JSON every 100 new pools
- Periodic saves every 5 minutes
- Mapping prevents duplicate processing
- First seen timestamp tracking
- Swap count statistics

### Stage 4: Arbitrage Detection

```
Swap Event Channel
    ↓
[Arbitrage Scanner] ← Pool Cache (multi-index)
    ↓
Opportunity Channel
```

## Contract Bindings Management

### Directory Structure

```
contracts/
├── lib/                  # Foundry dependencies (official DEX contracts)
│   ├── v2-core/          # git submodule: Uniswap/v2-core
│   ├── v3-core/          # git submodule: Uniswap/v3-core
│   ├── camelot-amm/      # git submodule: CamelotLabs/camelot-amm-v2
│   └── ...
├── src/                  # Custom wrapper contracts (if needed)
│   └── interfaces/       # Interface contracts for binding generation
├── out/                  # Foundry build artifacts (gitignored)
│   └── *.sol/
│       └── *.json        # ABI + bytecode
└── foundry.toml          # Foundry configuration

bindings/
├── uniswap_v2/
│   ├── router.go         # Generated from IUniswapV2Router02
│   └── pair.go           # Generated from IUniswapV2Pair
├── uniswap_v3/
│   └── router.go         # Generated from ISwapRouter
├── camelot/
│   └── router.go         # Generated from ICamelotRouter
└── README.md             # Binding usage documentation
```

### Binding Generation Workflow

1. **Install Official Contracts**

   ```bash
   forge install Uniswap/v2-core
   forge install Uniswap/v3-core
   forge install Uniswap/v4-core
   # Router interfaces (IUniswapV2Router02, ISwapRouter) live in the periphery repos
   forge install Uniswap/v2-periphery
   forge install Uniswap/v3-periphery
   forge install camelotlabs/camelot-amm-v2
   forge install balancer/balancer-v2-monorepo
   forge install KyberNetwork/ks-elastic-sc
   forge install curvefi/curve-contract
   ```

2. **Build Contracts**

   ```bash
   forge build
   ```

3. **Extract ABIs**

   ```bash
   # Example for UniswapV2Router02
   jq '.abi' contracts/out/IUniswapV2Router02.sol/IUniswapV2Router02.json > /tmp/router_abi.json
   ```

4. **Generate Bindings**

   ```bash
   abigen --abi=/tmp/router_abi.json \
     --pkg=uniswap_v2 \
     --type=UniswapV2Router \
     --out=bindings/uniswap_v2/router.go
   ```

5. **Automate with Script**
   - Use `scripts/generate-bindings.sh` to automate steps 3-4
   - Run after any contract update

### Binding Usage in Code

**DO THIS** (ABI-based detection):

```go
import (
	"math/big"
	"strings"

	"github.com/ethereum/go-ethereum/accounts/abi"
	"github.com/ethereum/go-ethereum/common"
	// plus the generated bindings package under bindings/uniswap_v2
)

// Parse the ABI embedded in the generated binding once, at startup.
routerABI, err := abi.JSON(strings.NewReader(uniswap_v2.UniswapV2RouterABI))
if err != nil {
	return err
}

// Calldata shorter than a 4-byte selector cannot be a method call.
if len(txData) < 4 {
	return nil
}
method, err := routerABI.MethodById(txData[:4])
if err != nil {
	return nil // selector is not part of this ABI
}
if strings.HasPrefix(method.Name, "swap") {
	params, err := method.Inputs.Unpack(txData[4:])
	if err != nil {
		return err
	}
	// Argument order differs between swap variants, so index by position
	// only after matching the exact method.
	if method.Name == "swapExactTokensForTokens" {
		amountIn := params[0].(*big.Int)
		path := params[2].([]common.Address)
		_, _ = amountIn, path // hand off to the swap filter
	}
}
```

**DON'T DO THIS** (hardcoded selectors):

```go
// WRONG - hardcoded, fragile, unmaintainable
if hex.EncodeToString(txData[0:4]) == "38ed1739" {
	// swapExactTokensForTokens
}
```

## Pool Cache Design

### Multi-Index Requirements

The pool cache MUST support efficient lookups by:

1. **Address** - Primary key
2. **Token Pair** - Find all pools for a pair (A,B)
3. **Protocol** - Find all Uniswap pools, all Camelot pools, etc.
4. **Liquidity** - Find top N pools by TVL

### Data Structure

```go
type PoolInfo struct {
	Address   common.Address
	Protocol  string // "UniswapV2", "Camelot", etc.
	Version   string // "V2", "V3", etc.
	Token0    common.Address
	Token1    common.Address
	Fee       uint32 // basis points
	FirstSeen time.Time
	LastSeen  time.Time
	SwapCount uint64
	Liquidity *big.Int // Estimated TVL
}

type PoolCache struct {
	// Primary storage
	pools map[common.Address]*PoolInfo

	// Indexes
	byTokenPair map[TokenPair][]common.Address
	byProtocol  map[string][]common.Address
	byLiquidity []*PoolInfo // Sorted by liquidity

	mu sync.RWMutex
}
```

### Thread Safety

- Use `RWMutex` for concurrent read/write access
- Read locks for queries
- Write locks for updates
- No locks held during I/O operations (save to disk)

## Development Environment

### Containerized Development

**ALL development MUST occur in containers**

```yaml
# docker-compose.yml profiles
services:
  go-dev:      # Go 1.21 with full toolchain
  python-dev:  # Python 3.11 for scripts
  foundry:     # Forge, Cast, Anvil for contract work
```

**Start dev environment:**

```bash
./scripts/dev-up.sh
# or
podman-compose up -d go-dev python-dev foundry
```

**Enter containers:**

```bash
podman exec -it mev-go-dev sh
podman exec -it mev-foundry sh
```

### Build Process

```bash
# In go-dev container
cd /workspace
go build -o bin/mev-bot ./cmd/mev-bot/main.go
```

### Testing

```bash
# Unit tests
go test ./pkg/... -v

# Integration tests
go test ./tests/integration/... -v

# Benchmarks
go test ./pkg/... -bench=. \
  -benchmem
```

## Observability

### Metrics (Prometheus)

Every component MUST export metrics:

- `sequencer_messages_received_total`
- `swaps_detected_total{protocol, version}`
- `pools_discovered_total{protocol}`
- `arbitrage_opportunities_found_total`
- `arbitrage_execution_attempts_total{result}`

### Logging (Structured)

Use go-ethereum's structured logger:

```go
logger.Info("swap detected",
	"protocol", swap.Protocol.Name,
	"hash", swap.TxHash,
	"pool", swap.Pool.Address.Hex(),
	"token0", swap.Pool.Token0.Hex(),
	"token1", swap.Pool.Token1.Hex())
```

### Health Monitoring

- Sequencer connection status
- Message processing rate
- Channel buffer utilization
- Pool cache hit rate
- Arbitrage execution success rate

## Validation Rules

### Swap Event Validation

MUST validate ALL parsed swap events:

1. **Non-zero addresses** - token0, token1, pool address
2. **Non-zero amounts** - amountIn, amountOut
3. **Valid token pair** - token0 < token1 (canonical ordering)
4. **Known protocol** - matches supported DEX list
5. **Reasonable amounts** - within sanity bounds

### Reject Invalid Data Immediately

- Log rejection with full context
- Increment rejection metrics
- NEVER propagate invalid data downstream

## Error Handling

### Fail-Fast Philosophy

- Reject bad data at the source
- Log all errors with stack traces
- Emit error metrics
- Never fail silently

### Graceful Degradation

- Circuit breakers for RPC failover
- Retry logic with exponential backoff
- Automatic reconnection for WebSocket
- Pool cache persistence survives restarts

## Configuration

### Environment Variables

```bash
# Sequencer (PRIMARY)
ARBITRUM_SEQUENCER_URL=wss://arb1.arbitrum.io/feed

# RPC (FALLBACK ONLY)
RPC_URL=https://arbitrum-mainnet.core.chainstack.com/
WS_URL=wss://arbitrum-mainnet.core.chainstack.com/

# Chain
CHAIN_ID=42161

# API Keys
ARBISCAN_API_KEY=

# Wallet
PRIVATE_KEY=
```

### Performance Tuning

```bash
# Worker pool sizes
SWAP_FILTER_WORKERS=16
ARBITRAGE_WORKERS=8

# Channel buffer sizes
MESSAGE_BUFFER=1000
SWAP_EVENT_BUFFER=500
OPPORTUNITY_BUFFER=100

# Pool cache
POOL_CACHE_AUTOSAVE_COUNT=100
POOL_CACHE_AUTOSAVE_INTERVAL=5m
```

## Git Workflow

### Branches

- `master` - Stable production branch
- `feature/v2-prep` - V2 planning and architecture
- `feature/` - Feature branches for V2 components

### Commit Messages

```
type(scope): brief description

- Detailed changes
- Why the change was needed
- Breaking changes or migration notes

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude
```

**Types**: `feat`, `fix`, `perf`, `refactor`, `test`, `docs`, `build`, `ci`

## Critical Rules

### MUST DO

- ✅ Use Arbitrum sequencer feed as primary data source
- ✅ Use channels for ALL inter-component communication
- ✅ Derive contract ABIs from official sources via Foundry
- ✅ Generate Go bindings for all contracts with `abigen`
- ✅ Validate ALL parsed data before propagation
- ✅ Use thread-safe concurrent data structures
- ✅ Emit comprehensive metrics and structured logs
- ✅ Run all development
in containers
- ✅ Write tests for all components

### MUST NOT DO

- ❌ Use HTTP RPC as primary data source (sequencer only!)
- ❌ Write manual ABI JSON files (use Foundry builds!)
- ❌ Hardcode function selectors (use ABI lookups!)
- ❌ Allow zero addresses or zero amounts to propagate
- ❌ Use blocking operations in hot paths
- ❌ Modify shared state without locks
- ❌ Silent failures without logging
- ❌ Run builds outside of containers

## References

- [Arbitrum Sequencer Feed](https://www.degencode.com/p/decoding-the-arbitrum-sequencer-feed)
- [Foundry Book](https://book.getfoundry.sh/)
- [Abigen Documentation](https://geth.ethereum.org/docs/tools/abigen)
- V2 Architecture: `docs/planning/00_V2_MASTER_PLAN.md`
- V2 Task Breakdown: `docs/planning/07_TASK_BREAKDOWN.md`
- Project Guidelines: `CLAUDE.md`