# MEV Bot V2 - Master Architecture Plan

## Executive Summary

V2 represents a complete architectural overhaul addressing critical parsing, validation, and scalability issues identified in V1. The rebuild focuses on:

1. **Zero Tolerance for Invalid Data**: Eliminate all zero addresses and zero amounts
2. **Per-Exchange Parser Architecture**: Individual parsers for each DEX type
3. **Real-time Validation Pipeline**: Background validation with audit trails
4. **Scalable Pool Discovery**: Efficient caching and multi-index lookups
5. **Observable System**: Comprehensive metrics, logging, and health monitoring

## Critical Issues from V1

### 1. Zero Address/Amount Problems

- **Root Cause**: Parser returns zero addresses when transaction data unavailable
- **Impact**: Invalid events submitted to scanner, wasted computation
- **V2 Solution**: Strict validation at multiple layers + pool cache enrichment

### 2. Parsing Accuracy Issues

- **Root Cause**: Monolithic parser handling all DEX types generically
- **Impact**: Missing token data, incorrect amounts, protocol-specific edge cases
- **V2 Solution**: Per-exchange parsers with protocol-specific logic

### 3. No Data Quality Audit Trail

- **Root Cause**: No validation or comparison of parsed data vs cached data
- **Impact**: Silent failures, no visibility into parsing degradation
- **V2 Solution**: Background validation channel with discrepancy logging

### 4. Inefficient Pool Lookups

- **Root Cause**: Single-index cache (by address only)
- **Impact**: Slow arbitrage path discovery, no ranking by liquidity
- **V2 Solution**: Multi-index cache (address, token pair, protocol, liquidity)

### 5. Stats Disconnection

- **Root Cause**: Events detected but not reflected in stats
- **Impact**: Monitoring blindness, unclear system health
- **V2 Solution**: Event-driven metrics with guaranteed consistency

## V2 Architecture Principles

### 1. **Fail-Fast with Visibility**

- Reject invalid data immediately at source
- Log all rejections with detailed context
- Never allow garbage data to propagate

### 2. **Single Responsibility**

- One parser per exchange type
- One validator per data type
- One cache per index type

### 3. **Observable by Default**

- Every component emits metrics
- Every operation is logged
- Every error has context

### 4. **Self-Healing**

- Automatic retry with exponential backoff
- Fallback to cache when RPC fails
- Circuit breakers for cascading failures

### 5. **Test-Driven**

- Unit tests for every parser
- Integration tests for full pipeline
- Chaos testing for failure scenarios

## High-Level Component Architecture

```
┌─────────────────────────────────────────────────────────────┐
│                      Arbitrum Monitor                       │
│  - WebSocket subscription                                   │
│  - Transaction/receipt buffering                            │
│  - Rate limiting & connection management                    │
└───────────────┬─────────────────────────────────────────────┘
                │
                ├─ Transactions & Receipts
                │
                ▼
┌─────────────────────────────────────────────────────────────┐
│                       Parser Factory                        │
│  - Route to correct parser based on protocol                │
│  - Manage parser lifecycle                                  │
└───────────────┬─────────────────────────────────────────────┘
                │
     ┌──────────┼─────────────┬───────────┬───────────┐
     │          │             │           │           │
     ▼          ▼             ▼           ▼           ▼
┌─────────┐ ┌──────────┐ ┌─────────┐ ┌──────────┐ ┌────────┐
│Uniswap  │ │Uniswap   │ │SushiSwap│ │ Camelot  │ │ Curve  │
│V2 Parser│ │V3 Parser │ │ Parser  │ │ Parser   │ │ Parser │
└────┬────┘ └────┬─────┘ └────┬────┘ └────┬─────┘ └───┬────┘
     │           │            │           │           │
     └───────────┴────────────┴───────────┴───────────┘
                              │
                              ▼
         ┌────────────────────────────────────────┐
         │         Event Validation Layer         │
         │  - Check zero addresses                │
         │  - Check zero amounts                  │
         │  - Validate against pool cache         │
         │  - Log discrepancies                   │
         └────────────┬───────────────────────────┘
                      │
           ┌──────────┴──────────┐
           │                     │
           ▼                     ▼
    ┌─────────────┐      ┌──────────────────┐
    │   Scanner   │      │  Background      │
    │  (Valid     │      │  Validation      │
    │   Events)   │      │  Channel         │
    └─────────────┘      │  (Audit Trail)   │
                         └──────────────────┘
```

## V2 Directory Structure

```
mev-bot/
├── orig/                          # V1 codebase preserved
│   ├── cmd/
│   ├── pkg/
│   ├── internal/
│   └── config/
│
├── docs/
│   └── planning/                  # V2 planning documents
│       ├── 00_V2_MASTER_PLAN.md
│       ├── 01_PARSER_ARCHITECTURE.md
│       ├── 02_VALIDATION_PIPELINE.md
│       ├── 03_POOL_CACHE_SYSTEM.md
│       ├── 04_METRICS_OBSERVABILITY.md
│       ├── 05_DATA_FLOW.md
│       ├── 06_IMPLEMENTATION_PHASES.md
│       └── 07_TASK_BREAKDOWN.md
│
├── cmd/
│   └── mev-bot/
│       └── main.go                # New V2 entry point
│
├── pkg/
│   ├── parsers/                   # NEW: Per-exchange parsers
│   │   ├── factory.go
│   │   ├── interface.go
│   │   ├── uniswap_v2.go
│   │   ├── uniswap_v3.go
│   │   ├── sushiswap.go
│   │   ├── camelot.go
│   │   └── curve.go
│   │
│   ├── validation/                # NEW: Validation pipeline
│   │   ├── validator.go
│   │   ├── rules.go
│   │   ├── background.go
│   │   └── metrics.go
│   │
│   ├── cache/                     # NEW: Multi-index cache
│   │   ├── pool_cache.go
│   │   ├── index_by_address.go
│   │   ├── index_by_tokens.go
│   │   ├── index_by_liquidity.go
│   │   └── index_by_protocol.go
│   │
│   ├── discovery/                 # Pool discovery system
│   │   ├── scanner.go
│   │   ├── factory_watcher.go
│   │   └── blacklist.go
│   │
│   ├── monitor/                   # Arbitrum monitoring
│   │   ├── sequencer.go
│   │   ├── connection.go
│   │   └── rate_limiter.go
│   │
│   ├── events/                    # Event types and handling
│   │   ├── types.go
│   │   ├── router.go
│   │   └── processor.go
│   │
│   ├── arbitrage/                 # Arbitrage detection
│   │   ├── detector.go
│   │   ├── calculator.go
│   │   └── executor.go
│   │
│   └── observability/             # NEW: Metrics & logging
│       ├── metrics.go
│       ├── logger.go
│       ├── tracing.go
│       └── health.go
│
├── internal/
│   ├── config/                    # Configuration management
│   └── utils/                     # Shared utilities
│
└── tests/
    ├── unit/                      # Unit tests
    ├── integration/               # Integration tests
    └── e2e/                       # End-to-end tests
```

## Implementation Phases

### Phase 1: Foundation (Weeks 1-2)

**Goal**: Set up V2 project structure and core interfaces

**Tasks**:

1. Create V2 directory structure
2. Define all interfaces (Parser, Validator, Cache, etc.)
3. Set up logging and metrics infrastructure
4. Create base test framework
5. Implement connection management

### Phase 2: Parser Refactor (Weeks 3-5)

**Goal**: Implement per-exchange parsers with validation

**Tasks**:

1. Create Parser interface and factory
2. Implement UniswapV2 parser with tests
3. Implement UniswapV3 parser with tests
4. Implement SushiSwap parser with tests
5. Implement Camelot parser with tests
6. Implement Curve parser with tests
7. Add strict validation layer
8. Integration testing

### Phase 3: Cache System (Weeks 6-7)

**Goal**: Multi-index pool cache with efficient lookups

**Tasks**:

1. Design cache schema
2. Implement address index
3. Implement token-pair index
4. Implement liquidity ranking index
5. Implement protocol index
6. Add cache persistence
7. Add cache invalidation logic
8. Performance testing

### Phase 4: Validation Pipeline (Weeks 8-9)

**Goal**: Background validation with audit trails

**Tasks**:

1. Create validation channel
2. Implement background validator goroutine
3. Add comparison logic (parsed vs cached)
4. Implement discrepancy logging
5. Create validation metrics
6. Add alerting for validation failures
7. Integration testing

### Phase 5: Migration & Testing (Weeks 10-12)

**Goal**: Migrate from V1 to V2, comprehensive testing

**Tasks**:

1. Create migration path
2. Run parallel systems (V1 and V2)
3. Compare outputs
4. Fix discrepancies
5. Load testing
6. Chaos testing
7. Production deployment
8. Monitoring setup

## Success Metrics

### Parsing Accuracy

- **Zero Address Rate**: < 0.01% (target: 0%)
- **Zero Amount Rate**: < 0.01% (target: 0%)
- **Validation Failure Rate**: < 0.5%
- **Cache Hit Rate**: > 95%

### Performance

- **Parse Time**: < 1ms per event (p99)
- **Cache Lookup**: < 0.1ms (p99)
- **End-to-end Latency**: < 10ms from receipt to scanner

### Reliability

- **Uptime**: > 99.9%
- **Data Discrepancy Rate**: < 0.1%
- **Event Drop Rate**: 0%

### Observability

- **All Events Logged**: 100%
- **All Rejections Logged**: 100%
- **Metrics Coverage**: 100% of components

## Risk Mitigation

### Risk: Breaking Changes During Migration

**Mitigation**:

- Run V1 and V2 in parallel
- Compare outputs
- Gradual rollout with feature flags

### Risk: Performance Degradation

**Mitigation**:

- Comprehensive benchmarking
- Load testing before deployment
- Circuit breakers for cascading failures

### Risk: Incomplete Test Coverage

**Mitigation**:

- TDD approach for all new code
- Minimum 90% test coverage requirement
- Integration and E2E tests mandatory

### Risk: Data Quality Regression

**Mitigation**:

- Continuous validation against Arbiscan
- Alerting on validation failures
- Automated rollback on critical issues

## Next Steps

1. Review and approve this master plan
2. Read detailed component plans in subsequent documents
3. Review task breakdown in `07_TASK_BREAKDOWN.md`
4. Begin Phase 1 implementation

---

**Document Status**: Draft for Review

**Created**: 2025-11-10

**Last Updated**: 2025-11-10

**Version**: 1.0