Restructured project for V2 refactor:

**Structure Changes:**
- Moved all V1 code to orig/ folder (preserved with git mv)
- Created docs/planning/ directory
- Added orig/README_V1.md explaining V1 preservation

**Planning Documents:**
- 00_V2_MASTER_PLAN.md: Complete architecture overview
  - Executive summary of critical V1 issues
  - High-level component architecture diagrams
  - 5-phase implementation roadmap
  - Success metrics and risk mitigation
- 07_TASK_BREAKDOWN.md: Atomic task breakdown
  - 99+ hours of detailed tasks
  - Every task < 2 hours (atomic)
  - Clear dependencies and success criteria
  - Organized by implementation phase

**V2 Key Improvements:**
- Per-exchange parsers (factory pattern)
- Multi-layer strict validation
- Multi-index pool cache
- Background validation pipeline
- Comprehensive observability

**Critical Issues Addressed:**
- Zero address tokens (strict validation + cache enrichment)
- Parsing accuracy (protocol-specific parsers)
- No audit trail (background validation channel)
- Inefficient lookups (multi-index cache)
- Stats disconnection (event-driven metrics)

Next Steps:
1. Review planning documents
2. Begin Phase 1: Foundation (P1-001 through P1-010)
3. Implement parsers in Phase 2
4. Build cache system in Phase 3
5. Add validation pipeline in Phase 4
6. Migrate and test in Phase 5

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

# MEV Bot V2 - Master Architecture Plan

## Executive Summary

V2 is a complete architectural overhaul that addresses the critical parsing, validation, and scalability issues identified in V1. The rebuild focuses on:

1. **Zero Tolerance for Invalid Data**: Eliminate all zero addresses and zero amounts
2. **Per-Exchange Parser Architecture**: Individual parsers for each DEX type
3. **Real-time Validation Pipeline**: Background validation with audit trails
4. **Scalable Pool Discovery**: Efficient caching and multi-index lookups
5. **Observable System**: Comprehensive metrics, logging, and health monitoring

## Critical Issues from V1

### 1. Zero Address/Amount Problems
- **Root Cause**: The parser returns zero addresses when transaction data is unavailable
- **Impact**: Invalid events are submitted to the scanner, wasting computation
- **V2 Solution**: Strict validation at multiple layers + pool cache enrichment

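
The strict-validation idea can be sketched as a single gate that every parsed event must pass before it reaches the scanner. This is a minimal illustration, not the project's actual code: `Address`, `SwapEvent`, `ValidationError`, `ValidateSwap`, and `sampleSwap` are hypothetical names, and the event shape is deliberately simplified.

```go
package main

import (
	"fmt"
	"math/big"
)

// Address stands in for a 20-byte EVM address type (hypothetical).
type Address [20]byte

// SwapEvent is a hypothetical, simplified parsed swap event.
type SwapEvent struct {
	Pool, TokenIn, TokenOut Address
	AmountIn, AmountOut     *big.Int
}

// ValidationError names the field that failed and why, so that every
// rejection can be logged with context.
type ValidationError struct {
	Field  string
	Reason string
}

func (e *ValidationError) Error() string { return e.Field + ": " + e.Reason }

// ValidateSwap rejects an event if any address is zero or any amount is
// missing or not strictly positive. It returns the first failure found.
func ValidateSwap(ev SwapEvent) error {
	addrs := []struct {
		name string
		addr Address
	}{{"pool", ev.Pool}, {"tokenIn", ev.TokenIn}, {"tokenOut", ev.TokenOut}}
	for _, a := range addrs {
		if a.addr == (Address{}) {
			return &ValidationError{Field: a.name, Reason: "zero address"}
		}
	}
	amounts := []struct {
		name string
		v    *big.Int
	}{{"amountIn", ev.AmountIn}, {"amountOut", ev.AmountOut}}
	for _, a := range amounts {
		if a.v == nil || a.v.Sign() <= 0 {
			return &ValidationError{Field: a.name, Reason: "zero amount"}
		}
	}
	return nil
}

// sampleSwap builds a well-formed event for the demo below.
func sampleSwap() SwapEvent {
	return SwapEvent{
		Pool: Address{0x01}, TokenIn: Address{0x02}, TokenOut: Address{0x03},
		AmountIn: big.NewInt(1000), AmountOut: big.NewInt(990),
	}
}

func main() {
	ev := sampleSwap()
	fmt.Println("valid event:", ValidateSwap(ev))

	ev.TokenOut = Address{} // simulate the V1 failure mode
	fmt.Println("zero token:", ValidateSwap(ev))
}
```

Returning a typed error rather than a bare boolean keeps the rejected field available for logging, which fits the plan's requirement that rejections carry context.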
### 2. Parsing Accuracy Issues
- **Root Cause**: A monolithic parser handles all DEX types generically
- **Impact**: Missing token data, incorrect amounts, and unhandled protocol-specific edge cases
- **V2 Solution**: Per-exchange parsers with protocol-specific logic

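
One way the per-exchange approach could look is a small `Parser` interface with one implementation per protocol and a factory that routes by protocol. The sketch below is an assumption-laden illustration: `RawLog`, `ParsedEvent`, `Factory`, and the error values are hypothetical, and the parsers only check payload length (a Uniswap V2 `Swap` log carries four 32-byte words of data, a Uniswap V3 `Swap` log carries five) rather than decoding amounts.

```go
package main

import (
	"errors"
	"fmt"
)

// Protocol identifies a DEX type.
type Protocol string

const (
	UniswapV2 Protocol = "uniswap_v2"
	UniswapV3 Protocol = "uniswap_v3"
)

// RawLog is a hypothetical, reduced view of a swap log.
type RawLog struct {
	Pool string
	Data []byte // ABI-encoded, non-indexed event fields
}

// ParsedEvent is a hypothetical parser result.
type ParsedEvent struct {
	Protocol Protocol
	Pool     string
}

// Parser is implemented once per exchange type.
type Parser interface {
	Protocol() Protocol
	Parse(log RawLog) (ParsedEvent, error)
}

var (
	ErrUnknownProtocol = errors.New("no parser registered for protocol")
	ErrBadPayload      = errors.New("unexpected swap payload length")
)

type uniswapV2Parser struct{}

func (uniswapV2Parser) Protocol() Protocol { return UniswapV2 }

// Parse expects the four uint256 amounts of a V2 Swap log (4 x 32 bytes).
func (uniswapV2Parser) Parse(log RawLog) (ParsedEvent, error) {
	if len(log.Data) != 4*32 {
		return ParsedEvent{}, ErrBadPayload
	}
	return ParsedEvent{Protocol: UniswapV2, Pool: log.Pool}, nil
}

type uniswapV3Parser struct{}

func (uniswapV3Parser) Protocol() Protocol { return UniswapV3 }

// Parse expects the five data words of a V3 Swap log (5 x 32 bytes).
func (uniswapV3Parser) Parse(log RawLog) (ParsedEvent, error) {
	if len(log.Data) != 5*32 {
		return ParsedEvent{}, ErrBadPayload
	}
	return ParsedEvent{Protocol: UniswapV3, Pool: log.Pool}, nil
}

// Factory routes each log to the parser registered for its protocol.
type Factory struct{ parsers map[Protocol]Parser }

func NewFactory(ps ...Parser) *Factory {
	f := &Factory{parsers: make(map[Protocol]Parser, len(ps))}
	for _, p := range ps {
		f.parsers[p.Protocol()] = p
	}
	return f
}

func (f *Factory) Parse(proto Protocol, log RawLog) (ParsedEvent, error) {
	p, ok := f.parsers[proto]
	if !ok {
		return ParsedEvent{}, ErrUnknownProtocol
	}
	return p.Parse(log)
}

func main() {
	f := NewFactory(uniswapV2Parser{}, uniswapV3Parser{})
	ev, err := f.Parse(UniswapV3, RawLog{Pool: "0xpool", Data: make([]byte, 160)})
	fmt.Println(ev, err)
	_, err = f.Parse("curve", RawLog{}) // nothing registered for this protocol
	fmt.Println(err)
}
```

Under this design, adding an exchange means adding one type and registering it; the existing parsers stay untouched.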
### 3. No Data Quality Audit Trail
- **Root Cause**: Parsed data is never validated or compared against cached data
- **Impact**: Silent failures, no visibility into parsing degradation
- **V2 Solution**: Background validation channel with discrepancy logging

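
A background validation channel might be sketched as a buffered channel drained by one goroutine that compares parsed tokens with the cached ones and records every mismatch. All names here (`Auditor`, `Sample`, `Discrepancy`, `PoolTokens`) are hypothetical, and the non-blocking `Submit` is an assumption of this sketch: it sheds audit samples, not events, when the buffer is full so that the hot path never stalls.

```go
package main

import (
	"fmt"
	"sync"
)

// PoolTokens is the token pair known for a pool.
type PoolTokens struct{ Token0, Token1 string }

// Sample pairs a pool address with what the parser extracted for it.
type Sample struct {
	Pool   string
	Parsed PoolTokens
}

// Discrepancy is one audit-trail entry.
type Discrepancy struct {
	Pool           string
	Parsed, Cached PoolTokens
}

// Auditor compares samples against a cache on a background goroutine.
type Auditor struct {
	in    chan Sample
	cache map[string]PoolTokens // treated as read-only in this sketch
	mu    sync.Mutex
	found []Discrepancy
	done  chan struct{}
}

func NewAuditor(cache map[string]PoolTokens, buffer int) *Auditor {
	a := &Auditor{
		in:    make(chan Sample, buffer),
		cache: cache,
		done:  make(chan struct{}),
	}
	go a.run()
	return a
}

// Submit never blocks; it reports false if the buffer is full.
func (a *Auditor) Submit(s Sample) bool {
	select {
	case a.in <- s:
		return true
	default:
		return false
	}
}

func (a *Auditor) run() {
	defer close(a.done)
	for s := range a.in {
		cached, ok := a.cache[s.Pool]
		if ok && cached != s.Parsed {
			a.mu.Lock()
			a.found = append(a.found, Discrepancy{Pool: s.Pool, Parsed: s.Parsed, Cached: cached})
			a.mu.Unlock()
		}
	}
}

// Close stops intake, waits for the worker to drain the channel, and
// returns a copy of the audit trail.
func (a *Auditor) Close() []Discrepancy {
	close(a.in)
	<-a.done
	a.mu.Lock()
	defer a.mu.Unlock()
	return append([]Discrepancy(nil), a.found...)
}

func main() {
	cache := map[string]PoolTokens{"0xpool": {Token0: "WETH", Token1: "USDC"}}
	a := NewAuditor(cache, 16)
	a.Submit(Sample{Pool: "0xpool", Parsed: PoolTokens{Token0: "WETH", Token1: "USDC"}})
	a.Submit(Sample{Pool: "0xpool", Parsed: PoolTokens{Token0: "WETH"}}) // token1 missing
	for _, d := range a.Close() {
		fmt.Printf("discrepancy on %s: parsed=%+v cached=%+v\n", d.Pool, d.Parsed, d.Cached)
	}
}
```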
### 4. Inefficient Pool Lookups
- **Root Cause**: The cache has a single index (by address only)
- **Impact**: Slow arbitrage path discovery, no ranking by liquidity
- **V2 Solution**: Multi-index cache (address, token pair, protocol, liquidity)

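
A multi-index cache can be sketched as several maps that point at the same pool records. The types and method names below are hypothetical, `Liquidity` is reduced to a `uint64` for brevity, and the liquidity ranking is computed by sorting at query time; a dedicated liquidity index, as the plan proposes, would keep that ordering precomputed instead.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Pool is a hypothetical, simplified pool record.
type Pool struct {
	Address   string
	Token0    string
	Token1    string
	Protocol  string
	Liquidity uint64 // simplified; on-chain values need big integers
}

type pairKey struct{ a, b string }

// newPairKey orders the tokens so (A,B) and (B,A) share one bucket.
func newPairKey(t0, t1 string) pairKey {
	if t0 > t1 {
		t0, t1 = t1, t0
	}
	return pairKey{t0, t1}
}

// PoolCache keeps one record per pool, reachable through several indexes.
type PoolCache struct {
	mu         sync.RWMutex
	byAddress  map[string]*Pool
	byPair     map[pairKey][]*Pool
	byProtocol map[string][]*Pool
}

func NewPoolCache() *PoolCache {
	return &PoolCache{
		byAddress:  make(map[string]*Pool),
		byPair:     make(map[pairKey][]*Pool),
		byProtocol: make(map[string][]*Pool),
	}
}

// Add inserts a pool into every index. A duplicate address is ignored
// and reported as false.
func (c *PoolCache) Add(p Pool) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	if _, exists := c.byAddress[p.Address]; exists {
		return false
	}
	ptr := &p
	c.byAddress[p.Address] = ptr
	k := newPairKey(p.Token0, p.Token1)
	c.byPair[k] = append(c.byPair[k], ptr)
	c.byProtocol[p.Protocol] = append(c.byProtocol[p.Protocol], ptr)
	return true
}

func (c *PoolCache) ByAddress(addr string) (Pool, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	p, ok := c.byAddress[addr]
	if !ok {
		return Pool{}, false
	}
	return *p, true
}

// ByPair returns the pools for a token pair, deepest liquidity first.
func (c *PoolCache) ByPair(t0, t1 string) []Pool {
	c.mu.RLock()
	defer c.mu.RUnlock()
	out := copyPools(c.byPair[newPairKey(t0, t1)])
	sort.Slice(out, func(i, j int) bool { return out[i].Liquidity > out[j].Liquidity })
	return out
}

func (c *PoolCache) ByProtocol(proto string) []Pool {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return copyPools(c.byProtocol[proto])
}

func copyPools(src []*Pool) []Pool {
	out := make([]Pool, len(src))
	for i, p := range src {
		out[i] = *p
	}
	return out
}

func main() {
	c := NewPoolCache()
	c.Add(Pool{Address: "0xa", Token0: "WETH", Token1: "USDC", Protocol: "uniswap_v2", Liquidity: 100})
	c.Add(Pool{Address: "0xb", Token0: "USDC", Token1: "WETH", Protocol: "uniswap_v3", Liquidity: 900})
	for _, p := range c.ByPair("WETH", "USDC") {
		fmt.Println(p.Address, p.Protocol, p.Liquidity)
	}
}
```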
### 5. Stats Disconnection
- **Root Cause**: Events are detected but not reflected in stats
- **Impact**: Monitoring blind spots and unclear system health
- **V2 Solution**: Event-driven metrics with guaranteed consistency

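
One reading of "event-driven metrics" is that the counter update happens in the same function that decides an event's fate, so a detected event cannot bypass the stats. The sketch below assumes that interpretation; `Stats`, `Outcome`, and `Handle` are hypothetical names.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Outcome is the final disposition of a detected event.
type Outcome int

const (
	Accepted Outcome = iota
	Rejected
	numOutcomes
)

// Stats counts events by outcome. The detected total is derived from
// the per-outcome counters, so the two cannot drift apart.
type Stats struct {
	counts [numOutcomes]uint64
}

// Record increments the counter for one outcome; safe for concurrent use.
func (s *Stats) Record(o Outcome) { atomic.AddUint64(&s.counts[o], 1) }

// Count returns the number of events recorded with the given outcome.
func (s *Stats) Count(o Outcome) uint64 { return atomic.LoadUint64(&s.counts[o]) }

// Detected returns the total number of events seen, across all outcomes.
func (s *Stats) Detected() uint64 {
	var total uint64
	for o := Outcome(0); o < numOutcomes; o++ {
		total += s.Count(o)
	}
	return total
}

// Handle is the single place where an event is forwarded or dropped,
// and it records the outcome on every path.
func Handle(s *Stats, valid bool, forward func()) {
	if !valid {
		s.Record(Rejected)
		return
	}
	forward()
	s.Record(Accepted)
}

func main() {
	var s Stats
	forwarded := 0
	for i := 0; i < 10; i++ {
		Handle(&s, i%5 != 0, func() { forwarded++ }) // events 0 and 5 are invalid
	}
	fmt.Println("detected:", s.Detected(), "accepted:", s.Count(Accepted),
		"rejected:", s.Count(Rejected), "forwarded:", forwarded)
}
```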
## V2 Architecture Principles

### 1. **Fail-Fast with Visibility**
- Reject invalid data immediately at source
- Log all rejections with detailed context
- Never allow garbage data to propagate

### 2. **Single Responsibility**
- One parser per exchange type
- One validator per data type
- One cache per index type

### 3. **Observable by Default**
- Every component emits metrics
- Every operation is logged
- Every error has context

### 4. **Self-Healing**
- Automatic retry with exponential backoff
- Fallback to cache when RPC fails
- Circuit breakers to contain cascading failures

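
The first two self-healing behaviors can be sketched together: retry a call with a doubling, capped delay, and serve the cached value if every attempt fails. `Retry`, `WithFallback`, `errRPC`, and the chosen attempt count and delays are assumptions made for this illustration; a circuit breaker is not shown.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

var errRPC = errors.New("rpc unavailable")

// Retry calls op until it succeeds, the attempts are used up, or ctx is
// done. The wait doubles after each failure and is capped at maxDelay.
func Retry(ctx context.Context, attempts int, base, maxDelay time.Duration, op func() error) error {
	if attempts < 1 {
		attempts = 1
	}
	delay := base
	var err error
	for i := 0; i < attempts; i++ {
		if err = op(); err == nil {
			return nil
		}
		if i == attempts-1 {
			break // no point sleeping after the final attempt
		}
		select {
		case <-time.After(delay):
		case <-ctx.Done():
			return ctx.Err()
		}
		if delay *= 2; delay > maxDelay {
			delay = maxDelay
		}
	}
	return fmt.Errorf("after %d attempts: %w", attempts, err)
}

// WithFallback retries fetch a few times and, if it still fails, serves
// the cached value. The second result reports whether the value is fresh.
func WithFallback(fetch func() (uint64, error), cached uint64) (uint64, bool) {
	var value uint64
	err := Retry(context.Background(), 3, time.Millisecond, 4*time.Millisecond, func() error {
		v, err := fetch()
		if err == nil {
			value = v
		}
		return err
	})
	if err != nil {
		return cached, false
	}
	return value, true
}

func main() {
	calls := 0
	flaky := func() (uint64, error) { // fails twice, then succeeds
		calls++
		if calls < 3 {
			return 0, errRPC
		}
		return 42, nil
	}
	fmt.Println(WithFallback(flaky, 7))

	down := func() (uint64, error) { return 0, errRPC }
	fmt.Println(WithFallback(down, 7))
}
```

Returning a freshness flag alongside the value lets callers decide whether stale data is acceptable for a given decision.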
### 5. **Test-Driven**
- Unit tests for every parser
- Integration tests for full pipeline
- Chaos testing for failure scenarios

## High-Level Component Architecture

```
┌─────────────────────────────────────────────────────────────┐
│                      Arbitrum Monitor                       │
│  - WebSocket subscription                                   │
│  - Transaction/receipt buffering                            │
│  - Rate limiting & connection management                    │
└────────────────┬────────────────────────────────────────────┘
                 │
                 ├─ Transactions & Receipts
                 │
                 ▼
┌─────────────────────────────────────────────────────────────┐
│                       Parser Factory                        │
│  - Route to correct parser based on protocol                │
│  - Manage parser lifecycle                                  │
└────────────────┬────────────────────────────────────────────┘
                 │
     ┌───────────┼───────────┬───────────┬───────────┐
     │           │           │           │           │
     ▼           ▼           ▼           ▼           ▼
┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐
│ Uniswap │ │ Uniswap │ │SushiSwap│ │ Camelot │ │  Curve  │
│V2 Parser│ │V3 Parser│ │ Parser  │ │ Parser  │ │ Parser  │
└────┬────┘ └────┬────┘ └────┬────┘ └────┬────┘ └────┬────┘
     │           │           │           │           │
     └───────────┴───────────┼───────────┴───────────┘
                             │
                             ▼
         ┌────────────────────────────────────────┐
         │         Event Validation Layer         │
         │  - Check zero addresses                │
         │  - Check zero amounts                  │
         │  - Validate against pool cache         │
         │  - Log discrepancies                   │
         └───────────────────┬────────────────────┘
                             │
                  ┌──────────┴──────────┐
                  │                     │
                  ▼                     ▼
           ┌─────────────┐     ┌──────────────────┐
           │   Scanner   │     │    Background    │
           │   (Valid    │     │    Validation    │
           │   Events)   │     │     Channel      │
           └─────────────┘     │  (Audit Trail)   │
                               └──────────────────┘
```

## V2 Directory Structure

```
mev-bot/
├── orig/                          # V1 codebase preserved
│   ├── cmd/
│   ├── pkg/
│   ├── internal/
│   └── config/
│
├── docs/
│   └── planning/                  # V2 planning documents
│       ├── 00_V2_MASTER_PLAN.md
│       ├── 01_PARSER_ARCHITECTURE.md
│       ├── 02_VALIDATION_PIPELINE.md
│       ├── 03_POOL_CACHE_SYSTEM.md
│       ├── 04_METRICS_OBSERVABILITY.md
│       ├── 05_DATA_FLOW.md
│       ├── 06_IMPLEMENTATION_PHASES.md
│       └── 07_TASK_BREAKDOWN.md
│
├── cmd/
│   └── mev-bot/
│       └── main.go                # New V2 entry point
│
├── pkg/
│   ├── parsers/                   # NEW: Per-exchange parsers
│   │   ├── factory.go
│   │   ├── interface.go
│   │   ├── uniswap_v2.go
│   │   ├── uniswap_v3.go
│   │   ├── sushiswap.go
│   │   ├── camelot.go
│   │   └── curve.go
│   │
│   ├── validation/                # NEW: Validation pipeline
│   │   ├── validator.go
│   │   ├── rules.go
│   │   ├── background.go
│   │   └── metrics.go
│   │
│   ├── cache/                     # NEW: Multi-index cache
│   │   ├── pool_cache.go
│   │   ├── index_by_address.go
│   │   ├── index_by_tokens.go
│   │   ├── index_by_liquidity.go
│   │   └── index_by_protocol.go
│   │
│   ├── discovery/                 # Pool discovery system
│   │   ├── scanner.go
│   │   ├── factory_watcher.go
│   │   └── blacklist.go
│   │
│   ├── monitor/                   # Arbitrum monitoring
│   │   ├── sequencer.go
│   │   ├── connection.go
│   │   └── rate_limiter.go
│   │
│   ├── events/                    # Event types and handling
│   │   ├── types.go
│   │   ├── router.go
│   │   └── processor.go
│   │
│   ├── arbitrage/                 # Arbitrage detection
│   │   ├── detector.go
│   │   ├── calculator.go
│   │   └── executor.go
│   │
│   └── observability/             # NEW: Metrics & logging
│       ├── metrics.go
│       ├── logger.go
│       ├── tracing.go
│       └── health.go
│
├── internal/
│   ├── config/                    # Configuration management
│   └── utils/                     # Shared utilities
│
└── tests/
    ├── unit/                      # Unit tests
    ├── integration/               # Integration tests
    └── e2e/                       # End-to-end tests
```

## Implementation Phases

### Phase 1: Foundation (Weeks 1-2)
**Goal**: Set up V2 project structure and core interfaces

**Tasks**:
1. Create V2 directory structure
2. Define all interfaces (Parser, Validator, Cache, etc.)
3. Set up logging and metrics infrastructure
4. Create base test framework
5. Implement connection management

### Phase 2: Parser Refactor (Weeks 3-5)
**Goal**: Implement per-exchange parsers with validation

**Tasks**:
1. Create Parser interface and factory
2. Implement UniswapV2 parser with tests
3. Implement UniswapV3 parser with tests
4. Implement SushiSwap parser with tests
5. Implement Camelot parser with tests
6. Implement Curve parser with tests
7. Add strict validation layer
8. Integration testing

### Phase 3: Cache System (Weeks 6-7)
**Goal**: Multi-index pool cache with efficient lookups

**Tasks**:
1. Design cache schema
2. Implement address index
3. Implement token-pair index
4. Implement liquidity ranking index
5. Implement protocol index
6. Add cache persistence
7. Add cache invalidation logic
8. Performance testing

### Phase 4: Validation Pipeline (Weeks 8-9)
**Goal**: Background validation with audit trails

**Tasks**:
1. Create validation channel
2. Implement background validator goroutine
3. Add comparison logic (parsed vs cached)
4. Implement discrepancy logging
5. Create validation metrics
6. Add alerting for validation failures
7. Integration testing

### Phase 5: Migration & Testing (Weeks 10-12)
**Goal**: Migrate from V1 to V2, comprehensive testing

**Tasks**:
1. Create migration path
2. Run parallel systems (V1 and V2)
3. Compare outputs
4. Fix discrepancies
5. Load testing
6. Chaos testing
7. Production deployment
8. Monitoring setup

## Success Metrics

### Parsing Accuracy
- **Zero Address Rate**: < 0.01% (target: 0%)
- **Zero Amount Rate**: < 0.01% (target: 0%)
- **Validation Failure Rate**: < 0.5%
- **Cache Hit Rate**: > 95%

### Performance
- **Parse Time**: < 1ms per event (p99)
- **Cache Lookup**: < 0.1ms (p99)
- **End-to-end Latency**: < 10ms from receipt to scanner

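
The p99 targets above could be checked against recorded latencies with a nearest-rank percentile. The sketch below is only an illustration: `Percentile` and `WithinBudget` are hypothetical helpers, the samples are synthetic, and latencies are expressed in microseconds so that the 1ms parse budget becomes 1000.

```go
package main

import (
	"fmt"
	"sort"
)

// Percentile returns the nearest-rank p-th percentile (1 <= p <= 100)
// of samples. It reports false when there are no samples or p is out
// of range.
func Percentile(samples []float64, p int) (float64, bool) {
	if len(samples) == 0 || p < 1 || p > 100 {
		return 0, false
	}
	sorted := append([]float64(nil), samples...) // leave the caller's slice untouched
	sort.Float64s(sorted)
	rank := (p*len(sorted) + 99) / 100 // ceil(p/100 * n) in integer arithmetic
	return sorted[rank-1], true
}

// WithinBudget reports whether the p99 of samples is strictly below budget.
func WithinBudget(samples []float64, budget float64) bool {
	p99, ok := Percentile(samples, 99)
	return ok && p99 < budget
}

func main() {
	// Synthetic parse times in microseconds: 10, 20, ..., 1000.
	parseMicros := make([]float64, 0, 100)
	for i := 1; i <= 100; i++ {
		parseMicros = append(parseMicros, float64(i)*10)
	}
	p99, _ := Percentile(parseMicros, 99)
	fmt.Println("p99 (µs):", p99, "within 1ms budget:", WithinBudget(parseMicros, 1000))
}
```

With 100 samples, a p99 target tolerates one slow outlier but not two, which is worth keeping in mind when choosing the measurement window.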
### Reliability
- **Uptime**: > 99.9%
- **Data Discrepancy Rate**: < 0.1%
- **Event Drop Rate**: 0%

### Observability
- **All Events Logged**: 100%
- **All Rejections Logged**: 100%
- **Metrics Coverage**: 100% of components

## Risk Mitigation

### Risk: Breaking Changes During Migration
**Mitigation**:
- Run V1 and V2 in parallel
- Compare outputs
- Gradual rollout with feature flags

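
Comparing the outputs of a parallel run might be as simple as diffing the two systems' results keyed by transaction hash. The sketch below assumes each system's output has been reduced to a canonical string per transaction; `Report`, `Compare`, and that encoding are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// Report lists transaction hashes whose outputs differ between runs.
type Report struct {
	MissingInV2 []string // emitted by V1 only
	ExtraInV2   []string // emitted by V2 only
	Mismatched  []string // emitted by both, with different content
}

// Clean reports whether the two runs agreed on everything.
func (r Report) Clean() bool {
	return len(r.MissingInV2)+len(r.ExtraInV2)+len(r.Mismatched) == 0
}

// Compare diffs two runs keyed by transaction hash. Each list in the
// result is sorted so that reports are stable across runs.
func Compare(v1, v2 map[string]string) Report {
	var r Report
	for hash, a := range v1 {
		b, ok := v2[hash]
		switch {
		case !ok:
			r.MissingInV2 = append(r.MissingInV2, hash)
		case a != b:
			r.Mismatched = append(r.Mismatched, hash)
		}
	}
	for hash := range v2 {
		if _, ok := v1[hash]; !ok {
			r.ExtraInV2 = append(r.ExtraInV2, hash)
		}
	}
	sort.Strings(r.MissingInV2)
	sort.Strings(r.ExtraInV2)
	sort.Strings(r.Mismatched)
	return r
}

func main() {
	v1 := map[string]string{"0x01": "swap:100", "0x02": "swap:200", "0x03": "swap:0"}
	v2 := map[string]string{"0x01": "swap:100", "0x02": "swap:250", "0x04": "swap:400"}
	fmt.Printf("%+v\n", Compare(v1, v2))
}
```

Not every entry in `MissingInV2` is necessarily a regression: V2 is designed to reject events that V1 let through with zero addresses or amounts, so those entries would need to be triaged rather than treated as failures.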
### Risk: Performance Degradation
**Mitigation**:
- Comprehensive benchmarking
- Load testing before deployment
- Circuit breakers for cascading failures

### Risk: Incomplete Test Coverage
**Mitigation**:
- TDD approach for all new code
- Minimum 90% test coverage requirement
- Integration and E2E tests mandatory

### Risk: Data Quality Regression
**Mitigation**:
- Continuous validation against Arbiscan
- Alerting on validation failures
- Automated rollback on critical issues

## Next Steps

1. Review and approve this master plan
2. Read detailed component plans in subsequent documents
3. Review task breakdown in `07_TASK_BREAKDOWN.md`
4. Begin Phase 1 implementation

---

**Document Status**: Draft for Review
**Created**: 2025-11-10
**Last Updated**: 2025-11-10
**Version**: 1.0