feat: create v2-prep branch with comprehensive planning

Restructured project for V2 refactor:

**Structure Changes:**
- Moved all V1 code to orig/ folder (preserved with git mv)
- Created docs/planning/ directory
- Added orig/README_V1.md explaining V1 preservation

**Planning Documents:**
- 00_V2_MASTER_PLAN.md: Complete architecture overview
  - Executive summary of critical V1 issues
  - High-level component architecture diagrams
  - 5-phase implementation roadmap
  - Success metrics and risk mitigation
- 07_TASK_BREAKDOWN.md: Atomic task breakdown
  - 99+ hours of detailed tasks
  - Every task < 2 hours (atomic)
  - Clear dependencies and success criteria
  - Organized by implementation phase

**V2 Key Improvements:**
- Per-exchange parsers (factory pattern)
- Multi-layer strict validation
- Multi-index pool cache
- Background validation pipeline
- Comprehensive observability

**Critical Issues Addressed:**
- Zero address tokens (strict validation + cache enrichment)
- Parsing accuracy (protocol-specific parsers)
- No audit trail (background validation channel)
- Inefficient lookups (multi-index cache)
- Stats disconnection (event-driven metrics)

Next Steps:
1. Review planning documents
2. Begin Phase 1: Foundation (P1-001 through P1-010)
3. Implement parsers in Phase 2
4. Build cache system in Phase 3
5. Add validation pipeline in Phase 4
6. Migrate and test in Phase 5

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-10 10:14:26 +01:00
parent 1773daffe7
commit 803de231ba
411 changed files with 20390 additions and 8680 deletions
--- a/docs/planning/00_V2_MASTER_PLAN.md
+++ b/docs/planning/00_V2_MASTER_PLAN.md
@@ -0,0 +1,324 @@
+# MEV Bot V2 - Master Architecture Plan
+
+## Executive Summary
+
+V2 represents a complete architectural overhaul addressing critical parsing, validation, and scalability issues identified in V1. The rebuild focuses on:
+
+1. **Zero Tolerance for Invalid Data**: Eliminate all zero addresses and zero amounts
+2. **Per-Exchange Parser Architecture**: Individual parsers for each DEX type
+3. **Real-time Validation Pipeline**: Background validation with audit trails
+4. **Scalable Pool Discovery**: Efficient caching and multi-index lookups
+5. **Observable System**: Comprehensive metrics, logging, and health monitoring
+
+## Critical Issues from V1
+
+### 1. Zero Address/Amount Problems
+- **Root Cause**: Parser returns zero addresses when transaction data is unavailable
+- **Impact**: Invalid events submitted to scanner, wasted computation
+- **V2 Solution**: Strict validation at multiple layers + pool cache enrichment
+
+### 2. Parsing Accuracy Issues
+- **Root Cause**: Monolithic parser handling all DEX types generically
+- **Impact**: Missing token data, incorrect amounts, protocol-specific edge cases
+- **V2 Solution**: Per-exchange parsers with protocol-specific logic
+
+### 3. No Data Quality Audit Trail
+- **Root Cause**: No validation or comparison of parsed data vs cached data
+- **Impact**: Silent failures, no visibility into parsing degradation
+- **V2 Solution**: Background validation channel with discrepancy logging
+
+### 4. Inefficient Pool Lookups
+- **Root Cause**: Single-index cache (by address only)
+- **Impact**: Slow arbitrage path discovery, no ranking by liquidity
+- **V2 Solution**: Multi-index cache (address, token pair, protocol, liquidity)
+
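One possible shape for the multi-index cache from issue 4 is sketched here. All names (`Pool`, `PoolCache`, `ByPair`, and so on) are assumptions for illustration. Addresses are plain strings and liquidity is a `uint64` to keep the example short, and ranking by liquidity is done at query time rather than through the separate liquidity index the plan lists.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Pool is a simplified pool record for this sketch.
type Pool struct {
	Address   string
	Token0    string
	Token1    string
	Protocol  string
	Liquidity uint64
}

// pairKey stores the two tokens in sorted order so (A,B) and (B,A) match.
type pairKey struct{ a, b string }

func newPairKey(t0, t1 string) pairKey {
	if t0 > t1 {
		t0, t1 = t1, t0
	}
	return pairKey{t0, t1}
}

// PoolCache keeps one map per index; every index points at the same records.
type PoolCache struct {
	mu         sync.RWMutex
	byAddress  map[string]*Pool
	byPair     map[pairKey][]*Pool
	byProtocol map[string][]*Pool
}

func NewPoolCache() *PoolCache {
	return &PoolCache{
		byAddress:  make(map[string]*Pool),
		byPair:     make(map[pairKey][]*Pool),
		byProtocol: make(map[string][]*Pool),
	}
}

// Add inserts a pool into every index. Re-adding a known address is ignored
// here; a real cache would need update and invalidation logic.
func (c *PoolCache) Add(p Pool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if _, ok := c.byAddress[p.Address]; ok {
		return
	}
	ptr := &p
	c.byAddress[p.Address] = ptr
	k := newPairKey(p.Token0, p.Token1)
	c.byPair[k] = append(c.byPair[k], ptr)
	c.byProtocol[p.Protocol] = append(c.byProtocol[p.Protocol], ptr)
}

// ByAddress looks up a single pool by its address.
func (c *PoolCache) ByAddress(addr string) (Pool, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	p, ok := c.byAddress[addr]
	if !ok {
		return Pool{}, false
	}
	return *p, true
}

// ByPair returns copies of all pools for a token pair, highest liquidity first.
func (c *PoolCache) ByPair(t0, t1 string) []Pool {
	c.mu.RLock()
	defer c.mu.RUnlock()
	src := c.byPair[newPairKey(t0, t1)]
	out := make([]Pool, 0, len(src))
	for _, p := range src {
		out = append(out, *p)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Liquidity > out[j].Liquidity })
	return out
}

func main() {
	c := NewPoolCache()
	c.Add(Pool{Address: "0xA", Token0: "WETH", Token1: "USDC", Protocol: "uniswap_v2", Liquidity: 100})
	c.Add(Pool{Address: "0xB", Token0: "USDC", Token1: "WETH", Protocol: "uniswap_v3", Liquidity: 500})
	for _, p := range c.ByPair("WETH", "USDC") {
		fmt.Println(p.Address, p.Protocol, p.Liquidity)
	}
}
```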
+### 5. Stats Disconnection
+- **Root Cause**: Events detected but not reflected in stats
+- **Impact**: Monitoring blindness, unclear system health
+- **V2 Solution**: Event-driven metrics with guaranteed consistency
+
+## V2 Architecture Principles
+
+### 1. **Fail-Fast with Visibility**
+- Reject invalid data immediately at source
+- Log all rejections with detailed context
+- Never allow garbage data to propagate
+
+### 2. **Single Responsibility**
+- One parser per exchange type
+- One validator per data type
+- One cache per index type
+
+### 3. **Observable by Default**
+- Every component emits metrics
+- Every operation is logged
+- Every error has context
+
+### 4. **Self-Healing**
+- Automatic retry with exponential backoff
+- Fallback to cache when RPC fails
+- Circuit breakers for cascading failures
+
+### 5. **Test-Driven**
+- Unit tests for every parser
+- Integration tests for full pipeline
+- Chaos testing for failure scenarios
+
+## High-Level Component Architecture
+
+```
+┌─────────────────────────────────────────────────────────────┐
+│                     Arbitrum Monitor                         │
+│  - WebSocket subscription                                   │
+│  - Transaction/receipt buffering                            │
+│  - Rate limiting & connection management                    │
+└───────────────┬─────────────────────────────────────────────┘
+                │
+                ├─ Transactions & Receipts
+                │
+                ▼
+┌─────────────────────────────────────────────────────────────┐
+│                  Parser Factory                              │
+│  - Route to correct parser based on protocol                │
+│  - Manage parser lifecycle                                  │
+└───────────────┬─────────────────────────────────────────────┘
+                │
+     ┌──────────┼──────────┬──────────┬──────────┐
+     │          │          │          │          │
+     ▼          ▼          ▼          ▼          ▼
+┌─────────┐ ┌──────────┐ ┌─────────┐ ┌──────────┐ ┌────────┐
+│Uniswap  │ │Uniswap   │ │SushiSwap│ │ Camelot  │ │ Curve  │
+│V2 Parser│ │V3 Parser │ │ Parser  │ │  Parser  │ │ Parser │
+└────┬────┘ └────┬─────┘ └────┬────┘ └────┬─────┘ └───┬────┘
+     │           │            │           │           │
+     └───────────┴────────────┴───────────┴───────────┘
+                              │
+                              ▼
+         ┌────────────────────────────────────────┐
+         │        Event Validation Layer          │
+         │  - Check zero addresses                │
+         │  - Check zero amounts                  │
+         │  - Validate against pool cache         │
+         │  - Log discrepancies                   │
+         └────────────┬───────────────────────────┘
+                      │
+           ┌──────────┴──────────┐
+           │                     │
+           ▼                     ▼
+    ┌─────────────┐      ┌──────────────────┐
+    │   Scanner   │      │ Background       │
+    │   (Valid    │      │ Validation       │
+    │   Events)   │      │ Channel          │
+    └─────────────┘      │ (Audit Trail)    │
+                         └──────────────────┘
+```
+
+## V2 Directory Structure
+
+```
+mev-bot/
+├── orig/                          # V1 codebase preserved
+│   ├── cmd/
+│   ├── pkg/
+│   ├── internal/
+│   └── config/
+│
+├── docs/
+│   └── planning/                  # V2 planning documents
+│       ├── 00_V2_MASTER_PLAN.md
+│       ├── 01_PARSER_ARCHITECTURE.md
+│       ├── 02_VALIDATION_PIPELINE.md
+│       ├── 03_POOL_CACHE_SYSTEM.md
+│       ├── 04_METRICS_OBSERVABILITY.md
+│       ├── 05_DATA_FLOW.md
+│       ├── 06_IMPLEMENTATION_PHASES.md
+│       └── 07_TASK_BREAKDOWN.md
+│
+├── cmd/
+│   └── mev-bot/
+│       └── main.go                # New V2 entry point
+│
+├── pkg/
+│   ├── parsers/                   # NEW: Per-exchange parsers
+│   │   ├── factory.go
+│   │   ├── interface.go
+│   │   ├── uniswap_v2.go
+│   │   ├── uniswap_v3.go
+│   │   ├── sushiswap.go
+│   │   ├── camelot.go
+│   │   └── curve.go
+│   │
+│   ├── validation/                # NEW: Validation pipeline
+│   │   ├── validator.go
+│   │   ├── rules.go
+│   │   ├── background.go
+│   │   └── metrics.go
+│   │
+│   ├── cache/                     # NEW: Multi-index cache
+│   │   ├── pool_cache.go
+│   │   ├── index_by_address.go
+│   │   ├── index_by_tokens.go
+│   │   ├── index_by_liquidity.go
+│   │   └── index_by_protocol.go
+│   │
+│   ├── discovery/                 # Pool discovery system
+│   │   ├── scanner.go
+│   │   ├── factory_watcher.go
+│   │   └── blacklist.go
+│   │
+│   ├── monitor/                   # Arbitrum monitoring
+│   │   ├── sequencer.go
+│   │   ├── connection.go
+│   │   └── rate_limiter.go
+│   │
+│   ├── events/                    # Event types and handling
+│   │   ├── types.go
+│   │   ├── router.go
+│   │   └── processor.go
+│   │
+│   ├── arbitrage/                 # Arbitrage detection
+│   │   ├── detector.go
+│   │   ├── calculator.go
+│   │   └── executor.go
+│   │
+│   └── observability/             # NEW: Metrics & logging
+│       ├── metrics.go
+│       ├── logger.go
+│       ├── tracing.go
+│       └── health.go
+│
+├── internal/
+│   ├── config/                    # Configuration management
+│   └── utils/                     # Shared utilities
+│
+└── tests/
+    ├── unit/                      # Unit tests
+    ├── integration/               # Integration tests
+    └── e2e/                       # End-to-end tests
+```
+
+## Implementation Phases
+
+### Phase 1: Foundation (Weeks 1-2)
+**Goal**: Set up V2 project structure and core interfaces
+
+**Tasks**:
+1. Create V2 directory structure
+2. Define all interfaces (Parser, Validator, Cache, etc.)
+3. Set up logging and metrics infrastructure
+4. Create base test framework
+5. Implement connection management
+
+### Phase 2: Parser Refactor (Weeks 3-5)
+**Goal**: Implement per-exchange parsers with validation
+
+**Tasks**:
+1. Create Parser interface and factory
+2. Implement UniswapV2 parser with tests
+3. Implement UniswapV3 parser with tests
+4. Implement SushiSwap parser with tests
+5. Implement Camelot parser with tests
+6. Implement Curve parser with tests
+7. Add strict validation layer
+8. Integration testing
+
+### Phase 3: Cache System (Weeks 6-7)
+**Goal**: Multi-index pool cache with efficient lookups
+
+**Tasks**:
+1. Design cache schema
+2. Implement address index
+3. Implement token-pair index
+4. Implement liquidity ranking index
+5. Implement protocol index
+6. Add cache persistence
+7. Add cache invalidation logic
+8. Performance testing
+
+### Phase 4: Validation Pipeline (Weeks 8-9)
+**Goal**: Background validation with audit trails
+
+**Tasks**:
+1. Create validation channel
+2. Implement background validator goroutine
+3. Add comparison logic (parsed vs cached)
+4. Implement discrepancy logging
+5. Create validation metrics
+6. Add alerting for validation failures
+7. Integration testing
+
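Tasks 1 to 4 of Phase 4 could be prototyped along the lines below: a buffered channel, one background goroutine, and a comparison of parsed against cached data. Every name here (`Observation`, `BackgroundValidator`, `Submit`) is hypothetical, and the comparison is reduced to a single token field. Dropping audit samples when the buffer is full is an assumption made so the hot path never blocks; the plan does not state how backpressure should be handled.

```go
package main

import (
	"fmt"
	"sync"
)

// Observation pairs what the parser produced with what the cache holds.
type Observation struct {
	Pool         string
	ParsedToken0 string
	CachedToken0 string
}

// BackgroundValidator compares observations off the hot path.
type BackgroundValidator struct {
	ch            chan Observation
	wg            sync.WaitGroup
	mu            sync.Mutex
	discrepancies []string
}

func NewBackgroundValidator(buffer int) *BackgroundValidator {
	v := &BackgroundValidator{ch: make(chan Observation, buffer)}
	v.wg.Add(1)
	go v.run()
	return v
}

// run drains the channel until it is closed, recording every mismatch.
func (v *BackgroundValidator) run() {
	defer v.wg.Done()
	for o := range v.ch {
		if o.ParsedToken0 != o.CachedToken0 {
			v.mu.Lock()
			v.discrepancies = append(v.discrepancies,
				fmt.Sprintf("pool %s: parsed token0 %s, cached %s", o.Pool, o.ParsedToken0, o.CachedToken0))
			v.mu.Unlock()
		}
	}
}

// Submit never blocks; it reports false when the buffer is full.
func (v *BackgroundValidator) Submit(o Observation) bool {
	select {
	case v.ch <- o:
		return true
	default:
		return false
	}
}

// Close stops the goroutine and returns the recorded discrepancies.
// Submit must not be called after Close.
func (v *BackgroundValidator) Close() []string {
	close(v.ch)
	v.wg.Wait()
	return v.discrepancies
}

func main() {
	v := NewBackgroundValidator(16)
	v.Submit(Observation{Pool: "0xA", ParsedToken0: "WETH", CachedToken0: "WETH"})
	v.Submit(Observation{Pool: "0xB", ParsedToken0: "0x0", CachedToken0: "USDC"})
	for _, d := range v.Close() {
		fmt.Println(d)
	}
}
```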
+### Phase 5: Migration & Testing (Weeks 10-12)
+**Goal**: Migrate from V1 to V2, comprehensive testing
+
+**Tasks**:
+1. Create migration path
+2. Run parallel systems (V1 and V2)
+3. Compare outputs
+4. Fix discrepancies
+5. Load testing
+6. Chaos testing
+7. Production deployment
+8. Monitoring setup
+
+## Success Metrics
+
+### Parsing Accuracy
+- **Zero Address Rate**: < 0.01% (target: 0%)
+- **Zero Amount Rate**: < 0.01% (target: 0%)
+- **Validation Failure Rate**: < 0.5%
+- **Cache Hit Rate**: > 95%
+
+### Performance
+- **Parse Time**: < 1ms per event (p99)
+- **Cache Lookup**: < 0.1ms (p99)
+- **End-to-end Latency**: < 10ms from receipt to scanner
+
+### Reliability
+- **Uptime**: > 99.9%
+- **Data Discrepancy Rate**: < 0.1%
+- **Event Drop Rate**: 0%
+
+### Observability
+- **All Events Logged**: 100%
+- **All Rejections Logged**: 100%
+- **Metrics Coverage**: 100% of components
+
+## Risk Mitigation
+
+### Risk: Breaking Changes During Migration
+**Mitigation**:
+- Run V1 and V2 in parallel
+- Compare outputs
+- Gradual rollout with feature flags
+
+### Risk: Performance Degradation
+**Mitigation**:
+- Comprehensive benchmarking
+- Load testing before deployment
+- Circuit breakers for cascading failures
+
+### Risk: Incomplete Test Coverage
+**Mitigation**:
+- TDD approach for all new code
+- Minimum 90% test coverage requirement
+- Integration and E2E tests mandatory
+
+### Risk: Data Quality Regression
+**Mitigation**:
+- Continuous validation against Arbiscan
+- Alerting on validation failures
+- Automated rollback on critical issues
+
+## Next Steps
+
+1. Review and approve this master plan
+2. Read detailed component plans in subsequent documents
+3. Review task breakdown in `07_TASK_BREAKDOWN.md`
+4. Begin Phase 1 implementation
+
+---
+
+**Document Status**: Draft for Review
+**Created**: 2025-11-10
+**Last Updated**: 2025-11-10
+**Version**: 1.0
--- a/docs/planning/07_TASK_BREAKDOWN.md
+++ b/docs/planning/07_TASK_BREAKDOWN.md