feat: create v2-prep branch with comprehensive planning

Restructured project for V2 refactor:

**Structure Changes:**
- Moved all V1 code to orig/ folder (preserved with git mv)
- Created docs/planning/ directory
- Added orig/README_V1.md explaining V1 preservation

**Planning Documents:**
- 00_V2_MASTER_PLAN.md: Complete architecture overview
  - Executive summary of critical V1 issues
  - High-level component architecture diagrams
  - 5-phase implementation roadmap
  - Success metrics and risk mitigation

- 07_TASK_BREAKDOWN.md: Atomic task breakdown
  - 99+ hours of detailed tasks
  - Every task < 2 hours (atomic)
  - Clear dependencies and success criteria
  - Organized by implementation phase

**V2 Key Improvements:**
- Per-exchange parsers (factory pattern)
- Multi-layer strict validation
- Multi-index pool cache
- Background validation pipeline
- Comprehensive observability

**Critical Issues Addressed:**
- Zero address tokens (strict validation + cache enrichment)
- Parsing accuracy (protocol-specific parsers)
- No audit trail (background validation channel)
- Inefficient lookups (multi-index cache)
- Stats disconnection (event-driven metrics)

Next Steps:
1. Review planning documents
2. Begin Phase 1: Foundation (P1-001 through P1-010)
3. Implement parsers in Phase 2
4. Build cache system in Phase 3
5. Add validation pipeline in Phase 4
6. Migrate and test in Phase 5

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Administrator
2025-11-10 10:14:26 +01:00
parent 1773daffe7
commit 803de231ba
411 changed files with 20390 additions and 8680 deletions

@@ -0,0 +1,324 @@
# MEV Bot V2 - Master Architecture Plan
## Executive Summary
V2 represents a complete architectural overhaul addressing critical parsing, validation, and scalability issues identified in V1. The rebuild focuses on:
1. **Zero Tolerance for Invalid Data**: Eliminate all zero addresses and zero amounts
2. **Per-Exchange Parser Architecture**: Individual parsers for each DEX type
3. **Real-time Validation Pipeline**: Background validation with audit trails
4. **Scalable Pool Discovery**: Efficient caching and multi-index lookups
5. **Observable System**: Comprehensive metrics, logging, and health monitoring
## Critical Issues from V1
### 1. Zero Address/Amount Problems
- **Root Cause**: Parser returns zero addresses when transaction data is unavailable
- **Impact**: Invalid events submitted to scanner, wasted computation
- **V2 Solution**: Strict validation at multiple layers + pool cache enrichment
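A minimal sketch of what the strict check could look like at one layer. The `SwapEvent` type, its field names, and the error values are assumptions made for illustration; the plan does not define them.

```go
package main

import (
	"errors"
	"fmt"
	"math/big"
)

// Address is a 20-byte EVM address; its zero value is the zero address.
type Address [20]byte

// SwapEvent is a hypothetical parsed swap event (field names are assumptions).
type SwapEvent struct {
	Pool, TokenIn, TokenOut Address
	AmountIn, AmountOut     *big.Int
}

var (
	ErrZeroAddress = errors.New("zero address")
	ErrZeroAmount  = errors.New("zero or missing amount")
)

// Validate returns an error naming the first field that is a zero address
// or a nil/non-positive amount; it returns nil for a clean event.
func Validate(e SwapEvent) error {
	addrs := []struct {
		name string
		addr Address
	}{{"pool", e.Pool}, {"tokenIn", e.TokenIn}, {"tokenOut", e.TokenOut}}
	for _, a := range addrs {
		if a.addr == (Address{}) {
			return fmt.Errorf("%s: %w", a.name, ErrZeroAddress)
		}
	}
	amounts := []struct {
		name string
		val  *big.Int
	}{{"amountIn", e.AmountIn}, {"amountOut", e.AmountOut}}
	for _, a := range amounts {
		if a.val == nil || a.val.Sign() <= 0 {
			return fmt.Errorf("%s: %w", a.name, ErrZeroAmount)
		}
	}
	return nil
}

// sampleEvent builds a well-formed event with placeholder values.
func sampleEvent() SwapEvent {
	return SwapEvent{
		Pool: Address{0x01}, TokenIn: Address{0x02}, TokenOut: Address{0x03},
		AmountIn: big.NewInt(1000), AmountOut: big.NewInt(990),
	}
}

func main() {
	ev := sampleEvent()
	fmt.Println(Validate(ev)) // <nil>

	ev.TokenOut = Address{} // simulate the V1 failure mode
	fmt.Println(Validate(ev)) // tokenOut: zero address
}
```

Returning a wrapped sentinel error keeps the rejection reason machine-checkable (via `errors.Is`) while still naming the offending field for the rejection log.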
### 2. Parsing Accuracy Issues
- **Root Cause**: Monolithic parser handling all DEX types generically
- **Impact**: Missing token data, incorrect amounts, unhandled protocol-specific edge cases
- **V2 Solution**: Per-exchange parsers with protocol-specific logic
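One possible shape for the per-exchange parser contract and its factory, sketched under assumptions: the `Parser` interface, the `Log` and `Event` types, and the protocol constants below are hypothetical, and `stubParser` stands in for real protocol-specific decoding.

```go
package main

import (
	"errors"
	"fmt"
)

// Protocol identifies a DEX type; the constant values are illustrative.
type Protocol string

const (
	UniswapV2 Protocol = "uniswap_v2"
	UniswapV3 Protocol = "uniswap_v3"
)

// Log is a simplified stand-in for an EVM log entry.
type Log struct {
	Topics []string
	Data   []byte
}

// Event is a simplified stand-in for a parsed DEX event.
type Event struct {
	Protocol Protocol
	DataLen  int
}

// Parser is the contract each per-exchange parser would satisfy.
type Parser interface {
	Protocol() Protocol
	Parse(l Log) (Event, error)
}

var ErrNoParser = errors.New("no parser registered")

// Factory routes a log to the parser registered for its protocol.
type Factory struct{ parsers map[Protocol]Parser }

func NewFactory(ps ...Parser) *Factory {
	f := &Factory{parsers: make(map[Protocol]Parser, len(ps))}
	for _, p := range ps {
		f.parsers[p.Protocol()] = p
	}
	return f
}

func (f *Factory) Parse(proto Protocol, l Log) (Event, error) {
	p, ok := f.parsers[proto]
	if !ok {
		return Event{}, fmt.Errorf("%w: %s", ErrNoParser, proto)
	}
	return p.Parse(l)
}

// stubParser stands in for real protocol-specific decoding logic.
type stubParser struct{ proto Protocol }

func (s stubParser) Protocol() Protocol { return s.proto }

func (s stubParser) Parse(l Log) (Event, error) {
	if len(l.Topics) == 0 {
		return Event{}, errors.New("log has no topics")
	}
	return Event{Protocol: s.proto, DataLen: len(l.Data)}, nil
}

func main() {
	f := NewFactory(stubParser{UniswapV2}, stubParser{UniswapV3})
	ev, err := f.Parse(UniswapV3, Log{Topics: []string{"swap"}, Data: make([]byte, 32)})
	fmt.Println(ev.Protocol, ev.DataLen, err) // uniswap_v3 32 <nil>

	_, err = f.Parse("curve", Log{})
	fmt.Println(err) // no parser registered: curve
}
```

An unregistered protocol yields an explicit error rather than a generic best-effort parse, which matches the fail-fast intent of the plan.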
### 3. No Data Quality Audit Trail
- **Root Cause**: No validation or comparison of parsed data vs cached data
- **Impact**: Silent failures, no visibility into parsing degradation
- **V2 Solution**: Background validation channel with discrepancy logging
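The background validation channel could be sketched roughly as follows. All type and method names here are hypothetical, and the non-blocking `Submit` is a design assumption: it only ever drops audit copies when the buffer is full, never the events sent to the scanner.

```go
package main

import (
	"fmt"
	"log"
)

// PoolInfo is the cached view of a pool (fields are illustrative).
type PoolInfo struct{ Token0, Token1 string }

// ParsedEvent is what a parser claims about the same pool.
type ParsedEvent struct{ Pool, Token0, Token1 string }

// BackgroundValidator compares parsed events with cached pool data off the hot path.
type BackgroundValidator struct {
	in            chan ParsedEvent
	done          chan struct{}
	cache         map[string]PoolInfo // treated as read-only in this sketch
	discrepancies []string            // written only by run, read after Close
}

func NewBackgroundValidator(cache map[string]PoolInfo, buffer int) *BackgroundValidator {
	v := &BackgroundValidator{
		in:    make(chan ParsedEvent, buffer),
		done:  make(chan struct{}),
		cache: cache,
	}
	go v.run()
	return v
}

// run drains the channel until it is closed, logging every mismatch.
func (v *BackgroundValidator) run() {
	defer close(v.done)
	for ev := range v.in {
		want, ok := v.cache[ev.Pool]
		if !ok {
			continue // pool not cached yet; nothing to compare against
		}
		if want.Token0 != ev.Token0 || want.Token1 != ev.Token1 {
			msg := fmt.Sprintf("pool %s: parsed %s/%s, cached %s/%s",
				ev.Pool, ev.Token0, ev.Token1, want.Token0, want.Token1)
			log.Println("validation discrepancy:", msg)
			v.discrepancies = append(v.discrepancies, msg)
		}
	}
}

// Submit enqueues an audit copy without blocking; false means the buffer was full.
func (v *BackgroundValidator) Submit(ev ParsedEvent) bool {
	select {
	case v.in <- ev:
		return true
	default:
		return false
	}
}

// Close stops the validator and returns every discrepancy it recorded.
func (v *BackgroundValidator) Close() []string {
	close(v.in)
	<-v.done
	return v.discrepancies
}

func main() {
	cache := map[string]PoolInfo{"0xpool": {Token0: "WETH", Token1: "USDC"}}
	v := NewBackgroundValidator(cache, 16)
	v.Submit(ParsedEvent{Pool: "0xpool", Token0: "WETH", Token1: "USDC"}) // matches
	v.Submit(ParsedEvent{Pool: "0xpool", Token0: "WETH", Token1: "DAI"})  // mismatch
	fmt.Println(len(v.Close()), "discrepancy recorded") // 1 discrepancy recorded
}
```

A production version would also count rejected submissions so that a full buffer shows up in metrics instead of silently thinning the audit trail.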
### 4. Inefficient Pool Lookups
- **Root Cause**: Single-index cache (by address only)
- **Impact**: Slow arbitrage path discovery, no ranking by liquidity
- **V2 Solution**: Multi-index cache (address, token pair, protocol, liquidity)
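To illustrate the multi-index idea, here is a small sketch in which one set of pools is reachable by address, by token pair, and by protocol, with liquidity ranking applied at query time. The `Pool` fields are assumptions, and `Liquidity` is a `uint64` only to keep the example short; on-chain liquidity values would need a wider type.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Pool is a simplified record; Liquidity is a uint64 only to keep the sketch short.
type Pool struct {
	Address, Token0, Token1, Protocol string
	Liquidity                         uint64
}

// PoolCache keeps several indexes over one shared set of pools.
type PoolCache struct {
	mu         sync.RWMutex
	byAddress  map[string]*Pool
	byPair     map[[2]string][]*Pool
	byProtocol map[string][]*Pool
}

func NewPoolCache() *PoolCache {
	return &PoolCache{
		byAddress:  make(map[string]*Pool),
		byPair:     make(map[[2]string][]*Pool),
		byProtocol: make(map[string][]*Pool),
	}
}

// pairKey orders the tokens so that (A, B) and (B, A) share one key.
func pairKey(a, b string) [2]string {
	if a > b {
		a, b = b, a
	}
	return [2]string{a, b}
}

// Add indexes p under every key and reports false for a duplicate address.
func (c *PoolCache) Add(p Pool) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	if _, dup := c.byAddress[p.Address]; dup {
		return false
	}
	c.byAddress[p.Address] = &p
	k := pairKey(p.Token0, p.Token1)
	c.byPair[k] = append(c.byPair[k], &p)
	c.byProtocol[p.Protocol] = append(c.byProtocol[p.Protocol], &p)
	return true
}

// ByAddress returns a copy of the pool at addr, if cached.
func (c *PoolCache) ByAddress(addr string) (Pool, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	p, ok := c.byAddress[addr]
	if !ok {
		return Pool{}, false
	}
	return *p, true
}

// ranked copies src and sorts the copies by liquidity, deepest first.
func ranked(src []*Pool) []Pool {
	out := make([]Pool, 0, len(src))
	for _, p := range src {
		out = append(out, *p)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Liquidity > out[j].Liquidity })
	return out
}

// ByPair returns the pools trading a token pair, in either token order.
func (c *PoolCache) ByPair(a, b string) []Pool {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return ranked(c.byPair[pairKey(a, b)])
}

// ByProtocol returns the pools belonging to one protocol.
func (c *PoolCache) ByProtocol(proto string) []Pool {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return ranked(c.byProtocol[proto])
}

func main() {
	c := NewPoolCache()
	c.Add(Pool{Address: "0xa", Token0: "WETH", Token1: "USDC", Protocol: "uniswap_v2", Liquidity: 100})
	c.Add(Pool{Address: "0xb", Token0: "USDC", Token1: "WETH", Protocol: "uniswap_v3", Liquidity: 900})
	for _, p := range c.ByPair("WETH", "USDC") {
		fmt.Println(p.Address, p.Protocol, p.Liquidity)
	}
}
```

Sorting on read keeps `Add` cheap; a dedicated liquidity index maintained on write, as the plan proposes, would trade insert cost for faster ranked lookups.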
### 5. Stats Disconnection
- **Root Cause**: Events are detected but not reflected in stats
- **Impact**: Monitoring blindness, unclear system health
- **V2 Solution**: Event-driven metrics with guaranteed consistency
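One way to keep stats from drifting is to make metrics an ordinary subscriber on the same path that delivers events downstream. The sketch below assumes a synchronous in-process bus; `Bus`, `Metrics`, and `Event` are hypothetical names, not part of the plan.

```go
package main

import (
	"fmt"
	"sync"
)

// Event is a simplified stand-in for a detected DEX event.
type Event struct{ Kind string }

// Metrics counts events by kind; it is updated only through Observe.
type Metrics struct {
	mu     sync.Mutex
	counts map[string]int
}

func NewMetrics() *Metrics { return &Metrics{counts: make(map[string]int)} }

// Observe is the subscriber callback that records one event.
func (m *Metrics) Observe(e Event) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.counts[e.Kind]++
}

// Count returns how many events of the given kind have been observed.
func (m *Metrics) Count(kind string) int {
	m.mu.Lock()
	defer m.mu.Unlock()
	return m.counts[kind]
}

// Bus delivers each published event to every subscriber before Publish returns.
type Bus struct {
	mu       sync.RWMutex
	handlers []func(Event)
}

func (b *Bus) Subscribe(h func(Event)) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.handlers = append(b.handlers, h)
}

func (b *Bus) Publish(e Event) {
	b.mu.RLock()
	defer b.mu.RUnlock()
	for _, h := range b.handlers {
		h(e)
	}
}

func main() {
	bus, metrics := &Bus{}, NewMetrics()
	var forwarded []Event
	bus.Subscribe(metrics.Observe)
	bus.Subscribe(func(e Event) { forwarded = append(forwarded, e) }) // stand-in for the scanner

	for _, k := range []string{"swap", "swap", "mint"} {
		bus.Publish(Event{Kind: k})
	}
	fmt.Println(metrics.Count("swap"), metrics.Count("mint"), len(forwarded)) // 2 1 3
}
```

Because the counter and the scanner hand-off run inside the same `Publish` call, an event cannot reach one without reaching the other, at the cost of metrics work sitting on the hot path.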
## V2 Architecture Principles
### 1. **Fail-Fast with Visibility**
- Reject invalid data immediately at source
- Log all rejections with detailed context
- Never allow garbage data to propagate
### 2. **Single Responsibility**
- One parser per exchange type
- One validator per data type
- One cache per index type
### 3. **Observable by Default**
- Every component emits metrics
- Every operation is logged
- Every error has context
### 4. **Self-Healing**
- Automatic retry with exponential backoff
- Fallback to cache when RPC fails
- Circuit breakers to stop cascading failures
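A sketch of the first two self-healing behaviours, retry with exponential backoff and fallback to cache. The function names, the attempt count, and the 10 ms base delay are illustrative assumptions; a circuit breaker is left out to keep the example short.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

var errRPCDown = errors.New("rpc unavailable") // placeholder failure for the demo

// Retry calls op up to attempts times, doubling the wait after each failure.
func Retry(attempts int, base time.Duration, op func() error) error {
	if attempts < 1 {
		attempts = 1
	}
	var err error
	delay := base
	for i := 0; i < attempts; i++ {
		if err = op(); err == nil {
			return nil
		}
		if i < attempts-1 {
			time.Sleep(delay)
			delay *= 2
		}
	}
	return fmt.Errorf("gave up after %d attempts: %w", attempts, err)
}

// FetchWithFallback tries the RPC with retries and serves the cached value if that fails.
func FetchWithFallback(fetch func() (string, error), cached string) (value string, fromCache bool) {
	err := Retry(3, 10*time.Millisecond, func() error {
		v, err := fetch()
		if err == nil {
			value = v
		}
		return err
	})
	if err != nil {
		return cached, true
	}
	return value, false
}

func main() {
	calls := 0
	flaky := func() (string, error) {
		calls++
		if calls < 3 {
			return "", errRPCDown
		}
		return "fresh", nil
	}
	fmt.Println(FetchWithFallback(flaky, "stale")) // fresh false

	down := func() (string, error) { return "", errRPCDown }
	fmt.Println(FetchWithFallback(down, "stale")) // stale true
}
```

Returning `fromCache` lets the caller log or count every fallback, which keeps the degraded mode visible rather than silent.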
### 5. **Test-Driven**
- Unit tests for every parser
- Integration tests for full pipeline
- Chaos testing for failure scenarios
## High-Level Component Architecture
```
┌─────────────────────────────────────────────────────────────┐
│                      Arbitrum Monitor                       │
│  - WebSocket subscription                                   │
│  - Transaction/receipt buffering                            │
│  - Rate limiting & connection management                    │
└────────────────┬────────────────────────────────────────────┘
                 │
                 ├─ Transactions & Receipts
                 ▼
┌─────────────────────────────────────────────────────────────┐
│                       Parser Factory                        │
│  - Route to correct parser based on protocol                │
│  - Manage parser lifecycle                                  │
└────────────────┬────────────────────────────────────────────┘
     ┌───────────┼───────────┬───────────┬───────────┐
     │           │           │           │           │
     ▼           ▼           ▼           ▼           ▼
┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐
│ Uniswap │ │ Uniswap │ │SushiSwap│ │ Camelot │ │  Curve  │
│V2 Parser│ │V3 Parser│ │ Parser  │ │ Parser  │ │ Parser  │
└────┬────┘ └────┬────┘ └────┬────┘ └────┬────┘ └────┬────┘
     │           │           │           │           │
     └───────────┴───────────┼───────────┴───────────┘
                             ▼
        ┌─────────────────────────────────────────┐
        │         Event Validation Layer          │
        │  - Check zero addresses                 │
        │  - Check zero amounts                   │
        │  - Validate against pool cache          │
        │  - Log discrepancies                    │
        └────────────────────┬────────────────────┘
                  ┌──────────┴──────────┐
                  │                     │
                  ▼                     ▼
           ┌─────────────┐     ┌─────────────────┐
           │   Scanner   │     │   Background    │
           │   (Valid    │     │   Validation    │
           │   Events)   │     │     Channel     │
           └─────────────┘     │  (Audit Trail)  │
                               └─────────────────┘
```
## V2 Directory Structure
```
mev-bot/
├── orig/ # V1 codebase preserved
│ ├── cmd/
│ ├── pkg/
│ ├── internal/
│ └── config/
├── docs/
│ └── planning/ # V2 planning documents
│ ├── 00_V2_MASTER_PLAN.md
│ ├── 01_PARSER_ARCHITECTURE.md
│ ├── 02_VALIDATION_PIPELINE.md
│ ├── 03_POOL_CACHE_SYSTEM.md
│ ├── 04_METRICS_OBSERVABILITY.md
│ ├── 05_DATA_FLOW.md
│ ├── 06_IMPLEMENTATION_PHASES.md
│ └── 07_TASK_BREAKDOWN.md
├── cmd/
│ └── mev-bot/
│ └── main.go # New V2 entry point
├── pkg/
│ ├── parsers/ # NEW: Per-exchange parsers
│ │ ├── factory.go
│ │ ├── interface.go
│ │ ├── uniswap_v2.go
│ │ ├── uniswap_v3.go
│ │ ├── sushiswap.go
│ │ ├── camelot.go
│ │ └── curve.go
│ │
│ ├── validation/ # NEW: Validation pipeline
│ │ ├── validator.go
│ │ ├── rules.go
│ │ ├── background.go
│ │ └── metrics.go
│ │
│ ├── cache/ # NEW: Multi-index cache
│ │ ├── pool_cache.go
│ │ ├── index_by_address.go
│ │ ├── index_by_tokens.go
│ │ ├── index_by_liquidity.go
│ │ └── index_by_protocol.go
│ │
│ ├── discovery/ # Pool discovery system
│ │ ├── scanner.go
│ │ ├── factory_watcher.go
│ │ └── blacklist.go
│ │
│ ├── monitor/ # Arbitrum monitoring
│ │ ├── sequencer.go
│ │ ├── connection.go
│ │ └── rate_limiter.go
│ │
│ ├── events/ # Event types and handling
│ │ ├── types.go
│ │ ├── router.go
│ │ └── processor.go
│ │
│ ├── arbitrage/ # Arbitrage detection
│ │ ├── detector.go
│ │ ├── calculator.go
│ │ └── executor.go
│ │
│ └── observability/ # NEW: Metrics & logging
│   ├── metrics.go
│   ├── logger.go
│   ├── tracing.go
│   └── health.go
├── internal/
│ ├── config/ # Configuration management
│ └── utils/ # Shared utilities
└── tests/
  ├── unit/ # Unit tests
  ├── integration/ # Integration tests
  └── e2e/ # End-to-end tests
```
## Implementation Phases
### Phase 1: Foundation (Weeks 1-2)
**Goal**: Set up V2 project structure and core interfaces
**Tasks**:
1. Create V2 directory structure
2. Define all interfaces (Parser, Validator, Cache, etc.)
3. Set up logging and metrics infrastructure
4. Create base test framework
5. Implement connection management
### Phase 2: Parser Refactor (Weeks 3-5)
**Goal**: Implement per-exchange parsers with validation
**Tasks**:
1. Create Parser interface and factory
2. Implement UniswapV2 parser with tests
3. Implement UniswapV3 parser with tests
4. Implement SushiSwap parser with tests
5. Implement Camelot parser with tests
6. Implement Curve parser with tests
7. Add strict validation layer
8. Integration testing
### Phase 3: Cache System (Weeks 6-7)
**Goal**: Multi-index pool cache with efficient lookups
**Tasks**:
1. Design cache schema
2. Implement address index
3. Implement token-pair index
4. Implement liquidity ranking index
5. Implement protocol index
6. Add cache persistence
7. Add cache invalidation logic
8. Performance testing
### Phase 4: Validation Pipeline (Weeks 8-9)
**Goal**: Background validation with audit trails
**Tasks**:
1. Create validation channel
2. Implement background validator goroutine
3. Add comparison logic (parsed vs cached)
4. Implement discrepancy logging
5. Create validation metrics
6. Add alerting for validation failures
7. Integration testing
### Phase 5: Migration & Testing (Weeks 10-12)
**Goal**: Migrate from V1 to V2, comprehensive testing
**Tasks**:
1. Create migration path
2. Run parallel systems (V1 and V2)
3. Compare outputs
4. Fix discrepancies
5. Load testing
6. Chaos testing
7. Production deployment
8. Monitoring setup
## Success Metrics
### Parsing Accuracy
- **Zero Address Rate**: < 0.01% (target: 0%)
- **Zero Amount Rate**: < 0.01% (target: 0%)
- **Validation Failure Rate**: < 0.5%
- **Cache Hit Rate**: > 95%
### Performance
- **Parse Time**: < 1ms per event (p99)
- **Cache Lookup**: < 0.1ms (p99)
- **End-to-end Latency**: < 10ms from receipt to scanner
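For checking a p99 target such as the ones above, a nearest-rank percentile over recorded latencies is one simple option. This is only a sketch: the sample values are made up, and a real deployment would more likely rely on histogram metrics than on sorting raw samples.

```go
package main

import (
	"fmt"
	"sort"
)

// Percentile returns the nearest-rank pct-th percentile (1..100) of samples.
// It returns 0 for an empty slice.
func Percentile(samples []float64, pct int) float64 {
	if len(samples) == 0 {
		return 0
	}
	sorted := append([]float64(nil), samples...) // leave the caller's slice untouched
	sort.Float64s(sorted)
	rank := (pct*len(sorted) + 99) / 100 // ceil(pct/100 * n) in integer math
	if rank < 1 {
		rank = 1
	}
	if rank > len(sorted) {
		rank = len(sorted)
	}
	return sorted[rank-1]
}

func main() {
	// Hypothetical parse latencies in milliseconds.
	latencies := []float64{0.2, 0.3, 0.25, 0.4, 0.35, 0.3, 0.28, 0.31, 0.27, 1.8}
	p99 := Percentile(latencies, 99)
	fmt.Printf("p99=%.2fms withinTarget=%v\n", p99, p99 < 1.0) // p99=1.80ms withinTarget=false
}
```

With only ten samples the p99 is simply the slowest one, which shows why a single outlier is enough to miss a p99 target on a small window.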
### Reliability
- **Uptime**: > 99.9%
- **Data Discrepancy Rate**: < 0.1%
- **Event Drop Rate**: 0%
### Observability
- **All Events Logged**: 100%
- **All Rejections Logged**: 100%
- **Metrics Coverage**: 100% of components
## Risk Mitigation
### Risk: Breaking Changes During Migration
**Mitigation**:
- Run V1 and V2 in parallel
- Compare outputs
- Gradual rollout with feature flags
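The output comparison for a parallel V1/V2 run could be as simple as a keyed diff. In this sketch each run's output is assumed to be a map from transaction hash to some canonical string encoding of the parsed event; that encoding, and the `Report` type, are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// Report lists how two runs differ, keyed by transaction hash.
type Report struct {
	OnlyV1, OnlyV2, Mismatched []string
}

// Compare diffs V1 and V2 outputs; each map goes from tx hash to a
// canonical string form of the parsed event (an assumed encoding).
func Compare(v1, v2 map[string]string) Report {
	var r Report
	for hash, a := range v1 {
		b, ok := v2[hash]
		switch {
		case !ok:
			r.OnlyV1 = append(r.OnlyV1, hash)
		case a != b:
			r.Mismatched = append(r.Mismatched, hash)
		}
	}
	for hash := range v2 {
		if _, ok := v1[hash]; !ok {
			r.OnlyV2 = append(r.OnlyV2, hash)
		}
	}
	// Sort so that reports are stable despite random map iteration order.
	sort.Strings(r.OnlyV1)
	sort.Strings(r.OnlyV2)
	sort.Strings(r.Mismatched)
	return r
}

func main() {
	v1 := map[string]string{"0x01": "swap:100", "0x02": "swap:0", "0x03": "swap:7"}
	v2 := map[string]string{"0x01": "swap:100", "0x02": "swap:55", "0x04": "swap:9"}
	r := Compare(v1, v2)
	fmt.Println("only V1:", r.OnlyV1)       // only V1: [0x03]
	fmt.Println("only V2:", r.OnlyV2)       // only V2: [0x04]
	fmt.Println("mismatched:", r.Mismatched) // mismatched: [0x02]
}
```

A mismatch is not automatically a V2 bug: where V1 emitted a zero amount, the V2 value may be the corrected one, so each entry still needs to be judged against on-chain data.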
### Risk: Performance Degradation
**Mitigation**:
- Comprehensive benchmarking
- Load testing before deployment
- Circuit breakers to stop cascading failures
### Risk: Incomplete Test Coverage
**Mitigation**:
- TDD approach for all new code
- Minimum 90% test coverage requirement
- Integration and E2E tests mandatory
### Risk: Data Quality Regression
**Mitigation**:
- Continuous validation against Arbiscan
- Alerting on validation failures
- Automated rollback on critical issues
## Next Steps
1. Review and approve this master plan
2. Read detailed component plans in subsequent documents
3. Review task breakdown in `07_TASK_BREAKDOWN.md`
4. Begin Phase 1 implementation
---
**Document Status**: Draft for Review
**Created**: 2025-11-10
**Last Updated**: 2025-11-10
**Version**: 1.0
