Administrator 803de231ba feat: create v2-prep branch with comprehensive planning

Restructured project for V2 refactor:

**Structure Changes:**
- Moved all V1 code to orig/ folder (preserved with git mv)
- Created docs/planning/ directory
- Added orig/README_V1.md explaining V1 preservation

**Planning Documents:**
- 00_V2_MASTER_PLAN.md: Complete architecture overview
  - Executive summary of critical V1 issues
  - High-level component architecture diagrams
  - 5-phase implementation roadmap
  - Success metrics and risk mitigation

- 07_TASK_BREAKDOWN.md: Atomic task breakdown
  - 99+ hours of detailed tasks
  - Every task < 2 hours (atomic)
  - Clear dependencies and success criteria
  - Organized by implementation phase

**V2 Key Improvements:**
- Per-exchange parsers (factory pattern)
- Multi-layer strict validation
- Multi-index pool cache
- Background validation pipeline
- Comprehensive observability

**Critical Issues Addressed:**
- Zero address tokens (strict validation + cache enrichment)
- Parsing accuracy (protocol-specific parsers)
- No audit trail (background validation channel)
- Inefficient lookups (multi-index cache)
- Stats disconnection (event-driven metrics)

Next Steps:
1. Review planning documents
2. Begin Phase 1: Foundation (P1-001 through P1-010)
3. Implement parsers in Phase 2
4. Build cache system in Phase 3
5. Add validation pipeline in Phase 4
6. Migrate and test in Phase 5

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

2025-11-10 10:14:26 +01:00

MEV Bot V2 - Master Architecture Plan

Executive Summary

V2 represents a complete architectural overhaul addressing critical parsing, validation, and scalability issues identified in V1. The rebuild focuses on:

Zero Tolerance for Invalid Data: Eliminate all zero addresses and zero amounts
Per-Exchange Parser Architecture: Individual parsers for each DEX type
Real-time Validation Pipeline: Background validation with audit trails
Scalable Pool Discovery: Efficient caching and multi-index lookups
Observable System: Comprehensive metrics, logging, and health monitoring

Critical Issues from V1

1. Zero Address/Amount Problems

Root Cause: Parser returns zero addresses when transaction data is unavailable
Impact: Invalid events submitted to scanner, wasted computation
V2 Solution: Strict validation at multiple layers + pool cache enrichment
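
The strict-validation idea can be illustrated with a minimal sketch. It assumes a hypothetical SwapEvent type; the field names, the error values, and ValidateSwap itself are illustrative and not taken from the V2 code. Pool cache enrichment is not shown.

```go
package main

import (
	"errors"
	"fmt"
	"math/big"
)

// Address is a 20-byte EVM address; its zero value is the zero address.
type Address [20]byte

// SwapEvent is a hypothetical parsed event (field names are assumptions).
type SwapEvent struct {
	Pool, Token0, Token1 Address
	AmountIn, AmountOut  *big.Int
}

var (
	ErrZeroAddress = errors.New("zero address")
	ErrZeroAmount  = errors.New("zero or missing amount")
)

// ValidateSwap rejects an event if any address is zero or any amount is
// nil or zero. The returned error names the first offending field.
func ValidateSwap(e SwapEvent) error {
	addrs := []struct {
		name string
		addr Address
	}{{"pool", e.Pool}, {"token0", e.Token0}, {"token1", e.Token1}}
	for _, a := range addrs {
		if a.addr == (Address{}) {
			return fmt.Errorf("%s: %w", a.name, ErrZeroAddress)
		}
	}
	amounts := []struct {
		name string
		amt  *big.Int
	}{{"amountIn", e.AmountIn}, {"amountOut", e.AmountOut}}
	for _, a := range amounts {
		if a.amt == nil || a.amt.Sign() == 0 {
			return fmt.Errorf("%s: %w", a.name, ErrZeroAmount)
		}
	}
	return nil
}

// sampleEvent builds a well-formed event for the demonstration below.
func sampleEvent() SwapEvent {
	return SwapEvent{
		Pool: Address{0x01}, Token0: Address{0x02}, Token1: Address{0x03},
		AmountIn: big.NewInt(1000), AmountOut: big.NewInt(990),
	}
}

func main() {
	e := sampleEvent()
	fmt.Println("valid event:", ValidateSwap(e))

	e.Token1 = Address{} // simulate the V1 failure mode
	fmt.Println("zero token1:", ValidateSwap(e))
}
```

Wrapping the sentinel errors with the field name keeps each rejection log specific while still letting callers classify failures with errors.Is.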

2. Parsing Accuracy Issues

Root Cause: Monolithic parser handling all DEX types generically
Impact: Missing token data, incorrect amounts, protocol-specific edge cases
V2 Solution: Per-exchange parsers with protocol-specific logic
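
One way the per-exchange design might look is sketched below. The Parser interface, Factory type, and RawLog stand-in are assumptions made for illustration. The only protocol facts used are the data lengths of the Swap events: four 32-byte words for Uniswap V2 and five for Uniswap V3. A real parser would decode the words rather than only check the length.

```go
package main

import (
	"errors"
	"fmt"
)

// Protocol identifies a DEX type. The string values are illustrative.
type Protocol string

const (
	UniswapV2 Protocol = "uniswap_v2"
	UniswapV3 Protocol = "uniswap_v3"
)

// RawLog is a simplified stand-in for an EVM log's non-indexed data.
type RawLog struct{ Data []byte }

// Event is a placeholder for the parsed result.
type Event struct {
	Protocol Protocol
	Words    int // number of 32-byte words found in the log data
}

// Parser is a hypothetical per-exchange interface.
type Parser interface {
	Protocol() Protocol
	Parse(log RawLog) (Event, error)
}

// fixedLayoutParser accepts only logs whose data has the exact length of
// its protocol's Swap event. It stands in for real per-protocol decoders.
type fixedLayoutParser struct {
	protocol Protocol
	dataLen  int
}

func (p fixedLayoutParser) Protocol() Protocol { return p.protocol }

func (p fixedLayoutParser) Parse(log RawLog) (Event, error) {
	if len(log.Data) != p.dataLen {
		return Event{}, fmt.Errorf("%s: want %d data bytes, got %d",
			p.protocol, p.dataLen, len(log.Data))
	}
	return Event{Protocol: p.protocol, Words: p.dataLen / 32}, nil
}

func NewUniswapV2Parser() Parser { return fixedLayoutParser{UniswapV2, 4 * 32} }
func NewUniswapV3Parser() Parser { return fixedLayoutParser{UniswapV3, 5 * 32} }

var ErrUnknownProtocol = errors.New("no parser registered")

// Factory routes a protocol to its registered parser.
type Factory struct{ parsers map[Protocol]Parser }

func NewFactory(ps ...Parser) *Factory {
	f := &Factory{parsers: make(map[Protocol]Parser, len(ps))}
	for _, p := range ps {
		f.parsers[p.Protocol()] = p
	}
	return f
}

func (f *Factory) For(p Protocol) (Parser, error) {
	parser, ok := f.parsers[p]
	if !ok {
		return nil, fmt.Errorf("%q: %w", p, ErrUnknownProtocol)
	}
	return parser, nil
}

func main() {
	f := NewFactory(NewUniswapV2Parser(), NewUniswapV3Parser())
	p, err := f.For(UniswapV3)
	if err != nil {
		fmt.Println(err)
		return
	}
	ev, err := p.Parse(RawLog{Data: make([]byte, 160)})
	fmt.Println(ev, err)

	_, err = f.For("curve") // not registered in this sketch
	fmt.Println(err)
}
```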

3. No Data Quality Audit Trail

Root Cause: Parsed data is never validated against, or compared with, cached data
Impact: Silent failures, no visibility into parsing degradation
V2 Solution: Background validation channel with discrepancy logging
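
A background validation channel could be sketched roughly as follows. All type and function names here are hypothetical, token addresses are plain strings for brevity, and discrepancies are collected in memory where a real system would presumably log them.

```go
package main

import "fmt"

type ParsedEvent struct{ Pool, Token0, Token1 string }
type CachedPool struct{ Token0, Token1 string }
type Discrepancy struct{ Pool, Field, Parsed, Cached string }

// compare returns one Discrepancy per token field that differs.
func compare(e ParsedEvent, c CachedPool) []Discrepancy {
	var out []Discrepancy
	if e.Token0 != c.Token0 {
		out = append(out, Discrepancy{e.Pool, "token0", e.Token0, c.Token0})
	}
	if e.Token1 != c.Token1 {
		out = append(out, Discrepancy{e.Pool, "token1", e.Token1, c.Token1})
	}
	return out
}

// BackgroundValidator compares parsed events with cached pool data on a
// separate goroutine so the main pipeline is not slowed down.
type BackgroundValidator struct {
	in    chan ParsedEvent
	cache map[string]CachedPool // treated as read-only after construction
	done  chan struct{}
	found []Discrepancy // written only by the worker goroutine
}

func NewBackgroundValidator(cache map[string]CachedPool, buffer int) *BackgroundValidator {
	v := &BackgroundValidator{
		in:    make(chan ParsedEvent, buffer),
		cache: cache,
		done:  make(chan struct{}),
	}
	go v.run()
	return v
}

func (v *BackgroundValidator) run() {
	defer close(v.done)
	for e := range v.in {
		if pool, ok := v.cache[e.Pool]; ok {
			v.found = append(v.found, compare(e, pool)...)
		}
	}
}

// Submit never blocks. It returns false when the buffer is full so the
// caller can count the skipped audit sample.
func (v *BackgroundValidator) Submit(e ParsedEvent) bool {
	select {
	case v.in <- e:
		return true
	default:
		return false
	}
}

// Close stops the worker, waits for it to drain the channel, and returns
// every discrepancy it recorded.
func (v *BackgroundValidator) Close() []Discrepancy {
	close(v.in)
	<-v.done
	return v.found
}

func main() {
	cache := map[string]CachedPool{"0xpool": {Token0: "0xA", Token1: "0xB"}}
	v := NewBackgroundValidator(cache, 16)
	v.Submit(ParsedEvent{Pool: "0xpool", Token0: "0xA", Token1: "0xB"})
	v.Submit(ParsedEvent{Pool: "0xpool", Token0: "0xA", Token1: "0xC"})
	for _, d := range v.Close() {
		fmt.Printf("%s %s: parsed=%s cached=%s\n", d.Pool, d.Field, d.Parsed, d.Cached)
	}
}
```

The non-blocking Submit is a design choice in this sketch: it affects only the audit path, not events delivered to the scanner. A blocking send would trade hot-path latency for complete audit coverage.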

4. Inefficient Pool Lookups

Root Cause: Single-index cache (by address only)
Impact: Slow arbitrage path discovery, no ranking by liquidity
V2 Solution: Multi-index cache (address, token pair, protocol, liquidity)
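
A multi-index cache could be structured along these lines. This is a sketch under several assumptions: the Pool fields and method names are hypothetical, liquidity is narrowed to uint64, and liquidity ranking is done by sorting on read rather than through a dedicated index.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Pool is a hypothetical cache entry.
type Pool struct {
	Address, Token0, Token1, Protocol string
	Liquidity                         uint64 // simplified for the sketch
}

// pairKey is order-independent: (A,B) and (B,A) map to the same key.
type pairKey struct{ lo, hi string }

func newPairKey(a, b string) pairKey {
	if a > b {
		a, b = b, a
	}
	return pairKey{a, b}
}

// PoolCache keeps one primary index and secondary indexes of addresses.
type PoolCache struct {
	mu         sync.RWMutex
	byAddress  map[string]Pool
	byPair     map[pairKey][]string
	byProtocol map[string][]string
}

func NewPoolCache() *PoolCache {
	return &PoolCache{
		byAddress:  make(map[string]Pool),
		byPair:     make(map[pairKey][]string),
		byProtocol: make(map[string][]string),
	}
}

// Add inserts a pool into every index. It returns false if the address
// is already cached (updates are out of scope for this sketch).
func (c *PoolCache) Add(p Pool) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	if _, exists := c.byAddress[p.Address]; exists {
		return false
	}
	c.byAddress[p.Address] = p
	k := newPairKey(p.Token0, p.Token1)
	c.byPair[k] = append(c.byPair[k], p.Address)
	c.byProtocol[p.Protocol] = append(c.byProtocol[p.Protocol], p.Address)
	return true
}

func (c *PoolCache) ByAddress(addr string) (Pool, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	p, ok := c.byAddress[addr]
	return p, ok
}

// resolve maps addresses to pools, highest liquidity first.
// The caller must hold at least the read lock.
func (c *PoolCache) resolve(addrs []string) []Pool {
	out := make([]Pool, 0, len(addrs))
	for _, a := range addrs {
		out = append(out, c.byAddress[a])
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Liquidity > out[j].Liquidity })
	return out
}

func (c *PoolCache) ByPair(t0, t1 string) []Pool {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return c.resolve(c.byPair[newPairKey(t0, t1)])
}

func (c *PoolCache) ByProtocol(proto string) []Pool {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return c.resolve(c.byProtocol[proto])
}

func main() {
	c := NewPoolCache()
	c.Add(Pool{"0x1", "WETH", "USDC", "uniswap_v2", 10})
	c.Add(Pool{"0x2", "USDC", "WETH", "uniswap_v3", 50})
	for _, p := range c.ByPair("WETH", "USDC") {
		fmt.Println(p.Address, p.Protocol, p.Liquidity)
	}
}
```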

5. Stats Disconnection

Root Cause: Events are detected but not reflected in stats
Impact: Monitoring blindness, unclear system health
V2 Solution: Event-driven metrics with guaranteed consistency
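
One simple reading of event-driven metrics is that counters change in exactly one place, called once per event, so the totals cannot drift apart. The sketch below assumes that reading; the Stats, Outcome, and Snapshot names are hypothetical, and atomic.Uint64 requires Go 1.19 or later.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Outcome is the final disposition of a detected event.
type Outcome int

const (
	Accepted Outcome = iota
	Rejected
)

// Stats holds counters that are only ever changed through Record.
type Stats struct {
	detected, accepted, rejected atomic.Uint64
}

// Record is called once per event. Every event increments detected and
// exactly one of accepted or rejected.
func (s *Stats) Record(o Outcome) {
	s.detected.Add(1)
	switch o {
	case Accepted:
		s.accepted.Add(1)
	case Rejected:
		s.rejected.Add(1)
	}
}

// Snapshot is a point-in-time copy of the counters.
type Snapshot struct{ Detected, Accepted, Rejected uint64 }

// Snapshot reads each counter atomically. The three reads are not one
// atomic unit, so a snapshot taken during heavy traffic may lag slightly.
func (s *Stats) Snapshot() Snapshot {
	return Snapshot{
		Detected: s.detected.Load(),
		Accepted: s.accepted.Load(),
		Rejected: s.rejected.Load(),
	}
}

func main() {
	stats := &Stats{}
	for i := 0; i < 3; i++ {
		stats.Record(Accepted)
	}
	stats.Record(Rejected)
	fmt.Printf("%+v\n", stats.Snapshot())
}
```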

V2 Architecture Principles

1. Fail-Fast with Visibility

Reject invalid data immediately at source
Log all rejections with detailed context
Never allow garbage data to propagate

2. Single Responsibility

One parser per exchange type
One validator per data type
One cache per index type

3. Observable by Default

Every component emits metrics
Every operation is logged
Every error has context

4. Self-Healing

Automatic retry with exponential backoff
Fallback to cache when RPC fails
Circuit breakers for cascading failures
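
The first two self-healing behaviours can be sketched together. The function names, the retry count, and the one-millisecond base delay are assumptions chosen for a quick demonstration; production code would likely use longer delays, jitter, and context cancellation. Circuit breakers are not shown.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// baseDelay is deliberately short so the demonstration finishes quickly.
const baseDelay = time.Millisecond

var errTransient = errors.New("transient RPC failure")

// Retry calls fn up to attempts times, doubling the delay after each
// failure. It returns nil on the first success, or the last error.
func Retry(attempts int, base time.Duration, fn func() error) error {
	if attempts < 1 {
		attempts = 1
	}
	var err error
	delay := base
	for i := 0; i < attempts; i++ {
		if err = fn(); err == nil {
			return nil
		}
		if i < attempts-1 {
			time.Sleep(delay)
			delay *= 2
		}
	}
	return err
}

// FetchWithFallback tries the RPC call with retries and falls back to
// the cached value if every attempt fails.
func FetchWithFallback(rpc func() (uint64, error), cached uint64) (value uint64, fromCache bool) {
	var v uint64
	err := Retry(3, baseDelay, func() error {
		var callErr error
		v, callErr = rpc()
		return callErr
	})
	if err != nil {
		return cached, true
	}
	return v, false
}

func main() {
	calls := 0
	flaky := func() (uint64, error) {
		calls++
		if calls < 3 {
			return 0, errTransient
		}
		return 1234, nil
	}
	v, fromCache := FetchWithFallback(flaky, 1000)
	fmt.Println(v, fromCache, "after", calls, "calls")

	down := func() (uint64, error) { return 0, errTransient }
	v, fromCache = FetchWithFallback(down, 1000)
	fmt.Println(v, fromCache)
}
```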

5. Test-Driven

Unit tests for every parser
Integration tests for full pipeline
Chaos testing for failure scenarios

High-Level Component Architecture

┌─────────────────────────────────────────────────────────────┐
│                     Arbitrum Monitor                         │
│  - WebSocket subscription                                   │
│  - Transaction/receipt buffering                            │
│  - Rate limiting & connection management                    │
└───────────────┬─────────────────────────────────────────────┘
                │
                ├─ Transactions & Receipts
                │
                ▼
┌─────────────────────────────────────────────────────────────┐
│                  Parser Factory                              │
│  - Route to correct parser based on protocol                │
│  - Manage parser lifecycle                                  │
└───────────────┬─────────────────────────────────────────────┘
                │
     ┌──────────┼──────────┬──────────┬──────────┐
     │          │          │          │          │
     ▼          ▼          ▼          ▼          ▼
┌─────────┐ ┌──────────┐ ┌─────────┐ ┌──────────┐ ┌────────┐
│Uniswap  │ │Uniswap   │ │SushiSwap│ │ Camelot  │ │ Curve  │
│V2 Parser│ │V3 Parser │ │ Parser  │ │  Parser  │ │ Parser │
└────┬────┘ └────┬─────┘ └────┬────┘ └────┬─────┘ └───┬────┘
     │           │            │           │           │
     └───────────┴────────────┴───────────┴───────────┘
                              │
                              ▼
         ┌────────────────────────────────────────┐
         │        Event Validation Layer          │
         │  - Check zero addresses                │
         │  - Check zero amounts                  │
         │  - Validate against pool cache         │
         │  - Log discrepancies                   │
         └────────────┬───────────────────────────┘
                      │
           ┌──────────┴──────────┐
           │                     │
           ▼                     ▼
    ┌─────────────┐      ┌──────────────────┐
    │   Scanner   │      │ Background       │
    │   (Valid    │      │ Validation       │
    │   Events)   │      │ Channel          │
    └─────────────┘      │ (Audit Trail)    │
                         └──────────────────┘

V2 Directory Structure

mev-bot/
├── orig/                          # V1 codebase preserved
│   ├── cmd/
│   ├── pkg/
│   ├── internal/
│   └── config/
│
├── docs/
│   └── planning/                  # V2 planning documents
│       ├── 00_V2_MASTER_PLAN.md
│       ├── 01_PARSER_ARCHITECTURE.md
│       ├── 02_VALIDATION_PIPELINE.md
│       ├── 03_POOL_CACHE_SYSTEM.md
│       ├── 04_METRICS_OBSERVABILITY.md
│       ├── 05_DATA_FLOW.md
│       ├── 06_IMPLEMENTATION_PHASES.md
│       └── 07_TASK_BREAKDOWN.md
│
├── cmd/
│   └── mev-bot/
│       └── main.go                # New V2 entry point
│
├── pkg/
│   ├── parsers/                   # NEW: Per-exchange parsers
│   │   ├── factory.go
│   │   ├── interface.go
│   │   ├── uniswap_v2.go
│   │   ├── uniswap_v3.go
│   │   ├── sushiswap.go
│   │   ├── camelot.go
│   │   └── curve.go
│   │
│   ├── validation/                # NEW: Validation pipeline
│   │   ├── validator.go
│   │   ├── rules.go
│   │   ├── background.go
│   │   └── metrics.go
│   │
│   ├── cache/                     # NEW: Multi-index cache
│   │   ├── pool_cache.go
│   │   ├── index_by_address.go
│   │   ├── index_by_tokens.go
│   │   ├── index_by_liquidity.go
│   │   └── index_by_protocol.go
│   │
│   ├── discovery/                 # Pool discovery system
│   │   ├── scanner.go
│   │   ├── factory_watcher.go
│   │   └── blacklist.go
│   │
│   ├── monitor/                   # Arbitrum monitoring
│   │   ├── sequencer.go
│   │   ├── connection.go
│   │   └── rate_limiter.go
│   │
│   ├── events/                    # Event types and handling
│   │   ├── types.go
│   │   ├── router.go
│   │   └── processor.go
│   │
│   ├── arbitrage/                 # Arbitrage detection
│   │   ├── detector.go
│   │   ├── calculator.go
│   │   └── executor.go
│   │
│   └── observability/             # NEW: Metrics & logging
│       ├── metrics.go
│       ├── logger.go
│       ├── tracing.go
│       └── health.go
│
├── internal/
│   ├── config/                    # Configuration management
│   └── utils/                     # Shared utilities
│
└── tests/
    ├── unit/                      # Unit tests
    ├── integration/               # Integration tests
    └── e2e/                       # End-to-end tests

Implementation Phases

Phase 1: Foundation (Weeks 1-2)

Goal: Set up V2 project structure and core interfaces

Tasks:

Create V2 directory structure
Define all interfaces (Parser, Validator, Cache, etc.)
Set up logging and metrics infrastructure
Create base test framework
Implement connection management

Phase 2: Parser Refactor (Weeks 3-5)

Goal: Implement per-exchange parsers with validation

Tasks:

Create Parser interface and factory
Implement UniswapV2 parser with tests
Implement UniswapV3 parser with tests
Implement SushiSwap parser with tests
Implement Camelot parser with tests
Implement Curve parser with tests
Add strict validation layer
Integration testing

Phase 3: Cache System (Weeks 6-7)

Goal: Multi-index pool cache with efficient lookups

Tasks:

Design cache schema
Implement address index
Implement token-pair index
Implement liquidity ranking index
Implement protocol index
Add cache persistence
Add cache invalidation logic
Performance testing

Phase 4: Validation Pipeline (Weeks 8-9)

Goal: Background validation with audit trails

Tasks:

Create validation channel
Implement background validator goroutine
Add comparison logic (parsed vs cached)
Implement discrepancy logging
Create validation metrics
Add alerting for validation failures
Integration testing

Phase 5: Migration & Testing (Weeks 10-12)

Goal: Migrate from V1 to V2, comprehensive testing

Tasks:

Create migration path
Run parallel systems (V1 and V2)
Compare outputs
Fix discrepancies
Load testing
Chaos testing
Production deployment
Monitoring setup

Success Metrics

Parsing Accuracy

Zero Address Rate: < 0.01% (target: 0%)
Zero Amount Rate: < 0.01% (target: 0%)
Validation Failure Rate: < 0.5%
Cache Hit Rate: > 95%

Performance

Parse Time: < 1ms per event (p99)
Cache Lookup: < 0.1ms (p99)
End-to-end Latency: < 10ms from receipt to scanner
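
To check a p99 target such as the ones above, the percentile has to be computed from recorded samples. The sketch below uses the nearest-rank definition, which is one of several common conventions; the function names and the use of float64 milliseconds are assumptions.

```go
package main

import (
	"fmt"
	"sort"
)

// P99 returns the 99th percentile of samples using the nearest-rank
// method. It returns 0 for an empty slice and does not modify its input.
func P99(samples []float64) float64 {
	n := len(samples)
	if n == 0 {
		return 0
	}
	sorted := append([]float64(nil), samples...)
	sort.Float64s(sorted)
	rank := (n*99 + 99) / 100 // ceil(0.99 * n) in integer arithmetic
	return sorted[rank-1]
}

// MeetsTarget reports whether the p99 of samples is strictly below target.
func MeetsTarget(samples []float64, target float64) bool {
	return P99(samples) < target
}

func main() {
	// Hypothetical parse times in milliseconds.
	parseMs := []float64{0.21, 0.35, 0.18, 0.40, 0.95, 0.30, 0.27, 0.33}
	fmt.Println("p99 (ms):", P99(parseMs))
	fmt.Println("meets < 1ms target:", MeetsTarget(parseMs, 1.0))
}
```

With few samples the nearest-rank p99 is simply the maximum, so a target check is only meaningful once a reasonable number of events has been recorded.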

Reliability

Uptime: > 99.9%
Data Discrepancy Rate: < 0.1%
Event Drop Rate: 0%

Observability

All Events Logged: 100%
All Rejections Logged: 100%
Metrics Coverage: 100% of components

Risk Mitigation

Risk: Breaking Changes During Migration

Mitigation:

Run V1 and V2 in parallel
Compare outputs
Gradual rollout with feature flags
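
The three mitigations above can be combined in a shadow-comparison router: both versions run on every input, their outputs are compared, and a percentage flag decides which result is used. This is a sketch only; Handler, UseV2, and Route are hypothetical names, and hashing a key with FNV-1a is just one way to get a stable rollout bucket.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// UseV2 reports whether key falls inside the rollout percentage. The
// same key always gets the same answer for a given percentage.
func UseV2(key string, percent uint32) bool {
	h := fnv.New32a()
	h.Write([]byte(key)) // Write on a hash.Hash never returns an error
	return h.Sum32()%100 < percent
}

// Handler stands in for a V1 or V2 processing function.
type Handler func(input string) string

// Route runs both handlers, reports whether their outputs match, and
// returns the output selected by the rollout flag.
func Route(key, input string, percent uint32, v1, v2 Handler) (result string, match bool) {
	out1, out2 := v1(input), v2(input)
	match = out1 == out2
	if UseV2(key, percent) {
		return out2, match
	}
	return out1, match
}

func main() {
	v1 := func(s string) string { return "v1:" + s }
	v2 := func(s string) string { return "v2:" + s }

	for _, pct := range []uint32{0, 100} {
		result, match := Route("0xabc", "swap", pct, v1, v2)
		fmt.Printf("percent=%d result=%s match=%v\n", pct, result, match)
	}
}
```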

Risk: Performance Degradation

Mitigation:

Comprehensive benchmarking
Load testing before deployment
Circuit breakers for cascading failures

Risk: Incomplete Test Coverage

Mitigation:

TDD approach for all new code
Minimum 90% test coverage requirement
Integration and E2E tests mandatory

Risk: Data Quality Regression

Mitigation:

Continuous validation against Arbiscan
Alerting on validation failures
Automated rollback on critical issues

Next Steps

Review and approve this master plan
Read detailed component plans in subsequent documents
Review task breakdown in 07_TASK_BREAKDOWN.md
Begin Phase 1 implementation

Document Status: Draft for Review
Created: 2025-11-10
Last Updated: 2025-11-10
Version: 1.0
