
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Status: V2 Foundation Complete ✅

This repository has completed the **V2 foundation implementation** with 100% test coverage. The V1 codebase has been moved to `orig/` and is preserved there for reference.

**Current State:**
- V1 implementation: `orig/` (frozen for reference)
- V2 planning documents: `docs/planning/` (complete)
- V2 foundation: `pkg/` (✅ complete with 100% test coverage)
- Active development: Ready for protocol parser implementations

## Repository Structure

```
mev-bot/
├── docs/
│   └── planning/              # V2 architecture and task breakdown
│       ├── 00_V2_MASTER_PLAN.md
│       └── 07_TASK_BREAKDOWN.md
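│
├── pkg/                       # V2 foundation (complete, 100% test coverage)
│   ├── types/                 # Core types (SwapEvent, PoolInfo, errors)
│   ├── parsers/               # Parser factory with routing
│   ├── cache/                 # Multi-index pool cache
│   ├── validation/            # Validation pipeline with rules
│   └── observability/         # Logging and Prometheus metrics
│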
└── orig/                      # V1 codebase (preserved)
    ├── cmd/mev-bot/          # V1 application entry point
    ├── pkg/                  # V1 library code
    │   ├── events/           # Event parsing (monolithic)
    │   ├── monitor/          # Arbitrum sequencer monitoring
    │   ├── scanner/          # Arbitrage scanning
    │   ├── arbitrage/        # Arbitrage detection
    │   ├── market/           # Market data management
    │   └── pools/            # Pool discovery
    ├── internal/             # V1 private code
    ├── config/               # V1 configuration
    ├── go.mod                # V1 dependencies
    └── README_V1.md          # V1 documentation
```

## V1 Reference (orig/)

### Building and Running V1
```bash
cd orig/
go build -o ../bin/mev-bot-v1 ./cmd/mev-bot
../bin/mev-bot-v1 start
```

### V1 Architecture Overview
- **Monolithic parser**: Single parser handling all DEX types
- **Basic validation**: Limited validation of parsed data
- **Single-index cache**: Pool cache by address only
- **Event-driven**: Real-time Arbitrum sequencer monitoring

### Critical V1 Issues (driving V2 refactor)
1. **Zero address tokens**: Parser returns zero addresses when transaction calldata unavailable
2. **Parsing accuracy**: Generic parser misses protocol-specific edge cases
3. **No validation audit trail**: Silent failures, no discrepancy logging
4. **Inefficient lookups**: Single-index cache, no liquidity ranking
5. **Stats disconnection**: Events detected but not reflected in metrics

See `orig/README_V1.md` for complete V1 documentation.

## V2 Architecture Plan

### Key Improvements
1. **Per-exchange parsers**: Individual parsers for UniswapV2, UniswapV3, SushiSwap, Camelot, Curve
2. **Multi-layer validation**: Strict validation at parser, monitor, and scanner layers
3. **Multi-index cache**: Lookups by address, token pair, protocol, and liquidity
4. **Background validation**: Audit trail comparing parsed vs cached data
5. **Observable by default**: Comprehensive metrics, structured logging, health monitoring
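
The background validation idea (item 4) can be sketched as a buffered channel drained by a single worker that compares each parsed event against cached pool data and records mismatches. This is a minimal illustration, not the code in `pkg/validation/`: the names `ParsedPool`, `Discrepancy`, and `BackgroundValidator` are hypothetical, and a plain map stands in for the pool cache.

```go
package main

import (
	"fmt"
	"sync"
)

// ParsedPool is a hypothetical, simplified view of what a parser extracted.
type ParsedPool struct {
	Pool   string // pool address (hex string for brevity)
	Token0 string
	Token1 string
}

// Discrepancy records one mismatch between parsed and cached data.
type Discrepancy struct {
	Pool, Field, Parsed, Cached string
}

// BackgroundValidator checks parsed events off the hot path and keeps
// an audit trail of every mismatch it finds.
type BackgroundValidator struct {
	cached map[string]ParsedPool // stand-in for the real pool cache
	events chan ParsedPool
	wg     sync.WaitGroup
	mu     sync.Mutex
	audit  []Discrepancy
}

func NewBackgroundValidator(cached map[string]ParsedPool, buffer int) *BackgroundValidator {
	v := &BackgroundValidator{cached: cached, events: make(chan ParsedPool, buffer)}
	v.wg.Add(1)
	go v.run()
	return v
}

// Submit enqueues an event without blocking; it reports false if the buffer is full.
func (v *BackgroundValidator) Submit(e ParsedPool) bool {
	select {
	case v.events <- e:
		return true
	default:
		return false
	}
}

func (v *BackgroundValidator) run() {
	defer v.wg.Done()
	for e := range v.events {
		c, ok := v.cached[e.Pool]
		if !ok {
			v.record(Discrepancy{Pool: e.Pool, Field: "pool", Parsed: "present", Cached: "missing"})
			continue
		}
		if e.Token0 != c.Token0 {
			v.record(Discrepancy{Pool: e.Pool, Field: "token0", Parsed: e.Token0, Cached: c.Token0})
		}
		if e.Token1 != c.Token1 {
			v.record(Discrepancy{Pool: e.Pool, Field: "token1", Parsed: e.Token1, Cached: c.Token1})
		}
	}
}

func (v *BackgroundValidator) record(d Discrepancy) {
	v.mu.Lock()
	defer v.mu.Unlock()
	v.audit = append(v.audit, d)
}

// Close stops accepting events, drains the queue, and returns a copy of the audit trail.
func (v *BackgroundValidator) Close() []Discrepancy {
	close(v.events)
	v.wg.Wait()
	v.mu.Lock()
	defer v.mu.Unlock()
	return append([]Discrepancy(nil), v.audit...)
}

func main() {
	cached := map[string]ParsedPool{
		"0xpool": {Pool: "0xpool", Token0: "0xaaa", Token1: "0xbbb"},
	}
	v := NewBackgroundValidator(cached, 16)
	v.Submit(ParsedPool{Pool: "0xpool", Token0: "0xaaa", Token1: "0xbbb"}) // matches the cache
	v.Submit(ParsedPool{Pool: "0xpool", Token0: "0x000", Token1: "0xbbb"}) // token0 mismatch
	for _, d := range v.Close() {
		fmt.Printf("discrepancy pool=%s field=%s parsed=%s cached=%s\n", d.Pool, d.Field, d.Parsed, d.Cached)
	}
}
```

The non-blocking `Submit` is a deliberate choice in this sketch: a full buffer drops the audit sample rather than stalling the parser, and the caller can count the `false` returns as a metric.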

### V2 Directory Structure (planned)
```
pkg/
├── parsers/              # Per-exchange parser implementations
│   ├── factory.go       # Parser factory pattern
│   ├── interface.go     # Parser interface definition
│   ├── uniswap_v2.go   # UniswapV2-specific parser
│   ├── uniswap_v3.go   # UniswapV3-specific parser
│   └── ...
├── validation/          # Validation pipeline
│   ├── validator.go    # Event validator
│   ├── rules.go        # Validation rules
│   └── background.go   # Background validation channel
├── cache/               # Multi-index pool cache
│   ├── pool_cache.go
│   ├── index_by_address.go
│   ├── index_by_tokens.go
│   └── index_by_liquidity.go
└── observability/       # Metrics and logging
    ├── metrics.go
    └── logger.go
```
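
To make the multi-index idea concrete, here is a minimal sketch of a cache that keeps two indexes over the same pool records: one by address and one by order-independent token pair. It is an illustration under stated assumptions, not the implementation in `pkg/cache/`: the field names on `PoolInfo`, the string addresses, and the `uint64` liquidity are simplifications chosen for brevity.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// PoolInfo is a simplified, hypothetical pool record.
type PoolInfo struct {
	Address   string
	Token0    string
	Token1    string
	Protocol  string
	Liquidity uint64 // simplified; real reserves would need big.Int
}

type pairKey struct{ a, b string }

// makePairKey orders the tokens so (A,B) and (B,A) map to the same key.
func makePairKey(t0, t1 string) pairKey {
	if t0 > t1 {
		t0, t1 = t1, t0
	}
	return pairKey{t0, t1}
}

// PoolCache holds several indexes that point at the same pool records.
type PoolCache struct {
	mu        sync.RWMutex
	byAddress map[string]*PoolInfo
	byPair    map[pairKey][]*PoolInfo
}

func NewPoolCache() *PoolCache {
	return &PoolCache{
		byAddress: make(map[string]*PoolInfo),
		byPair:    make(map[pairKey][]*PoolInfo),
	}
}

// Add inserts a pool into every index. To keep the sketch short,
// a second Add for the same address is ignored.
func (c *PoolCache) Add(p PoolInfo) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if _, exists := c.byAddress[p.Address]; exists {
		return
	}
	cp := p
	c.byAddress[p.Address] = &cp
	k := makePairKey(p.Token0, p.Token1)
	c.byPair[k] = append(c.byPair[k], &cp)
}

// GetByAddress is a single map lookup.
func (c *PoolCache) GetByAddress(addr string) (PoolInfo, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	p, ok := c.byAddress[addr]
	if !ok {
		return PoolInfo{}, false
	}
	return *p, true
}

// GetByPair returns copies of all pools for the pair, most liquid first.
// Sorting on read is a simplification; a dedicated liquidity index
// would keep this ordering up to date on write instead.
func (c *PoolCache) GetByPair(t0, t1 string) []PoolInfo {
	c.mu.RLock()
	defer c.mu.RUnlock()
	pools := c.byPair[makePairKey(t0, t1)]
	out := make([]PoolInfo, 0, len(pools))
	for _, p := range pools {
		out = append(out, *p)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Liquidity > out[j].Liquidity })
	return out
}

func main() {
	c := NewPoolCache()
	c.Add(PoolInfo{Address: "0xp1", Token0: "WETH", Token1: "USDC", Protocol: "uniswap_v2", Liquidity: 100})
	c.Add(PoolInfo{Address: "0xp2", Token0: "USDC", Token1: "WETH", Protocol: "sushiswap", Liquidity: 500})
	for _, p := range c.GetByPair("WETH", "USDC") {
		fmt.Printf("%s %s liquidity=%d\n", p.Address, p.Protocol, p.Liquidity)
	}
}
```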

### Implementation Roadmap
See `docs/planning/07_TASK_BREAKDOWN.md` for detailed atomic tasks (~99 hours total):

- **Phase 1: Foundation** (11 hours) - Interfaces, logging, metrics
- **Phase 2: Parser Refactor** (45 hours) - Per-exchange parsers
- **Phase 3: Cache System** (16 hours) - Multi-index cache
- **Phase 4: Validation Pipeline** (13 hours) - Background validation
- **Phase 5: Migration & Testing** (14 hours) - V1/V2 comparison

## Development Workflow

### V1 Commands (reference only)
```bash
cd orig/

# Build
make build

# Run tests
make test

# Run V1 bot
./bin/mev-bot start

# View logs
./scripts/log-manager.sh analyze
```

### V2 Development

**DO NOT** start a V2 task without:
1. Reviewing `docs/planning/00_V2_MASTER_PLAN.md`
2. Reviewing `docs/planning/07_TASK_BREAKDOWN.md`
3. Creating a feature branch from `v2-master-dev` (`feature/v2-prep` is archived)
4. Following the atomic task breakdown

## Key Principles for V2 Development

### 1. Fail-Fast with Visibility
- Reject invalid data immediately at source
- Log all rejections with detailed context
- Never allow garbage data to propagate downstream
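
A fail-fast check might look like the sketch below: validate the event as soon as it is parsed, return a wrapped sentinel error naming the offending field, and log the rejection with context. The `SwapEvent` fields, the error names, and the `Validate` function are assumptions for illustration; the real types live in `pkg/types/` and `pkg/validation/` and may differ.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"math/big"
)

// Address is a 20-byte EVM address (hypothetical stand-in type).
type Address [20]byte

// SwapEvent is a simplified, hypothetical parsed swap.
type SwapEvent struct {
	Pool, Token0, Token1 Address
	AmountIn, AmountOut  *big.Int
}

var (
	ErrZeroAddress       = errors.New("zero address")
	ErrNonPositiveAmount = errors.New("amount missing or not positive")
)

// Validate rejects an event at the source and names the field that failed.
func Validate(e SwapEvent) error {
	var zero Address
	addrs := []struct {
		name string
		addr Address
	}{{"pool", e.Pool}, {"token0", e.Token0}, {"token1", e.Token1}}
	for _, a := range addrs {
		if a.addr == zero {
			return fmt.Errorf("%s: %w", a.name, ErrZeroAddress)
		}
	}
	amounts := []struct {
		name string
		v    *big.Int
	}{{"amountIn", e.AmountIn}, {"amountOut", e.AmountOut}}
	for _, a := range amounts {
		if a.v == nil || a.v.Sign() <= 0 {
			return fmt.Errorf("%s: %w", a.name, ErrNonPositiveAmount)
		}
	}
	return nil
}

// exampleSwap returns a well-formed event for the demo below.
func exampleSwap() SwapEvent {
	return SwapEvent{
		Pool:      Address{0x01},
		Token0:    Address{0xaa},
		Token1:    Address{0xbb},
		AmountIn:  big.NewInt(1000),
		AmountOut: big.NewInt(990),
	}
}

func main() {
	bad := exampleSwap()
	bad.Token0 = Address{} // simulate the V1 zero-address failure mode
	for _, e := range []SwapEvent{exampleSwap(), bad} {
		if err := Validate(e); err != nil {
			// Log every rejection with enough context to debug it later.
			log.Printf("rejected swap: pool=%x err=%v", e.Pool, err)
			continue
		}
		fmt.Printf("accepted swap: pool=%x\n", e.Pool)
	}
}
```

Because the errors are wrapped with `%w`, callers can still use `errors.Is(err, ErrZeroAddress)` to count rejections by cause in metrics.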

### 2. Single Responsibility
- One parser per exchange type
- One validator per data type
- One cache per index type

### 3. Observable by Default
- Every component emits metrics
- Every operation is logged with context
- Every error includes stack trace and state

### 4. Test-Driven
- Unit tests for every parser (100% coverage, enforced by CI)
- Integration tests for full pipeline
- Chaos testing for failure scenarios

### 5. Atomic Tasks
- Each task < 2 hours (from 07_TASK_BREAKDOWN.md)
- Clear dependencies between tasks
- Testable success criteria

## Architecture Patterns Used

### V1 (orig/)
- **Monolithic parser**: Single `EventParser` handling all protocols
- **Pipeline pattern**: Multi-stage processing with worker pools
- **Event-driven**: WebSocket subscription to Arbitrum sequencer
- **Connection pooling**: RPC connection management with failover

### V2 (planned)
- **Factory pattern**: Parser factory routes to protocol-specific parsers
- **Strategy pattern**: Per-exchange parsing strategies
- **Observer pattern**: Background validation observes all parsed events
- **Multi-index pattern**: Multiple indexes over same pool data
- **Circuit breaker**: Automatic failover on cascading failures
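
The factory and strategy patterns fit together roughly as follows: each protocol-specific parser implements a shared interface (the strategy), and a factory holds a registry that routes each log to the right one. This is a hedged sketch, not the API in `pkg/parsers/`: `Protocol`, `Log`, `Event`, `Parser`, and `Factory` are hypothetical names, and `ParseLog` is borrowed from the task list in this file.

```go
package main

import (
	"errors"
	"fmt"
)

// Protocol identifies a DEX protocol (hypothetical type).
type Protocol string

const (
	UniswapV2 Protocol = "uniswap_v2"
	UniswapV3 Protocol = "uniswap_v3"
)

// Log is a pared-down stand-in for an EVM log entry.
type Log struct {
	Address string
	Topics  []string
	Data    []byte
}

// Event is a minimal parsed result.
type Event struct {
	Protocol Protocol
	Pool     string
}

// Parser is the strategy interface each protocol parser implements.
type Parser interface {
	Protocol() Protocol
	ParseLog(l Log) (Event, error)
}

var ErrUnknownProtocol = errors.New("no parser registered for protocol")

// Factory routes logs to the parser registered for their protocol.
type Factory struct {
	parsers map[Protocol]Parser
}

func NewFactory() *Factory {
	return &Factory{parsers: make(map[Protocol]Parser)}
}

// Register adds a parser and rejects duplicates so that a
// misconfiguration fails loudly at startup.
func (f *Factory) Register(p Parser) error {
	if _, dup := f.parsers[p.Protocol()]; dup {
		return fmt.Errorf("parser already registered for %s", p.Protocol())
	}
	f.parsers[p.Protocol()] = p
	return nil
}

// ParseLog dispatches to the protocol-specific parser.
func (f *Factory) ParseLog(proto Protocol, l Log) (Event, error) {
	p, ok := f.parsers[proto]
	if !ok {
		return Event{}, fmt.Errorf("%w: %s", ErrUnknownProtocol, proto)
	}
	return p.ParseLog(l)
}

// uniswapV2Parser is a stub strategy used only for this demo.
type uniswapV2Parser struct{}

func (uniswapV2Parser) Protocol() Protocol { return UniswapV2 }

func (uniswapV2Parser) ParseLog(l Log) (Event, error) {
	if l.Address == "" {
		return Event{}, errors.New("log has no emitting address")
	}
	return Event{Protocol: UniswapV2, Pool: l.Address}, nil
}

func main() {
	f := NewFactory()
	if err := f.Register(uniswapV2Parser{}); err != nil {
		fmt.Println("register failed:", err)
		return
	}
	ev, err := f.ParseLog(UniswapV2, Log{Address: "0xpool"})
	fmt.Println(ev, err)
	_, err = f.ParseLog(UniswapV3, Log{Address: "0xpool"})
	fmt.Println(err)
}
```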

## Common Development Tasks

### Analyzing V1 Code
```bash
# Find monolithic parser
cat orig/pkg/events/parser.go

# Review arbitrage detection
cat orig/pkg/arbitrage/detection_engine.go

# Understand pool cache
cat orig/pkg/pools/discovery.go
```

### Creating V2 Components
Follow task breakdown in `docs/planning/07_TASK_BREAKDOWN.md`:

**Example: Creating UniswapV2 Parser (P2-002 through P2-009)**
1. Create `pkg/parsers/uniswap_v2.go`
2. Define struct with logger and cache dependencies
3. Implement `ParseLog()` for Swap events
4. Implement token extraction from pool cache
5. Implement validation rules
6. Add Mint/Burn event support
7. Implement `ParseReceipt()` for multi-event handling
8. Write comprehensive unit tests
9. Integration test with real Arbiscan data

### Testing Strategy
```bash
# Unit tests
go test ./pkg/parsers/... -v

# Integration tests
go test ./tests/integration/... -v

# Benchmark parsers
go test ./pkg/parsers/... -bench=. -benchmem

# Load testing
go test ./tests/load/... -v
```

## Git Workflow

### Branch Structure

**V2 Production & Development Branches:**

```
v2-master         # Production branch for V2 (protected)
v2-master-dev     # Development branch for V2 (protected)
feature/v2-prep   # Planning and foundation (archived)
feature/v2/*      # Feature branches for development
```

**Branch Hierarchy:**
```
v2-master (production)
  ↑
  └── v2-master-dev (development)
        ↑
        └── feature/v2/* (feature branches)
```

### Branch Strategy (STRICTLY ENFORCED)

**ALL V2 development MUST use feature branches:**

```bash
# Branch naming convention (REQUIRED)
feature/v2/<component>/<task-id>-<description>

# Examples:
feature/v2/parsers/P2-002-uniswap-v2-base
feature/v2/cache/P3-001-address-index
feature/v2/arbitrage/P5-001-path-finder
```
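
The naming convention can be checked mechanically, for example in a pre-push hook or CI step. The sketch below is one possible check, not the repository's actual hook: the `ValidBranchName` helper is hypothetical, and the task-ID shape (`P<phase>-<three digits>`) and lowercase hyphenated segments are inferred from the examples above rather than stated in the planning docs.

```go
package main

import (
	"fmt"
	"regexp"
)

// branchPattern encodes feature/v2/<component>/<task-id>-<description>.
// Assumptions: component and description are lowercase words joined by
// hyphens, and the task ID looks like P2-002.
var branchPattern = regexp.MustCompile(
	`^feature/v2/[a-z0-9]+(?:-[a-z0-9]+)*/P[0-9]+-[0-9]{3}-[a-z0-9]+(?:-[a-z0-9]+)*$`,
)

// ValidBranchName reports whether name follows the V2 convention.
func ValidBranchName(name string) bool {
	return branchPattern.MatchString(name)
}

func main() {
	for _, name := range []string{
		"feature/v2/parsers/P2-002-uniswap-v2-base",
		"feature/v2/cache/P3-001-address-index",
		"feature/v2/parsers/uniswap-v2-base", // missing task ID
		"feature/v2-prep",                    // archived planning branch
	} {
		fmt.Printf("%-45s valid=%t\n", name, ValidBranchName(name))
	}
}
```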

**Branch Rules:**
1. ✅ **ALWAYS** create feature branch from `v2-master-dev`
2. ✅ **NEVER** commit directly to `v2-master` or `v2-master-dev`
3. ✅ Branch name MUST match task ID from `07_TASK_BREAKDOWN.md`
4. ✅ One branch per atomic task (< 2 hours work)
5. ✅ Delete branch after merge
6. ✅ Merge feature → v2-master-dev → v2-master

**Example Workflow:**
```bash
# 1. Create feature branch from v2-master-dev
git checkout v2-master-dev
git pull origin v2-master-dev
git checkout -b feature/v2/parsers/P2-002-uniswap-v2-base

# 2. Implement task P2-002
# ... make changes ...

# 3. Test with 100% coverage (REQUIRED)
make test-coverage
# MUST show 100% coverage or CI/CD will fail

# 4. Run full validation locally
make validate
# All checks must pass

# 5. Commit with conventional format
git add .
git commit -m "feat(parsers): implement UniswapV2 parser base structure

- Created UniswapV2Parser struct with dependencies
- Implemented constructor with logger and cache injection
- Stubbed all interface methods
- Added 100% test coverage

Task: P2-002
Coverage: 100%
Tests: 15/15 passing

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>"

# 6. Push and create PR to v2-master-dev
git push -u origin feature/v2/parsers/P2-002-uniswap-v2-base

# 7. Create PR on GitHub targeting v2-master-dev
# Wait for CI/CD to pass (100% coverage enforced)

# 8. After the PR is merged on GitHub, sync locally and delete the feature branch
git checkout v2-master-dev
git pull origin v2-master-dev
git branch -d feature/v2/parsers/P2-002-uniswap-v2-base

# 9. When ready for production release
git checkout v2-master
git merge --no-ff v2-master-dev
git push origin v2-master
```

### Commit Message Format
```
type(scope): brief description

- Detailed changes
- Why the change was needed
- Breaking changes or migration notes

Task: [TASK-ID from 07_TASK_BREAKDOWN.md]
Coverage: [100% REQUIRED]
Tests: [X/X passing - MUST be 100%]

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
```

**Types**: `feat`, `fix`, `perf`, `refactor`, `test`, `docs`, `build`, `ci`

**Examples:**
```bash
# Good commit
feat(parsers): implement UniswapV3 swap parsing

- Added ParseSwapEvent for V3 with signed amounts
- Implemented decimal scaling for token precision
- Added validation for sqrtPriceX96 and liquidity

Task: P2-011
Coverage: 100%
Tests: 23/23 passing

# Bad commit (missing task ID, coverage info)
fix: parser bug
```

## Important Notes

### What NOT to Do
- ❌ Modify V1 code in `orig/` (except for critical bugs)
- ❌ Start V2 implementation without reviewing planning docs
- ❌ Skip atomic task breakdown from `07_TASK_BREAKDOWN.md`
- ❌ Implement workarounds instead of fixing root causes
- ❌ Allow zero addresses or zero amounts to propagate

### What TO Do
- ✅ Review foundation implementation in `pkg/` before adding parsers
- ✅ Follow task breakdown in `07_TASK_BREAKDOWN.md`
- ✅ Write tests before implementation (TDD - MANDATORY)
- ✅ Use strict validation at all layers
- ✅ Add comprehensive logging and metrics
- ✅ Fix root causes, not symptoms
- ✅ Always create feature branches from `v2-master-dev`
- ✅ Run `make validate` before pushing (100% coverage enforced)

## Key Files to Review

### Planning Documents (Complete ✅)
- `docs/planning/00_V2_MASTER_PLAN.md` - Complete V2 architecture
- `docs/planning/01_MODULARITY_REQUIREMENTS.md` - Component independence
- `docs/planning/02_PROTOCOL_SUPPORT_REQUIREMENTS.md` - 13+ DEX protocols
- `docs/planning/03_TESTING_REQUIREMENTS.md` - 100% coverage enforcement
- `docs/planning/04_PROFITABILITY_PLAN.md` - Sequencer strategy & ROI
- `docs/planning/05_CI_CD_SETUP.md` - Complete pipeline documentation
- `docs/planning/07_TASK_BREAKDOWN.md` - Atomic task list (99+ hours)

### V2 Foundation (Complete ✅ - 100% Coverage)
- `pkg/types/` - Core types (SwapEvent, PoolInfo, errors)
- `pkg/parsers/` - Parser factory with routing
- `pkg/cache/` - Multi-index pool cache (O(1) lookups)
- `pkg/validation/` - Validation pipeline with rules
- `pkg/observability/` - Logging & Prometheus metrics

### V1 Reference Implementation
- `orig/pkg/events/parser.go` - Monolithic parser (reference)
- `orig/pkg/monitor/concurrent.go` - Arbitrum monitor (reference)
- `orig/pkg/pools/discovery.go` - Pool discovery (reference)
- `orig/pkg/arbitrage/detection_engine.go` - Arbitrage detection (reference)

### Documentation
- `README.md` - Project overview and quick start
- `CLAUDE.md` - This file (project guidance)
- `docs/V2_IMPLEMENTATION_STATUS.md` - Implementation progress
- `Makefile` - Build automation commands
- `.golangci.yml` - Linter configuration (40+ linters)

## Current Branches

### Production & Development
- **v2-master** - Production branch for V2 (protected, stable)
- **v2-master-dev** - Development branch for V2 (protected, stable)
- **feature/v2-prep** - Planning and foundation (archived, complete)

### Active Development Pattern
```bash
# Always branch from v2-master-dev
git checkout v2-master-dev
git pull origin v2-master-dev
git checkout -b feature/v2/<component>/<task-id>-<description>

# Merge back to v2-master-dev via PR
# Merge v2-master-dev to v2-master when ready for production
```

## Contact and Resources

- **V2 Foundation:** `pkg/` (✅ Complete - 100% coverage)
- **V2 Planning:** `docs/planning/` (✅ Complete - 7 documents)
- **V1 Reference:** `orig/` (Frozen for reference)
- **CI/CD:** `.github/workflows/v2-ci.yml` (✅ Configured)
- **Build Tools:** `Makefile` (✅ Ready - `make validate`)
- **Git Hooks:** `.git-hooks/` (✅ Install with `./scripts/install-git-hooks.sh`)

---

- **Current Phase:** ✅ V2 Foundation Complete (100% Coverage)
- **Next Step:** Phase 2 - Protocol Parser Implementations
- **Foundation:** 3,300+ lines of code (1,500 implementation, 1,800 tests)
- **Status:** Production-ready infrastructure, ready for parsers