# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Status: V2 Architecture Planning

This repository is currently in the **V2 planning phase**. The V1 codebase has been moved to `orig/` for preservation while the V2 architecture is being designed.

**Current State:**
- V1 implementation: `orig/` (frozen for reference)
- V2 planning documents: `docs/planning/`
- Active development: Not yet started (planning phase)

## Repository Structure

```
mev-bot/
├── docs/
│   └── planning/              # V2 architecture and task breakdown
│       ├── 00_V2_MASTER_PLAN.md
│       └── 07_TASK_BREAKDOWN.md
│
└── orig/                      # V1 codebase (preserved)
    ├── cmd/mev-bot/          # V1 application entry point
    ├── pkg/                  # V1 library code
    │   ├── events/           # Event parsing (monolithic)
    │   ├── monitor/          # Arbitrum sequencer monitoring
    │   ├── scanner/          # Arbitrage scanning
    │   ├── arbitrage/        # Arbitrage detection
    │   ├── market/           # Market data management
    │   └── pools/            # Pool discovery
    ├── internal/             # V1 private code
    ├── config/               # V1 configuration
    ├── go.mod                # V1 dependencies
    └── README_V1.md          # V1 documentation
```

## V1 Reference (orig/)

### Building and Running V1
```bash
cd orig/
# Build the whole main package, not just main.go, so sibling files are included
go build -o ../bin/mev-bot-v1 ./cmd/mev-bot
../bin/mev-bot-v1 start
```

### V1 Architecture Overview
- **Monolithic parser**: Single parser handling all DEX types
- **Basic validation**: Limited validation of parsed data
- **Single-index cache**: Pool cache by address only
- **Event-driven**: Real-time Arbitrum sequencer monitoring

### Critical V1 Issues (driving V2 refactor)
1. **Zero address tokens**: Parser returns zero addresses when transaction calldata is unavailable
2. **Parsing accuracy**: Generic parser misses protocol-specific edge cases
3. **No validation audit trail**: Silent failures, no discrepancy logging
4. **Inefficient lookups**: Single-index cache, no liquidity ranking
5. **Stats disconnection**: Events detected but not reflected in metrics

See `orig/README_V1.md` for complete V1 documentation.

## V2 Architecture Plan

### Key Improvements
1. **Per-exchange parsers**: Individual parsers for UniswapV2, UniswapV3, SushiSwap, Camelot, Curve
2. **Multi-layer validation**: Strict validation at parser, monitor, and scanner layers
3. **Multi-index cache**: Lookups by address, token pair, protocol, and liquidity
4. **Background validation**: Audit trail comparing parsed vs cached data
5. **Observable by default**: Comprehensive metrics, structured logging, health monitoring
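
As an illustration of improvement 3, the sketch below keeps several indexes over the same pool records and ranks pair lookups by liquidity. Every name in it (`Pool`, `PoolCache`, `ByPairRanked`) is hypothetical, liquidity is simplified to a `uint64`, and there is no locking, so treat it as a starting point rather than the planned design.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Pool is a hypothetical, simplified pool record; the real V2 type is not defined yet.
type Pool struct {
	Address   string
	Token0    string
	Token1    string
	Protocol  string
	Liquidity uint64 // simplified; on-chain values would need *big.Int
}

// pairKey is order-independent, so (A, B) and (B, A) share one entry.
func pairKey(a, b string) string {
	a, b = strings.ToLower(a), strings.ToLower(b)
	if a > b {
		a, b = b, a
	}
	return a + "|" + b
}

// PoolCache holds several indexes that all point at the same *Pool values.
type PoolCache struct {
	byAddress  map[string]*Pool
	byPair     map[string][]*Pool
	byProtocol map[string][]*Pool
}

func NewPoolCache() *PoolCache {
	return &PoolCache{
		byAddress:  make(map[string]*Pool),
		byPair:     make(map[string][]*Pool),
		byProtocol: make(map[string][]*Pool),
	}
}

// Add inserts a pool into every index. It reports false for a duplicate address.
func (c *PoolCache) Add(p *Pool) bool {
	addr := strings.ToLower(p.Address)
	if _, exists := c.byAddress[addr]; exists {
		return false
	}
	c.byAddress[addr] = p
	k := pairKey(p.Token0, p.Token1)
	c.byPair[k] = append(c.byPair[k], p)
	c.byProtocol[p.Protocol] = append(c.byProtocol[p.Protocol], p)
	return true
}

func (c *PoolCache) ByAddress(addr string) (*Pool, bool) {
	p, ok := c.byAddress[strings.ToLower(addr)]
	return p, ok
}

func (c *PoolCache) ByProtocol(name string) []*Pool { return c.byProtocol[name] }

// ByPairRanked returns a copy of the pair's pools, highest liquidity first.
func (c *PoolCache) ByPairRanked(a, b string) []*Pool {
	pools := append([]*Pool(nil), c.byPair[pairKey(a, b)]...)
	sort.SliceStable(pools, func(i, j int) bool {
		return pools[i].Liquidity > pools[j].Liquidity
	})
	return pools
}

func main() {
	c := NewPoolCache()
	c.Add(&Pool{Address: "0xAAA", Token0: "WETH", Token1: "USDC", Protocol: "UniswapV2", Liquidity: 100})
	c.Add(&Pool{Address: "0xBBB", Token0: "USDC", Token1: "WETH", Protocol: "SushiSwap", Liquidity: 900})
	for _, p := range c.ByPairRanked("WETH", "USDC") {
		fmt.Println(p.Protocol, p.Liquidity)
	}
}
```

Sorting on read keeps `Add` cheap; a real liquidity index would more likely maintain order on write and guard the maps with a `sync.RWMutex`.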

### V2 Directory Structure (planned)
```
pkg/
├── parsers/              # Per-exchange parser implementations
│   ├── factory.go       # Parser factory pattern
│   ├── interface.go     # Parser interface definition
│   ├── uniswap_v2.go   # UniswapV2-specific parser
│   ├── uniswap_v3.go   # UniswapV3-specific parser
│   └── ...
├── validation/          # Validation pipeline
│   ├── validator.go    # Event validator
│   ├── rules.go        # Validation rules
│   └── background.go   # Background validation channel
├── cache/               # Multi-index pool cache
│   ├── pool_cache.go
│   ├── index_by_address.go
│   ├── index_by_tokens.go
│   └── index_by_liquidity.go
└── observability/       # Metrics and logging
    ├── metrics.go
    └── logger.go
```
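
To make `interface.go` and `factory.go` more concrete, here is one possible shape for them. Nothing below comes from the planning documents: `Log`, `Event`, `Parser`, `Factory`, and the placeholder topic strings are all assumptions, and `Log` is a stand-in for whatever log type V2 ends up using.

```go
package main

import (
	"errors"
	"fmt"
)

// Log is a hypothetical stand-in for an EVM log entry.
type Log struct {
	Address string
	Topics  []string
	Data    []byte
}

// Event is a hypothetical parsed result.
type Event struct {
	Protocol string
	Pool     string
}

// Parser is a guess at what interface.go might declare.
type Parser interface {
	Protocol() string
	Supports(l Log) bool
	ParseLog(l Log) (*Event, error)
}

// ErrNoParser is returned when no registered parser claims a log.
var ErrNoParser = errors.New("no parser registered for log")

// Factory routes each log to the first registered parser that supports it.
type Factory struct {
	parsers []Parser
}

func (f *Factory) Register(p Parser) { f.parsers = append(f.parsers, p) }

func (f *Factory) ParseLog(l Log) (*Event, error) {
	for _, p := range f.parsers {
		if p.Supports(l) {
			return p.ParseLog(l)
		}
	}
	return nil, ErrNoParser
}

// topicParser is a toy parser that matches on the first topic only.
type topicParser struct {
	protocol string
	topic0   string // placeholder value, not a real event signature hash
}

func (p topicParser) Protocol() string { return p.protocol }

func (p topicParser) Supports(l Log) bool {
	return len(l.Topics) > 0 && l.Topics[0] == p.topic0
}

func (p topicParser) ParseLog(l Log) (*Event, error) {
	return &Event{Protocol: p.protocol, Pool: l.Address}, nil
}

func main() {
	f := &Factory{}
	f.Register(topicParser{protocol: "UniswapV2", topic0: "topic-v2-swap"})
	f.Register(topicParser{protocol: "UniswapV3", topic0: "topic-v3-swap"})

	ev, err := f.ParseLog(Log{Address: "0xpool", Topics: []string{"topic-v3-swap"}})
	fmt.Println(ev, err)

	_, err = f.ParseLog(Log{Address: "0xpool", Topics: []string{"unknown"}})
	fmt.Println(err)
}
```

Matching on the first topic alone is probably too coarse in practice, since forks can emit identical event signatures; a `Supports` implementation would likely also need to consult the pool cache for the emitting address.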

### Implementation Roadmap
See `docs/planning/07_TASK_BREAKDOWN.md` for detailed atomic tasks (~99 hours total):

- **Phase 1: Foundation** (11 hours) - Interfaces, logging, metrics
- **Phase 2: Parser Refactor** (45 hours) - Per-exchange parsers
- **Phase 3: Cache System** (16 hours) - Multi-index cache
- **Phase 4: Validation Pipeline** (13 hours) - Background validation
- **Phase 5: Migration & Testing** (14 hours) - V1/V2 comparison

## Development Workflow

### V1 Commands (reference only)
```bash
cd orig/

# Build
make build

# Run tests
make test

# Run V1 bot
./bin/mev-bot start

# View logs
./scripts/log-manager.sh analyze
```

### V2 Development (when started)

**DO NOT** start V2 implementation without:
1. Reviewing `docs/planning/00_V2_MASTER_PLAN.md`
2. Reviewing `docs/planning/07_TASK_BREAKDOWN.md`
3. Creating a task branch from `feature/v2-prep`
4. Following the atomic task breakdown

## Key Principles for V2 Development

### 1. Fail-Fast with Visibility
- Reject invalid data immediately at source
- Log all rejections with detailed context
- Never allow garbage data to propagate downstream
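
A minimal sketch of this principle, assuming a simplified `SwapEvent` with string addresses: validation returns a typed rejection that carries enough context to log, and the caller drops the event instead of passing it on. `SwapEvent`, `Rejection`, `Validate`, and `demoEvent` are illustrative names only.

```go
package main

import (
	"fmt"
	"log"
	"math/big"
)

const zeroAddress = "0x0000000000000000000000000000000000000000"

// SwapEvent is a hypothetical, simplified parsed event.
type SwapEvent struct {
	TxHash    string
	Pool      string
	Token0    string
	Token1    string
	AmountIn  *big.Int
	AmountOut *big.Int
}

// Rejection records which field failed and why, so the log line is actionable.
type Rejection struct {
	TxHash string
	Field  string
	Reason string
}

func (r *Rejection) Error() string {
	return fmt.Sprintf("rejected tx=%s field=%s reason=%q", r.TxHash, r.Field, r.Reason)
}

// Validate returns the first problem found, or nil when the event is usable.
func Validate(e SwapEvent) error {
	addrs := []struct{ name, value string }{
		{"Pool", e.Pool}, {"Token0", e.Token0}, {"Token1", e.Token1},
	}
	for _, a := range addrs {
		if a.value == "" || a.value == zeroAddress {
			return &Rejection{TxHash: e.TxHash, Field: a.name, Reason: "zero address"}
		}
	}
	amounts := []struct {
		name  string
		value *big.Int
	}{
		{"AmountIn", e.AmountIn}, {"AmountOut", e.AmountOut},
	}
	for _, a := range amounts {
		if a.value == nil || a.value.Sign() == 0 {
			return &Rejection{TxHash: e.TxHash, Field: a.name, Reason: "zero amount"}
		}
	}
	return nil
}

// demoEvent builds a valid event with made-up addresses for the example.
func demoEvent() SwapEvent {
	return SwapEvent{
		TxHash:    "0xabc",
		Pool:      "0x1111111111111111111111111111111111111111",
		Token0:    "0x2222222222222222222222222222222222222222",
		Token1:    "0x3333333333333333333333333333333333333333",
		AmountIn:  big.NewInt(1000),
		AmountOut: big.NewInt(990),
	}
}

func main() {
	bad := demoEvent()
	bad.Token1 = zeroAddress

	for _, e := range []SwapEvent{demoEvent(), bad} {
		if err := Validate(e); err != nil {
			log.Print(err) // rejected at the source; nothing is forwarded
			continue
		}
		fmt.Println("accepted", e.TxHash)
	}
}
```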

### 2. Single Responsibility
- One parser per exchange type
- One validator per data type
- One cache per index type

### 3. Observable by Default
- Every component emits metrics
- Every operation is logged with context
- Every error includes stack trace and state
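
One way to get metrics and logs "by default" is to run operations through a small wrapper, roughly as sketched here. `Metrics`, `Observe`, and the `op.status` counter naming are assumptions for illustration, and stack-trace capture is left out to keep the example short.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"sync"
	"time"
)

// Metrics is a hypothetical in-memory counter set; a real build would
// probably export these to a metrics backend instead.
type Metrics struct {
	mu     sync.Mutex
	counts map[string]int64
}

func NewMetrics() *Metrics { return &Metrics{counts: make(map[string]int64)} }

func (m *Metrics) Inc(name string) {
	m.mu.Lock()
	m.counts[name]++
	m.mu.Unlock()
}

func (m *Metrics) Get(name string) int64 {
	m.mu.Lock()
	defer m.mu.Unlock()
	return m.counts[name]
}

// Observe runs fn, counts the outcome under "<op>.ok" or "<op>.error",
// and logs the operation with its duration. A nil logger falls back to
// the standard library's default logger.
func Observe(m *Metrics, logger *log.Logger, op string, fn func() error) error {
	if logger == nil {
		logger = log.Default()
	}
	start := time.Now()
	err := fn()
	status := "ok"
	if err != nil {
		status = "error"
	}
	m.Inc(op + "." + status)
	logger.Printf("op=%s status=%s duration=%s err=%v", op, status, time.Since(start), err)
	return err
}

// errDemo is a made-up failure used only by the example.
var errDemo = errors.New("demo failure")

func main() {
	m := NewMetrics()
	_ = Observe(m, nil, "parse", func() error { return nil })
	_ = Observe(m, nil, "parse", func() error { return errDemo })
	fmt.Println(m.Get("parse.ok"), m.Get("parse.error"))
}
```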

### 4. Test-Driven
- Unit tests for every parser (100% coverage, as required under Git Workflow)
- Integration tests for full pipeline
- Chaos testing for failure scenarios

### 5. Atomic Tasks
- Each task < 2 hours (from 07_TASK_BREAKDOWN.md)
- Clear dependencies between tasks
- Testable success criteria

## Architecture Patterns Used

### V1 (orig/)
- **Monolithic parser**: Single `EventParser` handling all protocols
- **Pipeline pattern**: Multi-stage processing with worker pools
- **Event-driven**: WebSocket subscription to Arbitrum sequencer
- **Connection pooling**: RPC connection management with failover

### V2 (planned)
- **Factory pattern**: Parser factory routes to protocol-specific parsers
- **Strategy pattern**: Per-exchange parsing strategies
- **Observer pattern**: Background validation observes all parsed events
- **Multi-index pattern**: Multiple indexes over same pool data
- **Circuit breaker**: Stops calls to a failing dependency and fails over automatically so failures do not cascade
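
The observer idea for background validation might look something like the following: the hot path hands each parsed event to a buffered channel without blocking, and a worker goroutine compares it against cached pool data and records discrepancies. All types and names here are hypothetical, and the cache is a plain read-only map purely for the sake of the example.

```go
package main

import (
	"fmt"
	"sync"
)

// ParsedEvent and CachedPool are hypothetical, simplified shapes.
type ParsedEvent struct {
	Pool, Token0, Token1 string
}

type CachedPool struct {
	Token0, Token1 string
}

// Discrepancy is one audit-trail entry.
type Discrepancy struct {
	Pool, Field, Parsed, Cached string
}

type BackgroundValidator struct {
	events chan ParsedEvent
	cache  map[string]CachedPool // treated as read-only in this sketch
	mu     sync.Mutex
	found  []Discrepancy
	wg     sync.WaitGroup
}

func NewBackgroundValidator(cache map[string]CachedPool, buffer int) *BackgroundValidator {
	v := &BackgroundValidator{events: make(chan ParsedEvent, buffer), cache: cache}
	v.wg.Add(1)
	go v.run()
	return v
}

// Observe never blocks the caller. It reports false when the buffer is
// full and the event was dropped. It must not be called after Close.
func (v *BackgroundValidator) Observe(e ParsedEvent) bool {
	select {
	case v.events <- e:
		return true
	default:
		return false
	}
}

func (v *BackgroundValidator) run() {
	defer v.wg.Done()
	for e := range v.events {
		cached, ok := v.cache[e.Pool]
		if !ok {
			continue // unknown pool: nothing to compare against
		}
		v.compare(e.Pool, "token0", e.Token0, cached.Token0)
		v.compare(e.Pool, "token1", e.Token1, cached.Token1)
	}
}

func (v *BackgroundValidator) compare(pool, field, parsed, cached string) {
	if parsed == cached {
		return
	}
	v.mu.Lock()
	v.found = append(v.found, Discrepancy{Pool: pool, Field: field, Parsed: parsed, Cached: cached})
	v.mu.Unlock()
}

// Close stops intake, waits for queued events to be checked, and
// returns the collected discrepancies.
func (v *BackgroundValidator) Close() []Discrepancy {
	close(v.events)
	v.wg.Wait()
	return v.found
}

func main() {
	cache := map[string]CachedPool{"0xpool": {Token0: "WETH", Token1: "USDC"}}
	v := NewBackgroundValidator(cache, 16)
	v.Observe(ParsedEvent{Pool: "0xpool", Token0: "WETH", Token1: "USDC"})
	v.Observe(ParsedEvent{Pool: "0xpool", Token0: "WETH", Token1: "DAI"})
	for _, d := range v.Close() {
		fmt.Printf("%+v\n", d)
	}
}
```

Dropping events when the buffer is full trades audit completeness for never stalling the parser; counting those drops as a metric would keep the trade-off visible.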

## Common Development Tasks

### Analyzing V1 Code
```bash
# Read the monolithic parser
cat orig/pkg/events/parser.go

# Review arbitrage detection
cat orig/pkg/arbitrage/detection_engine.go

# Understand pool cache
cat orig/pkg/pools/discovery.go
```

### Creating V2 Components
Follow task breakdown in `docs/planning/07_TASK_BREAKDOWN.md`:

**Example: Creating UniswapV2 Parser (P2-002 through P2-009)**
1. Create `pkg/parsers/uniswap_v2.go`
2. Define struct with logger and cache dependencies
3. Implement `ParseLog()` for Swap events
4. Implement token extraction from pool cache
5. Implement validation rules
6. Add Mint/Burn event support
7. Implement `ParseReceipt()` for multi-event handling
8. Write comprehensive unit tests
9. Integration test with real Arbiscan data
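
Steps 2 to 5 could start from something like the sketch below. It relies on one fact about the UniswapV2 pair contract: the `Swap` event's non-indexed data is four 32-byte big-endian words (`amount0In`, `amount1In`, `amount0Out`, `amount1Out`). Everything else is an assumption: `PoolLookup`, `TokenPair`, and `ParseSwap` are hypothetical names, addresses are plain strings, the logger dependency is omitted, and the method signature is simplified compared with whatever `ParseLog()` ends up taking.

```go
package main

import (
	"errors"
	"fmt"
	"math/big"
)

// TokenPair and PoolLookup are hypothetical stand-ins for the pool cache.
type TokenPair struct {
	Token0, Token1 string
}

type PoolLookup interface {
	Tokens(pool string) (TokenPair, bool)
}

// Swap is a hypothetical parsed UniswapV2 swap.
type Swap struct {
	Pool       string
	Tokens     TokenPair
	Amount0In  *big.Int
	Amount1In  *big.Int
	Amount0Out *big.Int
	Amount1Out *big.Int
}

var (
	ErrBadData     = errors.New("swap data must be exactly 128 bytes")
	ErrUnknownPool = errors.New("pool not in cache")
	ErrNoAmounts   = errors.New("all swap amounts are zero")
)

type UniswapV2Parser struct {
	cache PoolLookup
}

// ParseSwap decodes the data section of a Swap log and resolves the
// tokens from the cache rather than from transaction calldata.
func (p *UniswapV2Parser) ParseSwap(pool string, data []byte) (*Swap, error) {
	if len(data) != 128 {
		return nil, ErrBadData
	}
	tokens, ok := p.cache.Tokens(pool)
	if !ok {
		return nil, ErrUnknownPool // fail fast instead of emitting zero addresses
	}
	// word returns the i-th 32-byte slot as an unsigned big-endian integer.
	word := func(i int) *big.Int {
		return new(big.Int).SetBytes(data[i*32 : (i+1)*32])
	}
	s := &Swap{
		Pool:       pool,
		Tokens:     tokens,
		Amount0In:  word(0),
		Amount1In:  word(1),
		Amount0Out: word(2),
		Amount1Out: word(3),
	}
	if s.Amount0In.Sign() == 0 && s.Amount1In.Sign() == 0 &&
		s.Amount0Out.Sign() == 0 && s.Amount1Out.Sign() == 0 {
		return nil, ErrNoAmounts
	}
	return s, nil
}

// mapCache is a toy PoolLookup backed by a map.
type mapCache map[string]TokenPair

func (m mapCache) Tokens(pool string) (TokenPair, bool) {
	t, ok := m[pool]
	return t, ok
}

func main() {
	p := &UniswapV2Parser{cache: mapCache{"0xpool": {Token0: "WETH", Token1: "USDC"}}}

	data := make([]byte, 128)
	data[31] = 5  // amount0In = 5
	data[127] = 7 // amount1Out = 7

	s, err := p.ParseSwap("0xpool", data)
	fmt.Println(s.Tokens, s.Amount0In, s.Amount1Out, err)

	_, err = p.ParseSwap("0xmissing", data)
	fmt.Println(err)
}
```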

### Testing Strategy
```bash
# Unit tests (when V2 implementation starts)
go test ./pkg/parsers/... -v

# Integration tests
go test ./tests/integration/... -v

# Benchmark parsers
go test ./pkg/parsers/... -bench=. -benchmem

# Load testing
go test ./tests/load/... -v
```

## Git Workflow

### Branch Strategy (STRICTLY ENFORCED)

**ALL V2 development MUST use feature branches:**

```bash
# Branch naming convention (REQUIRED)
feature/v2/<component>/<task-id>-<description>

# Examples:
feature/v2/parsers/P2-002-uniswap-v2-base
feature/v2/cache/P3-001-address-index
feature/v2/validation/P4-001-validation-rules
```

**Branch Rules:**
1. ✅ **ALWAYS** create feature branch from `feature/v2-prep`
2. ✅ **NEVER** commit directly to `feature/v2-prep` or `master-dev`
3. ✅ Branch name MUST match task ID from `07_TASK_BREAKDOWN.md`
4. ✅ One branch per atomic task (< 2 hours work)
5. ✅ Delete branch after merge
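
If the naming rule is ever enforced automatically (for example in CI or a pre-push hook), a check along these lines might be enough. The pattern is inferred from the three example branch names above and assumes task IDs always look like `P<phase>-<three digits>`; adjust it if `07_TASK_BREAKDOWN.md` uses other ID shapes. `ValidBranch` is a hypothetical helper, not an existing tool in this repository.

```go
package main

import (
	"fmt"
	"regexp"
)

// branchPattern encodes feature/v2/<component>/<task-id>-<description>,
// with the task ID shape assumed from the documented examples.
var branchPattern = regexp.MustCompile(
	`^feature/v2/[a-z][a-z0-9-]*/P[0-9]+-[0-9]{3}-[a-z0-9]+(?:-[a-z0-9]+)*$`,
)

// ValidBranch reports whether name follows the V2 branch convention.
func ValidBranch(name string) bool {
	return branchPattern.MatchString(name)
}

func main() {
	for _, name := range []string{
		"feature/v2/parsers/P2-002-uniswap-v2-base",
		"feature/v2/cache/P3-001-address-index",
		"feature/v2-prep",
		"fix/parser-bug",
	} {
		fmt.Printf("%-45s %v\n", name, ValidBranch(name))
	}
}
```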

**Example Workflow:**
```bash
# 1. Create feature branch
git checkout feature/v2-prep
git pull origin feature/v2-prep
git checkout -b feature/v2/parsers/P2-002-uniswap-v2-base

# 2. Implement task P2-002
# ... make changes ...

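# 3. Test with 100% coverage (REQUIRED)
# uniswap_v2.go is a file in pkg/parsers (not a directory), so test that package
go test ./pkg/parsers/... -coverprofile=coverage.out
go tool cover -func=coverage.out   # the "total:" line MUST show 100.0%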

# 4. Commit
git add .
git commit -m "feat(parsers): implement UniswapV2 parser base structure

- Created UniswapV2Parser struct with dependencies
- Implemented constructor with logger and cache injection
- Stubbed all interface methods
- Added 100% test coverage

Task: P2-002
Coverage: 100%
Tests: 15/15 passing

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>"

# 5. Push and create PR
git push -u origin feature/v2/parsers/P2-002-uniswap-v2-base

# 6. After merge, delete branch
git branch -d feature/v2/parsers/P2-002-uniswap-v2-base
```

### Commit Message Format
```
type(scope): brief description

- Detailed changes
- Why the change was needed
- Breaking changes or migration notes

Task: [TASK-ID from 07_TASK_BREAKDOWN.md]
Coverage: [100% REQUIRED]
Tests: [X/X passing - MUST be 100%]

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
```

**Types**: `feat`, `fix`, `perf`, `refactor`, `test`, `docs`, `build`, `ci`

**Examples:**
```bash
# Good commit
feat(parsers): implement UniswapV3 swap parsing

- Added ParseSwapEvent for V3 with signed amounts
- Implemented decimal scaling for token precision
- Added validation for sqrtPriceX96 and liquidity

Task: P2-011
Coverage: 100%
Tests: 23/23 passing

# Bad commit (missing task ID, coverage info)
fix: parser bug
```

## Important Notes

### What NOT to Do
- ❌ Modify V1 code in `orig/` (except for critical bugs)
- ❌ Start V2 implementation without reviewing planning docs
- ❌ Skip atomic task breakdown from `07_TASK_BREAKDOWN.md`
- ❌ Implement workarounds instead of fixing root causes
- ❌ Allow zero addresses or zero amounts to propagate

### What TO Do
- ✅ Read `docs/planning/00_V2_MASTER_PLAN.md` before starting
- ✅ Follow task breakdown in `07_TASK_BREAKDOWN.md`
- ✅ Write tests before implementation (TDD)
- ✅ Use strict validation at all layers
- ✅ Add comprehensive logging and metrics
- ✅ Fix root causes, not symptoms

## Key Files to Review

### Planning Documents
- `docs/planning/00_V2_MASTER_PLAN.md` - Complete V2 architecture
- `docs/planning/07_TASK_BREAKDOWN.md` - Atomic task list (99+ hours)
- `orig/README_V1.md` - V1 documentation and known issues

### V1 Reference Implementation
- `orig/pkg/events/parser.go` - Monolithic parser (to be replaced)
- `orig/pkg/monitor/concurrent.go` - Arbitrum monitor (to be enhanced)
- `orig/pkg/pools/discovery.go` - Pool discovery (cache to be multi-indexed)
- `orig/pkg/arbitrage/detection_engine.go` - Arbitrage detection (to be improved)

## Contact and Resources

- V2 Planning: `docs/planning/`
- V1 Reference: `orig/`
- Architecture diagrams: In `00_V2_MASTER_PLAN.md`
- Task breakdown: In `07_TASK_BREAKDOWN.md`

---

- **Current Phase**: V2 Planning
- **Next Step**: Begin Phase 1 implementation (Foundation)
- **Estimated Time**: 12-13 weeks for complete V2 implementation