MEV Bot Project Specification

Version: 2.0 (November 2025 - Current State)
Language: Go 1.24+
Target Chain: Arbitrum (Layer 2)
Module: github.com/fraktal/mev-beta
Codebase: 362 Go files, ~100,000+ lines of code

🎯 Project Overview

The MEV Bot is a production-grade arbitrage detection and analysis system for the Arbitrum network. It monitors decentralized exchanges (DEXs) in real-time using an event-driven architecture to identify profitable arbitrage opportunities across multiple protocols.

Core Capabilities

Real-time Arbitrum Monitoring with sub-second latency via event-driven processing
Multi-Protocol Support for Uniswap V2/V3, SushiSwap, Curve, Balancer, and more
Advanced Transaction Parsing with sophisticated ABI decoding for complex multicalls
Three-Pool RPC Architecture separating read-only, execution, and testing workloads
Worker Pool Processing for concurrent event analysis (100+ events/sec capacity)
Secure Key Management with AES-256-GCM encryption and hardware wallet support
Production Logging System with health scoring, analytics, and automated archival

🏗️ System Architecture

Layered Architecture (5 Layers)

┌─────────────────────────────────────────────────────────────┐
│  Layer 1: Smart Contract Layer                              │
│  - Arbitrage executor contracts (bindings/)                │
│  - Flash swap executors                                     │
│  - Token and pool interfaces                                │
└─────────────────────────────────────────────────────────────┘
                           ↓
┌─────────────────────────────────────────────────────────────┐
│  Layer 2: Execution Layer                                   │
│  - ArbitrageExecutor (pkg/arbitrage/executor.go - 1,641 LOC)│
│  - FlashSwapExecutor (pkg/arbitrage/flash_executor.go)     │
│  - LiveExecutionFramework (real-time execution)             │
└─────────────────────────────────────────────────────────────┘
                           ↓
┌─────────────────────────────────────────────────────────────┐
│  Layer 3: Detection & Analysis Layer                        │
│  - ArbitrageDetectionEngine (opportunity discovery)         │
│  - MultiHopScanner (multi-hop path finding)                 │
│  - Scanner with worker pools (event processing)             │
│  - DEX protocol implementations (6 protocols)               │
└─────────────────────────────────────────────────────────────┘
                           ↓
┌─────────────────────────────────────────────────────────────┐
│  Layer 4: Event Collection & Parsing Layer                  │
│  - ArbitrumMonitor (sequencer monitoring - 1,351 LOC)       │
│  - L2Parser (transaction parsing - 1,985 LOC)               │
│  - AbiDecoder (multicall decoding - 1,116 LOC)              │
│  - EventParser (log parsing - 1,806 LOC)                    │
└─────────────────────────────────────────────────────────────┘
                           ↓
┌─────────────────────────────────────────────────────────────┐
│  Layer 5: Infrastructure Layer                              │
│  - UnifiedProviderManager (3-pool RPC architecture)         │
│  - PoolDiscovery (cache-based pool management)              │
│  - KeyManager (secure signing - 1,841 LOC)                  │
│  - RateLimiter (per-endpoint limiting)                      │
└─────────────────────────────────────────────────────────────┘

Three-Pool RPC Architecture

The system uses three separate RPC endpoint pools for optimal performance:

UnifiedProviderManager
├─ ReadOnlyPool (50 RPS max)
│  └─ Used for: getBalance, call, getLogs, getCode
│  └─ High throughput for read-heavy operations
│
├─ ExecutionPool (20 RPS max)
│  └─ Used for: sendTransaction
│  └─ Reliable endpoints with lower limits
│
└─ TestingPool (10 RPS max)
   └─ Used for: simulation, callStatic
   └─ Isolated from production workload

Benefits:

Execution transactions never rate-limited by read operations
Independent failover per pool
Optimized rate limits per endpoint capability
Health checks and automatic endpoint rotation
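As a rough illustration of the routing idea, the sketch below maps a JSON-RPC method to one of the three pools. The type, constant, and function names are hypothetical and not taken from UnifiedProviderManager; the fallback to the read-only pool for unlisted methods is also an assumption. Only the RPS ceilings come from the diagram above.

```go
package main

import "fmt"

// Pool identifies one of the three endpoint pools. Names are illustrative.
type Pool string

const (
	ReadOnlyPool  Pool = "read_only"
	ExecutionPool Pool = "execution"
	TestingPool   Pool = "testing"
)

// maxRPS holds the per-pool ceilings quoted in the diagram above.
var maxRPS = map[Pool]int{ReadOnlyPool: 50, ExecutionPool: 20, TestingPool: 10}

// poolFor picks a pool for a JSON-RPC method. Simulations always go to the
// testing pool; anything that is not a send falls back to read-only
// (the fallback is an assumption).
func poolFor(method string, simulation bool) Pool {
	if simulation {
		return TestingPool
	}
	switch method {
	case "eth_sendTransaction", "eth_sendRawTransaction":
		return ExecutionPool
	default:
		return ReadOnlyPool
	}
}

func main() {
	for _, m := range []string{"eth_getLogs", "eth_sendRawTransaction"} {
		p := poolFor(m, false)
		fmt.Printf("%s -> %s (max %d RPS)\n", m, p, maxRPS[p])
	}
	p := poolFor("eth_call", true)
	fmt.Printf("eth_call (simulation) -> %s (max %d RPS)\n", p, maxRPS[p])
}
```

Keeping the routing decision in one small function means a burst of reads can exhaust only the read-only budget, which is the isolation property listed under Benefits.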

📊 Core Components

1. Arbitrage Service (`pkg/arbitrage/` - 17 files, 7,000+ LOC)

Primary Components:

ArbitrageService (service.go - 1,995 LOC) - Main orchestration service
ArbitrageExecutor (executor.go - 1,641 LOC) - Transaction execution
FlashSwapExecutor (flash_executor.go - 1,462 LOC) - Flash swap logic
MultiHopScanner (multihop.go - 892 LOC) - Multi-hop path detection
DetectionEngine (detection_engine.go - 953 LOC) - Opportunity discovery
LiveExecutionFramework (1,005 LOC) - Real-time execution
NonceManager (3,843 LOC) - Transaction nonce management
Database (13,129 LOC) - Opportunity persistence

Key Features:

Event-driven arbitrage detection
Multi-hop route optimization
Gas-aware profit calculation
Confidence scoring and risk assessment
Real-time opportunity ranking

2. Arbitrum Integration (`pkg/arbitrum/` - 34 files, 8,000+ LOC)

Primary Components:

L2Parser (l2_parser.go - 1,985 LOC) - Advanced transaction parsing
AbiDecoder (abi_decoder.go - 1,116 LOC) - Multicall decoding
Parser (parser.go - 967 LOC) - Basic transaction parsing
ConnectionManager (connection.go - 266 LOC) - RPC management
SwapPipeline (swap_pipeline.go - 844 LOC) - Swap processing
EventMonitor (event_monitor.go - 658 LOC) - Event monitoring

Capabilities:

Handles complex multicall transactions
Supports 10+ DEX router patterns
Extracts token addresses and swap amounts
~90% parsing success rate on production data
Graceful fallback for unknown patterns

3. Market Monitoring (`pkg/monitor/` - 1,351 LOC)

ArbitrumMonitor:

WebSocket subscription to Arbitrum sequencer
High-throughput transaction processing (50,000 buffer)
Automatic RPC failover and health monitoring
Rate limiting and connection management
Feeds parsed transactions to scanner

Performance:

Processing: ~3-4 blocks/second sustained
Latency: Sub-second block processing
Uptime: 27+ minutes continuous (validated)

4. Scanner System (`pkg/scanner/` - 5 subdirectories)

Architecture:

Scanner (concurrent.go)
├─ Worker Pool Pattern
│  ├─ Configurable worker count (4-8 default)
│  ├─ Non-blocking channel communication
│  └─ Graceful shutdown with WaitGroup
│
├─ MarketScanner (market/)
│  └─ Token pair and pool analysis
│
├─ SwapAnalyzer (swap/)
│  └─ Swap event detection and analysis
│
└─ LiquidityAnalyzer (analysis/)
   └─ Liquidity change calculations

Performance:

Throughput: 100+ events/second with 4-8 workers
Latency: ~10-50ms per event analysis
Concurrency: Independent worker processing
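The worker pool pattern described above can be sketched as follows. This is a minimal stand-in, not the code in concurrent.go: Event, processEvents, and the analyze callback are hypothetical names, and the buffered channel is one simple way to keep the producer from blocking.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// Event is a placeholder for a parsed swap or liquidity event.
type Event struct{ ID int }

// processEvents fans events out to a fixed number of workers and returns
// how many were analyzed. The channel is buffered to len(events), so the
// producer loop never blocks on a slow worker.
func processEvents(events []Event, workers int, analyze func(Event)) int64 {
	jobs := make(chan Event, len(events))
	var processed int64
	var wg sync.WaitGroup

	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for ev := range jobs { // exits when jobs is closed and drained
				analyze(ev)
				atomic.AddInt64(&processed, 1)
			}
		}()
	}

	for _, ev := range events {
		jobs <- ev
	}
	close(jobs) // signal shutdown
	wg.Wait()   // graceful shutdown: wait for in-flight work
	return processed
}

func main() {
	events := make([]Event, 100)
	n := processEvents(events, 4, func(Event) {})
	fmt.Println("processed:", n)
}
```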

5. DEX Protocol Support (`pkg/dex/` - 11 files)

Protocol	Implementation	Fee Structure	Math Type
Uniswap V3	uniswap_v3.go	0.05%-1%	Concentrated liquidity, tick-based
Uniswap V2	dex/	0.3%	Constant product (x×y=k)
SushiSwap	sushiswap.go	0.3%	V2-compatible
Curve	curve.go	0.04%	StableSwap invariant
Balancer	balancer.go	0.3%	Weighted pool formula
1inch	(referenced)	Variable	Aggregator support

Protocol-Specific Features:

V3: Tick-based price ranges with sqrt price math
V2: Classic AMM formula with fee deduction
Curve: Low-slippage stablecoin swaps
Balancer: Multi-token weighted pools
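For the V2 case, the constant-product rule with a 0.3% fee gives the well-known output formula out = in × 997 × reserveOut / (reserveIn × 1000 + in × 997), rounded down. The sketch below implements that formula with math/big. The function names are hypothetical and this is not the code in pkg/dex/.

```go
package main

import (
	"fmt"
	"math/big"
)

// mustBig parses a base-10 integer string (hypothetical helper).
func mustBig(s string) *big.Int {
	v, ok := new(big.Int).SetString(s, 10)
	if !ok {
		panic("invalid integer: " + s)
	}
	return v
}

// amountOutV2 applies x*y=k with a 0.3% fee taken from the input amount.
// All values are in the tokens' smallest units; the result rounds down.
func amountOutV2(amountIn, reserveIn, reserveOut *big.Int) *big.Int {
	inWithFee := new(big.Int).Mul(amountIn, big.NewInt(997))
	num := new(big.Int).Mul(inWithFee, reserveOut)
	den := new(big.Int).Mul(reserveIn, big.NewInt(1000))
	den.Add(den, inWithFee)
	if den.Sign() == 0 {
		return new(big.Int) // empty pool and zero input: nothing to swap
	}
	return num.Quo(num, den)
}

func main() {
	out := amountOutV2(mustBig("1000"), mustBig("1000000"), mustBig("1000000"))
	fmt.Println(out) // 996
}
```

Swapping 1,000 units into a balanced 1,000,000 / 1,000,000 pool returns 996 units: 3 are lost to the fee and about 1 to price impact.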

6. Security & Key Management (`pkg/security/` - 11 files, 5,000+ LOC)

Components:

KeyManager (keymanager.go - 1,841 LOC) - Secure key generation, storage, signing
RateLimiter (rate_limiter.go - 1,411 LOC) - DoS protection
AuditAnalyzer (audit_analyzer.go - 1,646 LOC) - Audit logging
PerformanceProfiler (1,316 LOC) - Performance metrics
AnomalyDetector (1,069 LOC) - Suspicious activity detection

Security Features:

AES-256-GCM encryption for private keys
Hardware wallet support
Automatic key rotation
Comprehensive audit logging
Rate limiting at multiple levels
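To make the AES-256-GCM item concrete, here is a minimal sketch using only the Go standard library. It is not the KeyManager implementation: seal, open, and newGCM are hypothetical names, the nonce-prefixed layout is an assumption, and how the bot turns MEV_BOT_ENCRYPTION_KEY into key bytes is not covered here.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
	"io"
)

// newGCM builds an AES-GCM AEAD and insists on a 32-byte (AES-256) key.
func newGCM(key []byte) (cipher.AEAD, error) {
	if len(key) != 32 {
		return nil, errors.New("AES-256 needs a 32-byte key")
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// seal encrypts plaintext and returns nonce || ciphertext || tag.
func seal(key, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := io.ReadFull(rand.Reader, nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// open reverses seal and fails if the data was modified.
func open(key, sealed []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	n := gcm.NonceSize()
	if len(sealed) < n {
		return nil, errors.New("ciphertext too short")
	}
	return gcm.Open(nil, sealed[:n], sealed[n:], nil)
}

func main() {
	key := make([]byte, 32)
	if _, err := io.ReadFull(rand.Reader, key); err != nil {
		panic(err)
	}
	sealed, _ := seal(key, []byte("example private key bytes"))
	plain, err := open(key, sealed)
	fmt.Println(string(plain), err)
}
```

GCM authenticates as well as encrypts, so a flipped bit in the stored key material makes open return an error instead of a corrupted key.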

🔄 Data Flow & Processing Pipeline

Complete Processing Flow

1. Arbitrum Block Stream (WebSocket)
        ↓
2. ArbitrumMonitor.Start()
   - Subscribes to new blocks
   - Fetches block transactions
        ↓
3. L2Parser.ParseTransaction()
   - Decodes multicall with AbiDecoder
   - Extracts function calls
   - Identifies swap operations
        ↓
4. EventParser.ParseEvents()
   - Decodes transaction receipt logs
   - Extracts swap/liquidity events
   - Parses pool state changes
        ↓
5. Scanner.ProcessEvents()
   - Dispatches to worker pool
   - MarketScanner analyzes token pairs
   - SwapAnalyzer detects arbitrage patterns
   - LiquidityAnalyzer calculates impacts
        ↓
6. ArbitrageService monitors results
   - MultiHopScanner finds optimal paths
   - DetectionEngine ranks opportunities
   - Filters by confidence and profitability
        ↓
7. ArbitrageExecutor.ExecuteArbitrage()
   - Simulates transaction
   - Estimates gas costs
   - Validates profitability
   - Signs with KeyManager
   - Submits to Arbitrum
        ↓
8. Results logged and persisted

Performance Characteristics

Latency Breakdown (Block → Detection):

1. Receive block:              ~1ms
2. Fetch transaction:          ~50-100ms (RPC)
3. Fetch receipt:              ~50-100ms (RPC)
4. Parse transaction (ABI):    ~10-50ms (CPU)
5. Parse events:               ~5-20ms (CPU)
6. Analyze events (scanner):   ~10-50ms (CPU)
7. Detect arbitrage:           ~20-100ms (CPU + RPC)
─────────────────────────────────────────────
Total: ~150-450ms from block to detection

Observation: RPC calls dominate latency, not CPU processing.
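The breakdown can be checked by summing the per-stage bounds. The sketch below does that arithmetic; the struct and function names are hypothetical, and the mixed "detect arbitrage" stage is deliberately not counted as RPC. The stage bounds sum to 146-421ms, in line with the rounded total above, and the two pure RPC fetches account for 100-200ms of it.

```go
package main

import "fmt"

// stage holds the latency bounds of one pipeline step, in milliseconds.
type stage struct {
	name         string
	minMs, maxMs int
	rpc          bool // true for steps that are pure RPC round trips
}

// Values copied from the latency breakdown above.
var stages = []stage{
	{"receive block", 1, 1, false},
	{"fetch transaction", 50, 100, true},
	{"fetch receipt", 50, 100, true},
	{"parse transaction", 10, 50, false},
	{"parse events", 5, 20, false},
	{"analyze events", 10, 50, false},
	{"detect arbitrage", 20, 100, false}, // mixed CPU + RPC, not counted as RPC
}

// totals sums the bounds, optionally restricted to pure RPC stages.
func totals(ss []stage, onlyRPC bool) (lo, hi int) {
	for _, s := range ss {
		if onlyRPC && !s.rpc {
			continue
		}
		lo += s.minMs
		hi += s.maxMs
	}
	return lo, hi
}

func main() {
	lo, hi := totals(stages, false)
	rpcLo, rpcHi := totals(stages, true)
	fmt.Printf("total: %d-%dms, pure RPC: %d-%dms\n", lo, hi, rpcLo, rpcHi)
}
```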

⚙️ Configuration Management

Configuration Hierarchy

1. YAML Configuration Files (Base)
   ├─ config/arbitrum_production.yaml  (tokens, DEX configs)
   ├─ config/providers.yaml            (RPC endpoint pools)
   └─ config/providers_runtime.yaml    (runtime overrides)

2. Environment Variables (Override)
   ├─ GO_ENV (development|staging|production)
   ├─ MEV_BOT_ENCRYPTION_KEY (required)
   ├─ ARBITRUM_RPC_ENDPOINT
   ├─ ARBITRUM_WS_ENDPOINT
   └─ LOG_LEVEL, DEBUG, METRICS_ENABLED

3. Runtime Configuration (Programmatic)
   ├─ Per-endpoint overrides
   └─ Dynamic endpoint switching
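
One way to read the hierarchy is "the most specific non-empty layer wins". The sketch below encodes that reading. The resolve function is hypothetical, and the precedence order (runtime over environment over YAML) is an assumption inferred from the numbering above rather than confirmed loader behavior.

```go
package main

import (
	"fmt"
	"os"
)

// resolve returns the first non-empty value in precedence order:
// runtime override, then environment variable, then YAML base value.
func resolve(runtime, env, yaml string) string {
	for _, v := range []string{runtime, env, yaml} {
		if v != "" {
			return v
		}
	}
	return ""
}

func main() {
	// Base value as it might appear in config/providers.yaml.
	yamlEndpoint := "https://arb1.arbitrum.io/rpc"
	// Environment override, empty when the variable is unset.
	envEndpoint := os.Getenv("ARBITRUM_RPC_ENDPOINT")

	fmt.Println(resolve("", envEndpoint, yamlEndpoint))
}
```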

Production Configuration Example

config/arbitrum_production.yaml:

tokens:
  weth:
    address: "0x82aF49447D8a07e3bd95BD0d56f35241523fBab1"
    decimals: 18
    coingecko_id: "weth"
  usdc:
    address: "0xaf88d065e77c8cC2239327C5EDb3A432268e5831"
    decimals: 6
    is_stable: true
  # 20+ major tokens defined

dex_configs:
  uniswap_v3:
    factory: "0x1F98431c8aD98523631AE4a59f267346ea31F984"
    router: "0xE592427A0AEce92De3Edee1F18E0157C05861564"
    fee_tiers: [500, 3000, 10000]

arbitrage:
  min_profit_threshold: "0.001"  # 0.1%
  max_slippage: "0.005"          # 0.5%
  max_gas_price: "50000000000"   # 50 gwei
  max_position_size: "100000000000000000000"  # 100 ETH

config/providers.yaml:

read_only_pool:
  endpoints:
    - url: "https://arbitrum-mainnet.core.chainstack.com/..."
      name: "chainstack-primary"
      priority: 1
      max_rps: 50
      timeout: "10s"
    - url: "https://arb1.arbitrum.io/rpc"
      name: "arbitrum-public"
      priority: 2
      max_rps: 30

execution_pool:
  endpoints:
    - url: "https://arbitrum-mainnet.core.chainstack.com/..."
      priority: 1
      max_rps: 20

testing_pool:
  endpoints:
    - url: "https://arbitrum-mainnet.core.chainstack.com/..."
      priority: 1
      max_rps: 10

📈 Production Status & Performance

Current Implementation Status

✅ Production Ready:

Real-time transaction parsing (~90% success rate)
Event processing (100+ events/sec)
Multi-protocol support (6 DEX protocols)
Rate limiting and failover
Secure key management
Production logging with health scoring

⚠️ Partially Disabled (Workarounds Active):

Pool discovery background task (uses cache-only, 314 pools loaded)
Security manager (KeyManager works independently)

❌ Not Implemented:

MEV protection (Flashbots, MEV-Share)
Multi-chain support (Arbitrum only)
Persistent opportunity database
Machine learning-based detection

Performance Metrics (Validated)

Metric	Value	Source
Startup Time	~30 seconds	With pool cache
Event Processing	100+ events/sec	Worker pool capacity
Detection Latency	150-450ms	Block to opportunity
Memory Baseline	~200MB	Pool cache + state
Memory Peak	~500MB	Full operation
Health Score	97.97/100	Log analytics system
Error Rate	2.03%	Log analysis
Parsing Success	~90%	Transaction decoding
Uptime	27+ minutes	Validated continuous

System Requirements

Minimum:

CPU: 2+ cores for concurrent processing
RAM: 4GB+ for transaction buffering
Network: Stable WebSocket connection
Storage: 10GB+ for logs

Recommended:

CPU: 4+ cores for optimal worker pools
RAM: 8GB+ for larger pool cache
Network: Multiple RPC providers for redundancy
Storage: 50GB+ for long-term logging

🔬 Testing Infrastructure

Test Organization

tests/
├── integration/
│   ├── fork_test.go          # Arbitrum fork testing
│   └── [other tests]
├── cache/                     # Cache-related tests
├── contracts/                 # Contract interaction tests
└── scenarios/                 # Test scenarios

pkg/**/..._test.go             # Unit tests colocated with source

Test Coverage

Unit Tests:

arbitrage/: flash_executor_test.go, multihop_test.go
arbitrum/: connection_test.go, parser_test.go, abi_fuzz_test.go
scanner/: concurrent_test.go
security/: keymanager_test.go
validation/: pool_validator_test.go (1,155 lines)

Integration Tests:

End-to-end transaction processing
Multi-protocol detection accuracy
Cross-protocol arbitrage detection

Build & Test Commands:

make build              # Compile binary
make test               # Run all tests
make test-coverage      # Generate coverage report
make test-integration   # Integration tests only
make lint               # Run golangci-lint
make security-scan      # Security analysis (gosec)

🚀 Deployment Guide

Prerequisites

# 1. Go 1.24 or later
go version

# 2. Create encryption key (32 hex characters, i.e. 16 random bytes)
openssl rand -hex 16 > .env.encryption_key

# 3. Setup keystore
mkdir -p keystore
chmod 700 keystore

# 4. Configure environment
export GO_ENV=production
export MEV_BOT_ENCRYPTION_KEY=$(cat .env.encryption_key)
export ARBITRUM_RPC_ENDPOINT="wss://your-endpoint"

Quick Start

# Build the binary
make build

# Run the bot
./bin/mev-bot start

# Or with explicit config
GO_ENV=production ./bin/mev-bot start

Production Deployment

1. Configuration:

# Copy and customize production configs
cp config/arbitrum_production.yaml config/arbitrum_production.yaml.local
cp config/providers.yaml config/providers.yaml.local

# Edit with actual RPC endpoints and API keys
vim config/arbitrum_production.yaml.local
vim config/providers.yaml.local

2. Environment Setup:

# Create .env.production file
cat > .env.production <<EOF
GO_ENV=production
MEV_BOT_ENCRYPTION_KEY=<your-32-byte-hex-key>
MEV_BOT_KEYSTORE_PATH=keystore
ARBITRUM_RPC_ENDPOINT=wss://...
ARBITRUM_WS_ENDPOINT=wss://...
LOG_LEVEL=info
METRICS_ENABLED=true
EOF

3. Start Service:

# Load environment (exporting each variable to child processes) and start
set -a
source .env.production
set +a
./bin/mev-bot start

Monitoring & Health Checks

Production Logging System:

logs/
├── mev_bot.log                # Main application log
├── mev_bot_errors.log         # Error-specific log
├── mev_bot_performance.log    # Performance metrics
├── analytics/                 # Real-time analysis
│   ├── analysis_*.json        # Comprehensive metrics
│   └── dashboard_*.html       # Operations dashboard
├── health/                    # Health monitoring
│   └── health_*.json          # Health reports (97.97/100)
├── archives/                  # Compressed rotated logs
└── rotated/                   # Rotated log files

Health Check Commands:

# Real-time analysis with health scoring
./scripts/log-manager.sh analyze

# Check system health
./scripts/log-manager.sh health

# Full management cycle
./scripts/log-manager.sh full

# Start background monitoring daemon
./scripts/log-manager.sh start-daemon

Alert Thresholds:

Error rate > 10% = Critical
Health score < 80 = Warning
Zero opportunities detected for >1 hour = Investigation needed
Memory usage > 750MB = Pool pruning required
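The thresholds above translate directly into a small check. The sketch below is illustrative only: Snapshot and alerts are hypothetical names and are not part of scripts/log-manager.sh.

```go
package main

import (
	"fmt"
	"time"
)

// Snapshot is a hypothetical view of the metrics the thresholds refer to.
type Snapshot struct {
	ErrorRatePct     float64
	HealthScore      float64
	MemoryMB         float64
	SinceOpportunity time.Duration
}

// alerts returns one message per threshold that the snapshot crosses.
func alerts(s Snapshot) []string {
	var out []string
	if s.ErrorRatePct > 10 {
		out = append(out, "critical: error rate above 10%")
	}
	if s.HealthScore < 80 {
		out = append(out, "warning: health score below 80")
	}
	if s.SinceOpportunity > time.Hour {
		out = append(out, "investigate: no opportunities for over 1 hour")
	}
	if s.MemoryMB > 750 {
		out = append(out, "action: memory above 750MB, prune pools")
	}
	return out
}

func main() {
	healthy := Snapshot{ErrorRatePct: 2.03, HealthScore: 97.97, MemoryMB: 200}
	fmt.Println(len(alerts(healthy)), "alerts")

	degraded := Snapshot{ErrorRatePct: 12, HealthScore: 70, MemoryMB: 800,
		SinceOpportunity: 2 * time.Hour}
	for _, a := range alerts(degraded) {
		fmt.Println(a)
	}
}
```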

🔒 Security Considerations

Production Security

Key Management:

AES-256-GCM encryption for all private keys
Secure key derivation from master password
Automatic key rotation support
Hardware wallet integration ready

Input Validation:

All external data validated before processing
Token address validation (checksum, zero-address checks)
Amount bounds checking (overflow protection)
Gas price limits (max 50 gwei default)
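A minimal sketch of the address checks, using only the standard library, might look like the following. validateAddress is a hypothetical name. The EIP-55 mixed-case checksum is deliberately left out because it needs Keccak-256, which is not in the Go standard library, so this covers only the format and zero-address checks.

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
	"strings"
)

// validateAddress checks the textual format of an address and rejects the
// zero address. It does NOT verify the EIP-55 checksum.
func validateAddress(addr string) error {
	if len(addr) != 42 || !strings.HasPrefix(addr, "0x") {
		return errors.New("address must be 0x followed by 40 hex characters")
	}
	raw, err := hex.DecodeString(addr[2:])
	if err != nil {
		return errors.New("address contains non-hex characters")
	}
	for _, b := range raw {
		if b != 0 {
			return nil // at least one non-zero byte: not the zero address
		}
	}
	return errors.New("zero address is not allowed")
}

func main() {
	for _, a := range []string{
		"0x82aF49447D8a07e3bd95BD0d56f35241523fBab1", // WETH entry from the config example
		"0x0000000000000000000000000000000000000000",
		"0x1234",
	} {
		fmt.Printf("%s -> %v\n", a, validateAddress(a))
	}
}
```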

Rate Limiting:

Per-endpoint rate limits (configurable)
Global transaction rate limiting
Burst allowances for spike handling
Automatic backoff on 429 responses
Circuit breakers on repeated failures

Risk Management

Execution Safeguards:

Configurable slippage protection (0.5% default max)
Maximum transaction value limits (100 ETH default)
Profit validation after gas costs
Simulation before actual execution
Confidence scoring (0.0-1.0 scale)
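The safeguards above can be combined into a single pre-execution check, sketched below. Trade, Limits, validate, and netProfit are hypothetical names and not the ArbitrageExecutor API. The gas price, position, and slippage limits mirror the defaults quoted in this document, while the 0.7 confidence cutoff is purely an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"math/big"
)

// Limits mirrors the documented defaults; MinConfidence is hypothetical.
type Limits struct {
	MaxGasPrice, MaxPosition *big.Int // wei
	MaxSlippageBps           int64    // 50 bps = 0.5%
	MinConfidence            float64  // cutoff on the 0.0-1.0 scale
}

// Trade describes one candidate execution. Amounts are in wei.
type Trade struct {
	Size, GrossProfit, GasPrice *big.Int
	GasUsed                     uint64
	SlippageBps                 int64
	Confidence                  float64
}

func mustBig(s string) *big.Int {
	v, ok := new(big.Int).SetString(s, 10)
	if !ok {
		panic("invalid integer: " + s)
	}
	return v
}

var defaults = Limits{
	MaxGasPrice:    mustBig("50000000000"),           // 50 gwei
	MaxPosition:    mustBig("100000000000000000000"), // 100 ETH
	MaxSlippageBps: 50,
	MinConfidence:  0.7, // assumption, not from the document
}

// netProfit is gross profit minus gasUsed * gasPrice.
func netProfit(t Trade) *big.Int {
	gasCost := new(big.Int).Mul(t.GasPrice, new(big.Int).SetUint64(t.GasUsed))
	return new(big.Int).Sub(t.GrossProfit, gasCost)
}

// validate returns the first safeguard the trade violates, or nil.
func validate(t Trade, l Limits) error {
	switch {
	case t.GasPrice.Cmp(l.MaxGasPrice) > 0:
		return errors.New("gas price above limit")
	case t.Size.Cmp(l.MaxPosition) > 0:
		return errors.New("position size above limit")
	case t.SlippageBps > l.MaxSlippageBps:
		return errors.New("slippage above limit")
	case t.Confidence < l.MinConfidence:
		return errors.New("confidence too low")
	case netProfit(t).Sign() <= 0:
		return errors.New("not profitable after gas")
	}
	return nil
}

func main() {
	t := Trade{
		Size:        mustBig("1000000000000000000"), // 1 ETH
		GrossProfit: mustBig("2000000000000000"),    // 0.002 ETH
		GasPrice:    mustBig("100000000"),           // 0.1 gwei
		GasUsed:     150000,
		SlippageBps: 30,
		Confidence:  0.9,
	}
	fmt.Println(netProfit(t), validate(t, defaults))
}
```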

Error Handling:

Comprehensive error handling at all layers
Automatic retry with exponential backoff
Fallback RPC providers
Graceful degradation on failures
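Retry with exponential backoff can be sketched in a few lines. The retry function and errTransient are hypothetical names, and the doubling schedule without jitter or a cap is a simplification, not necessarily what the bot does.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTransient stands in for a retryable failure such as an RPC timeout.
var errTransient = errors.New("transient RPC failure")

// retry calls op up to attempts times, sleeping base, 2*base, 4*base, ...
// between failed tries. It returns nil on the first success.
func retry(attempts int, base time.Duration, op func() error) error {
	if attempts < 1 {
		attempts = 1
	}
	var err error
	delay := base
	for i := 0; i < attempts; i++ {
		if err = op(); err == nil {
			return nil
		}
		if i < attempts-1 { // no sleep after the final failure
			time.Sleep(delay)
			delay *= 2
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", attempts, err)
}

func main() {
	calls := 0
	err := retry(5, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errTransient // fail twice, then succeed
		}
		return nil
	})
	fmt.Println("calls:", calls, "err:", err)
}
```

A production version would typically add jitter and a maximum delay, and hand over to a fallback provider once the attempts are exhausted.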

📝 Known Limitations & Future Enhancements

Current Limitations

1. Pool Discovery:

Background discovery disabled (prevents startup hang)
Relies on cached pool data (314 pools)
No automatic new pool detection
Workaround: Manual cache updates or restart

2. Security Manager:

Comprehensive security manager disabled for debugging
KeyManager works independently
Missing some advanced security features

3. MEV Protection:

No Flashbots integration
No MEV-Share participation
Transactions visible on public mempool
Vulnerable to sandwich attacks

4. Single Chain:

Arbitrum only (no Ethereum, Optimism, Base, etc.)
No cross-chain arbitrage
No bridge monitoring

5. In-Memory State:

No persistent opportunity database
Restarts lose historical context
Limited long-term analytics

Planned Enhancements

High Priority:

Re-enable pool discovery (fix hang issue)
Re-enable security manager (identify and fix cause)
Add persistent PostgreSQL database
Implement MEV protection (Flashbots)
Add Prometheus metrics export

Medium Priority:

Multi-chain support (Optimism, Base)
Flash loan integration (capital-free arbitrage)
Machine learning opportunity prediction
Advanced gas optimization
WebSocket dashboard

Low Priority:

MEV-Share integration
Cross-chain bridge monitoring
Collaborative MEV strategies
Historical replay capability

📚 Documentation

Documentation Structure

docs/
├── CODEBASE_EXPLORATION_COMPLETE.md    # Complete codebase analysis
├── IMPLEMENTATION_INSIGHTS.md          # What code actually does
├── CODEBASE_QUICK_REFERENCE.md         # Quick reference guide
├── CODEBASE_EXPLORATION_INDEX.md       # Navigation index
├── DEVELOPER_DOCS.md                   # Developer documentation
├── MONITORING_GUIDE.md                 # Monitoring and operations
├── QUICK_START.md                      # Quick start guide
└── [100+ additional docs]              # Historical and specialized docs

Key Documentation Files

For New Developers:

CODEBASE_QUICK_REFERENCE.md - Start here
CODEBASE_EXPLORATION_COMPLETE.md - Deep dive
DEVELOPER_DOCS.md - Development guidelines

For Operations:

MONITORING_GUIDE.md - Production monitoring
Log manager scripts (scripts/log-manager.sh)
Health check procedures

For Understanding Architecture:

IMPLEMENTATION_INSIGHTS.md - Reality vs documentation
CODEBASE_EXPLORATION_INDEX.md - Component navigation
This specification (PROJECT_SPECIFICATION.md)

🎯 Getting Started

For Developers

1. Understand the codebase:

# Read these in order:
cat docs/CODEBASE_QUICK_REFERENCE.md
cat docs/IMPLEMENTATION_INSIGHTS.md
cat docs/CODEBASE_EXPLORATION_COMPLETE.md

2. Build and test:

make build
make test

3. Run in development:

export GO_ENV=development
export MEV_BOT_ENCRYPTION_KEY=$(openssl rand -hex 16)
./bin/mev-bot start

For Operations

1. Deploy to production:

# Follow deployment guide above (set -a exports the sourced variables)
set -a
source .env.production
set +a
./bin/mev-bot start

2. Monitor health:

# Check health score (target: >95)
./scripts/log-manager.sh health

# Real-time monitoring
./scripts/log-manager.sh start-daemon

3. Troubleshoot issues:

# Analyze logs
./scripts/log-manager.sh analyze

# View latest errors
tail -100 logs/mev_bot_errors.log

# Trace startup progress via the debug checkpoints logged from main.go (20 total)
grep "CHECKPOINT" logs/mev_bot.log

📊 Performance Expectations

MEV Profit Expectations (Arbitrum Realistic)

Based on current market conditions:

Arbitrage Frequency: 5-20 opportunities per day (market dependent)
Profit per Trade: 0.1-0.5% typical ($1-$5 on $1,000 capital)
Daily Target: $10-$200 with moderate capital and optimal conditions
Time to First Detection: ~30 seconds from startup
Time to First Opportunity: 30-60 minutes (market dependent)

Note: These are detection rates. Actual execution profits depend on:

Gas costs (50,000-150,000 gas per execution)
Slippage during execution
Competition from other MEV bots
Market volatility

🔗 External Dependencies

Go Module Dependencies

Primary:

github.com/ethereum/go-ethereum v1.16.3 (Ethereum client library)
github.com/gorilla/websocket v1.5.3 (WebSocket support)
github.com/holiman/uint256 v1.3.2 (256-bit integers)
github.com/urfave/cli/v2 v2.27.5 (CLI framework)
gopkg.in/yaml.v3 (YAML parsing)

Database:

github.com/lib/pq v1.10.9 (PostgreSQL - optional)
github.com/mattn/go-sqlite3 v1.14.32 (SQLite - optional)

Security:

golang.org/x/crypto v0.42.0 (Cryptography)
golang.org/x/time v0.10.0 (Rate limiting)

Testing:

github.com/stretchr/testify v1.11.1 (Test assertions)

Smart Contract Dependencies

Generated Bindings (bindings/):

Arbitrage Executor contract
Flash Swap contracts (Uniswap V2/V3)
ERC20 token interface
Uniswap V3 Pool interface
Balancer Vault interface

📌 Summary

The MEV Bot is a sophisticated, production-grade system with:

✓ Strengths:

Modular, testable architecture (5 layers, 47 packages)
Production-ready security infrastructure
Multi-protocol DEX support (6 protocols)
Intelligent rate limiting and failover
Robust error handling and recovery
Real-time health monitoring (97.97/100 score)
Comprehensive logging and analytics

⚠️ Pragmatic Limitations:

Pool discovery disabled (uses cache: 314 pools)
Security manager disabled (KeyManager works)
No MEV protection (public mempool)
Single-chain only (Arbitrum)
In-memory state (no persistence)

Status: Ready for production with current architecture (cache-based pools, independent KeyManager). Some advanced features disabled pending fixes (pool discovery, security manager).

Recommended Use: Detection and analysis system. Execution capability exists but needs careful testing before live trading.

Last Updated: November 2025
Documentation Version: 2.0 (reflects actual codebase state)
Codebase Version: See git commit history for changes
