MEV Bot L2 Deployment Guide

🚀 Production Deployment - Arbitrum L2 Optimized

This guide covers deploying the MEV bot with full L2 message processing capabilities for maximum competitive advantage on Arbitrum.

Prerequisites

System Requirements

  • CPU: 8+ cores (16+ recommended for high-frequency L2 processing)
  • RAM: 16GB minimum (32GB+ recommended)
  • Storage: 100GB SSD (fast I/O for L2 message processing)
  • Network: High-bandwidth, low-latency connection to Arbitrum nodes

Required Services

  • Arbitrum RPC Provider: Alchemy, Infura, or QuickNode (WebSocket support required)
  • Monitoring: Prometheus + Grafana (optional but recommended)
  • Alerting: Slack/Discord webhooks for real-time alerts

Quick Start

1. Clone and Build

git clone https://github.com/your-org/mev-beta.git
cd mev-beta
go build -o mev-bot ./cmd/mev-bot/main.go

2. Environment Setup

# Create environment file
cat > .env << EOF
# Arbitrum L2 Configuration
ARBITRUM_RPC_ENDPOINT="wss://arb-mainnet.g.alchemy.com/v2/YOUR_ALCHEMY_KEY"
ARBITRUM_WS_ENDPOINT="wss://arb-mainnet.g.alchemy.com/v2/YOUR_ALCHEMY_KEY"

# Performance Tuning
BOT_MAX_WORKERS=25
BOT_CHANNEL_BUFFER_SIZE=1000
BOT_POLLING_INTERVAL=0.25

# Monitoring
METRICS_ENABLED=true
METRICS_PORT=9090

# Key storage locations
MEV_BOT_KEYSTORE_PATH=keystore/production
MEV_BOT_AUDIT_LOG=logs/production_audit.log
MEV_BOT_BACKUP_PATH=backups/production

# Alerting
SLACK_WEBHOOK="https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK"
DISCORD_WEBHOOK="https://discord.com/api/webhooks/YOUR/DISCORD/WEBHOOK"
EOF

Tip: For a ready-to-use smoke test profile, source env/smoke.env. The sample file seeds a compliant encryption key, keystore paths, and metrics defaults so ./mev-bot start can boot locally without exposing production secrets. Replace the placeholder RPC endpoints before connecting to real infrastructure.

Ensure the paths in MEV_BOT_KEYSTORE_PATH, MEV_BOT_AUDIT_LOG, and MEV_BOT_BACKUP_PATH exist on the host; the helper scripts (scripts/run.sh, env/smoke.env) create sane defaults under keystore/, logs/, and backups/ if they are missing.
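If you are not using the helper scripts, the directories can be created by hand. The following is a minimal sketch, not the project's own tooling: the function name `ensure_runtime_paths` is hypothetical, it assumes the keystore and backup variables name directories while the audit log variable names a file, and the `chmod 700` on the keystore is a suggested hardening step rather than something the guide prescribes.

```shell
# Hypothetical helper: create the locations referenced by the three path variables.
# Defaults mirror the values in the example .env file.
ensure_runtime_paths() {
    keystore="${MEV_BOT_KEYSTORE_PATH:-keystore/production}"
    audit_log="${MEV_BOT_AUDIT_LOG:-logs/production_audit.log}"
    backups="${MEV_BOT_BACKUP_PATH:-backups/production}"

    # The audit log is a file, so only its parent directory is created.
    mkdir -p "$keystore" "$backups" "$(dirname "$audit_log")" || return 1

    # Assumption: keystore material should be readable by the owner only.
    chmod 700 "$keystore"
}
```

Call `ensure_runtime_paths` after loading `.env` and before starting the bot.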

3. Production Configuration

# Copy production config
cp config/config.production.yaml config/config.yaml

# Update with your API keys
sed -i 's/YOUR_ALCHEMY_KEY/your-actual-key/g' config/config.yaml
sed -i 's/YOUR_INFURA_KEY/your-actual-key/g' config/config.yaml

4. Start the Bot

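# With environment variables (set -a exports everything sourced from .env to the bot process)
set -a && source .env && set +a && ./mev-bot start

# Or with inline environment variables
METRICS_ENABLED=true ./mev-bot start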
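
Because the bot takes its endpoints from the environment, a pre-flight check can catch unset variables or leftover `YOUR_...` placeholders before startup. This is a sketch under assumptions: `require_env` is a hypothetical helper, not part of the bot, and which variables count as required is a choice for your deployment.

```shell
# Hypothetical pre-flight check: returns non-zero when any named variable is
# empty, unset, or still holds a placeholder from the example .env file.
require_env() {
    missing=0
    for name in "$@"; do
        # Indirect variable lookup that works in POSIX sh.
        eval "value=\${$name:-}"
        case "$value" in
            "")
                echo "missing: $name" >&2
                missing=1 ;;
            *YOUR[_/]*)
                echo "placeholder still set: $name" >&2
                missing=1 ;;
        esac
    done
    return "$missing"
}

# Usage (not executed here):
#   set -a; . ./.env; set +a
#   require_env ARBITRUM_RPC_ENDPOINT ARBITRUM_WS_ENDPOINT && ./mev-bot start
```

Only pass trusted variable names to the helper, since it uses `eval` for the lookup.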

Advanced Deployment

Docker Deployment

# Dockerfile
FROM golang:1.24-alpine AS builder
WORKDIR /app
COPY . .
RUN go build -o mev-bot ./cmd/mev-bot/main.go

FROM alpine:latest
# curl is required by the docker-compose health check below
RUN apk --no-cache add ca-certificates curl
# /app matches the ./config:/app/config mount in docker-compose.production.yml
WORKDIR /app
COPY --from=builder /app/mev-bot .
COPY --from=builder /app/config ./config
CMD ["./mev-bot", "start"]

# Build and run
docker build -t mev-bot .
docker run -d \
  --name mev-bot-production \
  -p 9090:9090 \
  -p 8080:8080 \
  -e ARBITRUM_RPC_ENDPOINT="wss://your-endpoint" \
  -e METRICS_ENABLED=true \
  -v "$(pwd)/data:/data" \
  -v "$(pwd)/logs:/var/log/mev-bot" \
  mev-bot

# docker-compose.production.yml
version: '3.8'

services:
  mev-bot:
    build: .
    container_name: mev-bot-l2
    restart: unless-stopped
    ports:
      - "9090:9090"  # Metrics
      - "8080:8080"  # Health checks
    environment:
      - ARBITRUM_RPC_ENDPOINT=${ARBITRUM_RPC_ENDPOINT}
      - ARBITRUM_WS_ENDPOINT=${ARBITRUM_WS_ENDPOINT}
      - BOT_MAX_WORKERS=25
      - BOT_CHANNEL_BUFFER_SIZE=1000
      - METRICS_ENABLED=true
      - LOG_LEVEL=info
    volumes:
      - ./data:/data
      - ./logs:/var/log/mev-bot
      - ./config:/app/config
    networks:
      - mev-network
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 3

  prometheus:
    image: prom/prometheus:latest
    container_name: mev-prometheus
    restart: unless-stopped
    ports:
      - "9091:9090"
    volumes:
      - ./monitoring/prometheus.yml:/etc/prometheus/prometheus.yml
      - ./monitoring/alerts.yml:/etc/prometheus/alerts.yml
      - prometheus_data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--web.console.libraries=/etc/prometheus/console_libraries'
      - '--web.console.templates=/etc/prometheus/consoles'
    networks:
      - mev-network

  grafana:
    image: grafana/grafana:latest
    container_name: mev-grafana
    restart: unless-stopped
    ports:
      - "3000:3000"
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_ADMIN_PASSWORD}  # set in .env; do not ship the default "admin"
    volumes:
      - grafana_data:/var/lib/grafana
      - ./monitoring/grafana-dashboard.json:/etc/grafana/provisioning/dashboards/mev-dashboard.json
    networks:
      - mev-network

volumes:
  prometheus_data:
  grafana_data:

networks:
  mev-network:
    driver: bridge

Monitoring Setup

Prometheus Configuration

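# monitoring/prometheus.yml
global:
  scrape_interval: 15s

# Load the alert rules; alerts.yml must be mounted at this path inside the container
rule_files:
  - /etc/prometheus/alerts.yml

scrape_configs:
  - job_name: 'mev-bot'
    static_configs:
      - targets: ['mev-bot:9090']
    scrape_interval: 5s
    metrics_path: '/metrics/prometheus'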

Prometheus loads alert rules from monitoring/alerts.yml to enforce profitability guardrails:

# monitoring/alerts.yml
groups:
  - name: mev-bot-alerts
    rules:
      - alert: MEVBotHighErrorRate
        expr: mev_bot_trade_error_rate > 0.25
        for: 10m
        labels: { severity: critical }
        annotations:
          summary: MEV bot trade error rate is above 25%
          description: Error rate exceeded SLO for 10 minutes; check RPC health and contract execution.

      - alert: MEVBotDegradedProfitFactor
        expr: mev_bot_profit_factor < 1
        for: 15m
        labels: { severity: warning }
        annotations:
          summary: MEV bot profit factor dropped below 1
          description: Profit factor stayed below breakeven (1.0) for 15 minutes; review gas strategy.

Reload Prometheus after updating either prometheus.yml or alerts.yml so the changes take effect.

Grafana Dashboard

{
  "dashboard": {
    "title": "MEV Bot L2 Monitoring",
    "panels": [
      {
        "title": "L2 Messages/Second",
        "type": "graph",
        "targets": [
          {
            "expr": "mev_bot_l2_messages_per_second"
          }
        ]
      },
      {
        "title": "Net Profit (ETH)",
        "type": "stat",
        "targets": [
          {
            "expr": "mev_bot_net_profit_eth"
          }
        ]
      },
      {
        "title": "Error Rate",
        "type": "graph",
        "targets": [
          {
            "expr": "mev_bot_error_rate"
          }
        ]
      }
    ]
  }
}

Profitability Monitoring & Simulation

  • Key Prometheus metrics exposed at /metrics/prometheus:
    • mev_bot_net_profit_eth, mev_bot_total_profit_eth, mev_bot_gas_spent_eth
    • mev_bot_trade_error_rate, mev_bot_processing_latency_ms, mev_bot_successful_trades
    • Track these in Grafana to monitor hit rate, latency, and cumulative P&L during deployments.
  • Historical replay harness:
    • Run make simulate-profit (or ./scripts/run_profit_simulation.sh <report-dir>) to analyse bundled vectors under tools/simulation/vectors/.
    • The CLI produces JSON and Markdown reports in reports/simulation/latest/ summarising hit rate, gas burn, and per-exchange profitability.
  • Runbook checklist:
    1. Execute the profitability simulation ahead of staging/production releases and attach the Markdown summary to change records.
    2. During rollout, alert if mev_bot_trade_error_rate exceeds 0.25 for more than 10 minutes or if mev_bot_net_profit_eth trends negative across a 15-minute window.
    3. Archive both math audit (reports/math/latest/) and simulation (reports/simulation/latest/) artifacts with deployment notes.
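
For a quick manual spot check during rollout, the error-rate threshold from the checklist can be tested against a single scrape. The sketch below is illustrative only: `metric_value` and `metric_exceeds` are hypothetical helpers, they assume the endpoint returns the Prometheus text format with unlabelled samples, and they look at one point in time. The 10-minute and 15-minute windows still need Prometheus alert rules.

```shell
# Hypothetical helper: read Prometheus text-format output on stdin and print
# the value of one unlabelled metric sample. Fails if the metric is absent.
metric_value() {
    awk -v name="$1" '
        $1 == name { print $2; found = 1; exit }
        END { if (!found) exit 1 }
    '
}

# Exit 0 when the metric is strictly above the threshold, 1 when it is not,
# and 2 when the metric could not be found.
metric_exceeds() {
    value=$(metric_value "$1") || return 2
    awk -v v="$value" -v t="$2" 'BEGIN { exit !(v + 0 > t + 0) }'
}

# Usage (not executed here):
#   curl -s http://localhost:9090/metrics/prometheus \
#     | metric_exceeds mev_bot_trade_error_rate 0.25 && echo "error rate above 25%"
```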

Performance Optimization

L2 Message Processing Tuning

High-Frequency Settings

# config/config.yaml
bot:
  polling_interval: 0.1        # 100ms polling
  max_workers: 50              # High worker count
  channel_buffer_size: 2000    # Large buffers
  
arbitrum:
  rate_limit:
    requests_per_second: 100   # Aggressive rate limits
    max_concurrent: 50
    burst: 200

Memory Optimization

# System tuning
echo 'vm.swappiness=1' >> /etc/sysctl.conf
echo 'net.core.rmem_max=134217728' >> /etc/sysctl.conf
echo 'net.core.wmem_max=134217728' >> /etc/sysctl.conf
sysctl -p

Database Optimization

database:
  max_open_connections: 100
  max_idle_connections: 50
  connection_max_lifetime: "1h"

Security Configuration

Network Security

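# Firewall rules (replace MONITORING_IP and LOAD_BALANCER_IP with your own addresses)
ufw allow 22/tcp                                             # SSH
ufw allow from MONITORING_IP to any port 9090 proto tcp      # Metrics, monitoring host only
ufw allow from LOAD_BALANCER_IP to any port 8080 proto tcp   # Health checks, load balancer only
ufw deny 5432/tcp                                            # Block database access
ufw enable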

API Key Security

# Use environment variables, never hardcode
export ARBITRUM_RPC_ENDPOINT="wss://..."
export ALCHEMY_API_KEY="..."

# Or use secrets management
kubectl create secret generic mev-bot-secrets \
  --from-literal=rpc-endpoint="wss://..." \
  --from-literal=api-key="..."

Process Security

# Run as non-root user
useradd -r -s /bin/false mevbot
chown -R mevbot:mevbot /app
sudo -u mevbot ./mev-bot start

Monitoring and Alerting

Health Checks

# Simple health check
curl http://localhost:8080/health

# Detailed metrics
curl http://localhost:9090/metrics/prometheus | grep mev_bot

Alert Rules

# Prometheus alerting rules (load via rule_files; this is not Alertmanager configuration)
groups:
  - name: mev-bot-alerts
    rules:
      - alert: HighErrorRate
        expr: mev_bot_error_rate > 0.05
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "MEV Bot error rate is high"
          
      - alert: L2MessageLag
        expr: mev_bot_l2_message_lag_ms > 1000
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "L2 message processing lag detected"
          
      - alert: LowProfitability
        expr: mev_bot_net_profit_eth < 0
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Bot is losing money"

Troubleshooting

Common Issues

L2 Message Subscription Failures

# Check WebSocket connectivity
wscat -c wss://arb-mainnet.g.alchemy.com/v2/YOUR_KEY

# Verify endpoints
curl -X POST -H "Content-Type: application/json" \
  --data '{"jsonrpc":"2.0","method":"eth_chainId","params":[],"id":1}' \
  https://arb-mainnet.g.alchemy.com/v2/YOUR_KEY

High Memory Usage

# Monitor memory
htop
# Check for memory leaks
go tool pprof http://localhost:9090/debug/pprof/heap

Rate Limiting Issues

# Check rate limit status
grep "rate limit" /var/log/mev-bot/mev-bot.log
# Adjust rate limits in config

Logs Analysis

# Real-time log monitoring
tail -f /var/log/mev-bot/mev-bot.log | grep "L2 message"

# Error analysis
grep "ERROR" /var/log/mev-bot/mev-bot.log | tail -20

# Performance metrics
grep "processing_latency" /var/log/mev-bot/mev-bot.log

Backup and Recovery

Data Backup

# Backup database
cp /data/mev-bot-production.db /backup/mev-bot-$(date +%Y%m%d).db

# Backup configuration
tar -czf /backup/mev-bot-config-$(date +%Y%m%d).tar.gz config/

Disaster Recovery

# Quick recovery
systemctl stop mev-bot
cp /backup/mev-bot-YYYYMMDD.db /data/mev-bot-production.db
systemctl start mev-bot

Scaling

Horizontal Scaling

# kubernetes deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mev-bot-l2
spec:
  replicas: 3
  selector:
    matchLabels:
      app: mev-bot
  template:
    metadata:
      labels:
        app: mev-bot
    spec:
      containers:
      - name: mev-bot
        image: mev-bot:latest
        resources:
          requests:
            cpu: 2000m
            memory: 4Gi
          limits:
            cpu: 4000m
            memory: 8Gi

Load Balancing

# nginx.conf
upstream mev-bot {
    server mev-bot-1:8080;
    server mev-bot-2:8080;
    server mev-bot-3:8080;
}

server {
    listen 80;
    location /health {
        proxy_pass http://mev-bot;
    }
}

Performance Benchmarks

Expected Performance

  • L2 Messages/Second: 500-1000 msgs/sec
  • Processing Latency: <100ms average
  • Memory Usage: 2-4GB under load
  • CPU Usage: 50-80% with 25 workers

Optimization Targets

  • Error Rate: <1%
  • Uptime: >99.9%
  • Profit Margin: >10% after gas costs

Support and Maintenance

Regular Maintenance

# Weekly tasks
./scripts/health-check.sh
./scripts/performance-report.sh
./scripts/profit-analysis.sh

# Monthly tasks
./scripts/log-rotation.sh
./scripts/database-cleanup.sh
./scripts/config-backup.sh

Updates

# Update bot
git pull origin main
go build -o mev-bot ./cmd/mev-bot/main.go
systemctl restart mev-bot

# Update dependencies
go mod tidy
go mod vendor

Quick Commands

# Start with monitoring
METRICS_ENABLED=true ./mev-bot start

# Check health
curl localhost:8080/health

# View metrics
curl localhost:9090/metrics/prometheus

# Check logs
tail -f logs/mev-bot.log

# Stop gracefully
pkill -SIGTERM mev-bot

Your MEV bot is now ready for production deployment with full L2 message processing capabilities! 🚀