# MEV & Profitability Research on Arbitrum

## Purpose

- Aggregate methodology, tooling, and findings related to identifying MEV and profit opportunities on Arbitrum.
- Provide reproducible guidance so agents can extend experiments without duplicating work.

## Current Capabilities Snapshot

- **Core services**: `cmd/mev-bot`, `pkg/arbitrage`, `pkg/transport`, `pkg/scanner`, and `pkg/profitcalc` implement the live pipeline.
- **Monitoring & reporting**: `internal/monitoring`, Prometheus dashboards, and `docs/8_reports/` capture historic profitability metrics.
- **Simulation tooling**: `tools/simulation`, `make simulate-profit`, and artifacts under `reports/simulation/` enable backtesting.

## Research Tracks

### 1. DEX Price Arbitrage

- Targets: Uniswap v3, Camelot, Sushi, GMX spot pools.
- Signals: Pool reserves, swap events, TWAP deltas, cross-pair spreads.
- KPIs: Expected profit per block, win rate, gas/priority fee sensitivity.

### 2. Liquidation Monitoring

- Targets: Aave, Radiant, other Arbitrum lending markets.
- Signals: Health factor drift, oracle price updates, pending liquidation calls.
- KPIs: Post-liquidation slippage, competing bot density, execution latency.

### 3. Cross-Domain / Cross-Chain Opportunities

- Scenarios: L1↔L2 basis gaps, bridge delays, stablecoin depegs.
- Signals: L1 oracle vs L2 pool divergence, bridge queue depth, sequencer backlog.
- KPIs: Net basis capture, transfer latency risk, capital lock-up duration.

### 4. Latency & Order-Flow Strategies *(ethics review required)*

- Includes sandwiching, back-running, private order flow analysis.
- Complete legal and policy review before any experimentation.

## External Research Snapshot (as of 2025-10-19)

- **Timeboost express lane audit (Sep 2025):** Analysis of ~11.5M auctions found over 90% won by two participants, a 22% revert rate, weakening secondary markets, and declining DAO revenue, indicating that the current Timeboost design is centralising order flow and underperforming its fairness objectives.
- **Spam-based arbitrage on fast-finality rollups (Jun 2025):** Shows that splitting MEV into many micro transactions remains optimal post-Dencun; on Arbitrum, 80% of reverted swaps concentrate in USDC/WETH pairs and cluster at block tops, signalling a sustained latency race outside priority-fee auctions.
- **Optimistic MEV measurement (Jun 2025):** Quantifies "on-chain probe" strategies driving 7% of Arbitrum gas usage in Q1 2025 despite limited fee contribution, highlighting speculative load on sequencers and sensitivity to volatility and aggregator activity.
- **Cross-chain arbitrage taxonomy (Jan 2025):** Longitudinal study across nine chains attributes ~32% of observed events to bridge-based moves, yielding a conservative $9.5M profit lower bound; provides a baseline for assessing Arbitrum cross-domain MEV defences.
- **Sequencer profit sustainability (Mar 2025):** DAO-commissioned report decomposes sequencer revenues/costs (including blob and L1 settlement fees) and stresses integrating Timeboost and orderflow auctions into long-term economic planning.
- **Community proposals and dashboards (Apr–Sep 2025):** FairFlow proposal aims to adjust Timeboost parameters for broader participation; community analytics suggest Timeboost revenue is nearing parity with base fees (~$1M/month) with potential to reach $100M annually if adoption expands.

*Actionable follow-up*: Integrate the insights above into the experiment backlog: for example, replicate the Timeboost revert analysis locally, extend spam-detection metrics in `pkg/scanner`, and simulate bridge-based arbitrage using the cross-chain taxonomy as a benchmark.

## Data Sources & Access Checklist

- **On-chain RPC/archive**: Document credentials (Alchemy, Infura, self-hosted nodes) and rate limits.
- **Mempool / private relays**: Track availability of Flashbots-style endpoints or sequencer feeds.
- **Historical datasets**: Record storage locations under `data/` (Parquet/CSV), retention policies, refresh cadence.
- **Off-chain signals**: Centralised exchange order books, funding rates, oracle feeds.

### Dataset Inventory (Initial)

| Path | Description | Refresh Cadence | Notes |
| --- | --- | --- | --- |
| `data/pools.txt` | Seed list of Arbitrum liquidity pool addresses (Uniswap v3, Sushi, Camelot). | Manual | Generated October 2025; extend with TVL, fee tier metadata before backtests. |
| `data/raw_arbitrum_portal_projects.json` | Raw Arbitrum Portal `/api/projects` export (all categories). | Pull ad hoc | Auto-fetched by `make refresh-mev-datasets` (or run `curl -s https://portal-data.arbitrum.io/api/projects > data/raw_arbitrum_portal_projects.json`). |
| `datasets/arbitrum_llama_exchanges.csv` | DeFiLlama snapshot of all Arbitrum Dex/Derivatives/Options protocols. | Pull ad hoc | Generated via `pull_llama_exchange_snapshot.py` (run automatically by `make refresh-mev-datasets`). |
| `datasets/arbitrum_portal_exchanges.csv` | Portal-derived exchange list filtered to DEX / Aggregator / Perps / Options / Derivatives. | Pull ad hoc | Generated 2025-10-19 via helper script (see below); retains project IDs, chains, URLs. |
| `datasets/arbitrum_llama_exchange_subset.csv` | DeFiLlama exchange slice limited to Dexs / DEX Aggregator / Derivatives / Options categories. | Pull ad hoc | Rebuilt 2025-10-19 from `arbitrum_llama_exchanges.csv` for easier joins (source CSV generated via API pull). |
| `datasets/arbitrum_exchange_sources.csv` | Combined view of Portal + DeFiLlama exchanges with `sources` flag. | Derived | Regenerate after refreshing either upstream dataset to track coverage gaps. |
| `datasets/arbitrum_lending_markets.csv` | Lending/CDP venues on Arbitrum with TVL + borrowed balances, audit coverage, and oracle support. | Pull ad hoc | Generated 2025-10-19 via `update_market_datasets.py`; derive liquidation watchlists and oracle dependencies. |
| `datasets/arbitrum_bridges.csv` | Bridge + cross-chain routing protocols exposing Arbitrum liquidity with share-of-TVL metrics. | Pull ad hoc | Generated 2025-10-19 via `update_market_datasets.py`; baseline for cross-domain arbitrage monitoring. |
| `reports/simulation/latest/summary.md` | Most recent profitability simulation output. | Per simulation run | Use as baseline for comparing new opportunity vectors. |
| `reports/simulation/latest/summary.json` | Machine-readable KPIs from latest simulation. | Per simulation run | Ingest into notebooks for longitudinal analysis. |
| `reports/ci/` | CI pipeline logs (lint, gosec, etc.). | Per pipeline run | Useful when correlating security changes with profitability regressions. |

#### Exchange Dataset Refresh Workflow

Run the following from repo root whenever Portal or DeFiLlama listings change:

```bash
# 1. Pull latest Portal catalogue
curl -s https://portal-data.arbitrum.io/api/projects > data/raw_arbitrum_portal_projects.json

# 2. Refresh all MEV research datasets (validates prerequisites automatically)
make refresh-mev-datasets
```

`scripts/refresh-mev-datasets.sh` orchestrates the Python regenerators, fetching the latest Portal catalogue and DeFiLlama snapshot before rebuilding downstream CSVs. Set `SKIP_PORTAL_FETCH=1` if you already staged a customised Portal dump; direct invocation (`pull_llama_exchange_snapshot.py`, `update_exchange_datasets.py`, `update_market_datasets.py`) remains available for bespoke filters.

## Methodology Template

1. Define hypothesis & expected alpha source.
2. Enumerate required datasets & tooling (ETL scripts, simulations, live hooks).
3. Implement deterministic data extraction (commit scripts to `tools/` or `scripts/`).
4. Run analysis/backtests; save notebooks or summaries under `reports/research/`.
5. Evaluate results (KPIs, risk, infrastructure requirements).
6. Record follow-up tasks, blockers, and owners.
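
Step 3 asks for deterministic extraction. As a minimal sketch of what that can mean for a derived dataset such as the combined Portal + DeFiLlama view, the snippet below merges two exchange name lists and tags each row with the upstreams that list it. It is not the logic of `update_exchange_datasets.py`: the `normalise` rule, the `name` column, and the `portal`/`defillama` labels are assumptions made for illustration, and only the `sources` flag is taken from the dataset inventory. Sorting by the normalised key keeps the output byte-identical across runs, so diffs reflect upstream changes only.

```python
import csv
import io


def normalise(name: str) -> str:
    """Hypothetical join key: lower-case, alphanumerics only ('Uniswap V3' == 'uniswap-v3')."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def merge_sources(portal_names, llama_names):
    """Merge two name lists; each output row records which upstream(s) list the venue."""
    merged = {}
    for source, names in (("portal", portal_names), ("defillama", llama_names)):
        for name in names:
            key = normalise(name)
            # Keep the first spelling seen; accumulate every source that mentions it.
            entry = merged.setdefault(key, {"name": name, "sources": set()})
            entry["sources"].add(source)
    # Sort by key and by source label so repeated runs emit identical rows.
    return [
        {"name": merged[key]["name"], "sources": "+".join(sorted(merged[key]["sources"]))}
        for key in sorted(merged)
    ]


def to_csv(rows) -> str:
    """Serialise rows with a fixed column order and newline style."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["name", "sources"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Illustrative inputs only; real runs would read the Portal and DeFiLlama CSVs.
rows = merge_sources(["Uniswap V3", "Camelot"], ["uniswap-v3", "GMX"])
print(to_csv(rows), end="")
```

Rows whose `sources` value names a single upstream are the coverage gaps the inventory table refers to.
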
### Experiment Log Format

```
YYYY-MM-DD – Hypothesis:
Setup:
Datasets:
Results:
Risks/Assumptions:
Next Steps:
Artifacts: reports/research/YYYY-MM-DD_.md
```

### Repository Structure

- `experiments/` – Checked-in summaries of completed experiments (one markdown per study).
- `datasets/` – Documentation of raw/processed datasets leveraged during research.
- `tooling/` – Notes on scripts, notebooks, and automation supporting experiments.
- `reports/research/` (repo root) – Canonical location for detailed experiment artifacts referenced above.

**Related datasets:**

- `datasets/arbitrum_exchanges.md` – narrative breakdown of major Arbitrum exchanges with metrics and citations.
- `datasets/arbitrum_exchanges.csv` – structured CSV for ingesting exchange metadata (variant, category, key notes, source URL).
- `datasets/arbitrum_llama_exchanges.csv` – DeFiLlama snapshot of all Arbitrum Dex/Derivatives/Options protocols (re-generated automatically from the protocols API).
- `datasets/arbitrum_portal_exchanges.csv` – machine-readable Arbitrum Portal exchange list (DEX/Perps/Options/Derivatives).
- `datasets/arbitrum_exchange_sources.csv` – merged Portal + DeFiLlama source map with gap indicators.
- `datasets/arbitrum_lending_markets.csv` – liquidation/borrowing venue roster with Arbitrum TVL + borrowed metrics and oracle coverage.
- `datasets/arbitrum_bridges.csv` – cross-domain bridge inventory with Arbitrum share-of-liquidity statistics for basis/opportunity tracking.
- `verification/arbitrum_pool_verifications.md` – verification status tracker for high-priority pools/routers (link back to contract audits).

## Tooling Inventory

- **Collection**: Extend `pkg/scanner`, `pkg/events`, and custom scripts under `scripts/` to ingest new pools or lending data.
- **Simulation**: Use `tools/simulation` with new vector captures; document command variants.
- **Analytics**: Prefer reproducible notebooks or Go/Polars pipelines; store outputs under `reports/research/`.
- **Security constraints**: Align experiments with `pkg/security` (rate limiting, key usage); update `TODO_AUDIT_FIX.md` if additional permissions are required.

## Compliance & Safety

- Respect RPC provider ToS and relevant regulations (front-running, market manipulation).
- Avoid storing private keys or sensitive order flow in shared logs; follow `docs/6_operations/SECURITY.md`.
- Coordinate with stakeholders before testing intrusive strategies (e.g., sandwiching live users).

## Immediate Next Actions

1. Inventory existing Arbitrum datasets and document access details here.
2. Select an initial research question (e.g., Uniswap ↔ Camelot price divergence).
3. Capture a baseline simulation run; archive outputs under `reports/research/`.
4. Append checklist items within this document as work progresses.
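
For the example research question in item 2, a first-pass screen could compare quoted prices on two venues and ask whether the spread survives swap fees and gas. The sketch below is a rough illustration under strong assumptions, not the calculation in `pkg/profitcalc`: it ignores slippage, price impact, and transaction ordering, and the `Quote` type, the fee tiers, the prices, and the gas cost are all hypothetical inputs chosen to make the arithmetic visible.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quote:
    """Hypothetical snapshot of one pool's quoted price."""
    venue: str
    price: float    # quote-token units per base token, e.g. USDC per WETH
    fee_bps: float  # pool swap fee in basis points (assumed, not read on-chain)


def net_edge_bps(buy: Quote, sell: Quote) -> float:
    """Round-trip edge in bps after both swap fees, ignoring slippage and price impact."""
    gross = sell.price / buy.price
    after_fees = gross * (1 - buy.fee_bps / 10_000) * (1 - sell.fee_bps / 10_000)
    return (after_fees - 1) * 10_000


def expected_profit(buy: Quote, sell: Quote, notional: float, gas_cost: float) -> float:
    """Profit in quote-token units for a given notional, net of a flat gas cost."""
    return notional * net_edge_bps(buy, sell) / 10_000 - gas_cost


# Made-up numbers for illustration only; they are not observed market data.
camelot = Quote("camelot", price=2500.0, fee_bps=30.0)
uniswap = Quote("uniswap_v3", price=2510.0, fee_bps=5.0)

edge = net_edge_bps(camelot, uniswap)
profit = expected_profit(camelot, uniswap, notional=10_000.0, gas_cost=0.5)
print(f"net edge: {edge:.2f} bps, expected profit: {profit:.2f}")
```

With these inputs a 40 bps gross spread shrinks to under 5 bps once both fees are applied, which is why fee tier metadata and gas sensitivity appear among the KPIs for the DEX arbitrage track.
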