MEV & Profitability Research on Arbitrum
Purpose
- Aggregate methodology, tooling, and findings related to identifying MEV and profit opportunities on Arbitrum.
- Provide reproducible guidance so agents can extend experiments without duplicating work.
Current Capabilities Snapshot
- Core services: cmd/mev-bot, pkg/arbitrage, pkg/transport, pkg/scanner, and pkg/profitcalc implement the live pipeline.
- Monitoring & reporting: internal/monitoring, Prometheus dashboards, and docs/8_reports/ capture historical profitability metrics.
- Simulation tooling: tools/simulation, make simulate-profit, and artifacts under reports/simulation/ enable backtesting.
Research Tracks
1. DEX Price Arbitrage
- Targets: Uniswap v3, Camelot, Sushi, GMX spot pools.
- Signals: Pool reserves, swap events, TWAP deltas, cross-pair spreads.
- KPIs: Expected profit per block, win rate, gas/priority fee sensitivity.
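Whether a cross-pool spread is worth pursuing depends on what survives pool fees and gas. The following is a minimal sketch, assuming two constant-product (x*y=k) pools. Concentrated-liquidity venues such as Uniswap v3 do not follow this curve, so treat it as a first-pass filter only. The function names, pool tuples, fee values, and gas cost are hypothetical and are not taken from pkg/profitcalc.

```python
def swap_out(amount_in: float, reserve_in: float, reserve_out: float, fee: float) -> float:
    """Constant-product output for a given input, after deducting the pool fee."""
    amount_in_after_fee = amount_in * (1.0 - fee)
    return amount_in_after_fee * reserve_out / (reserve_in + amount_in_after_fee)


def round_trip_profit(amount_in: float, pool_a: tuple, pool_b: tuple, gas_cost: float) -> float:
    """Buy the base asset on pool_a, sell it on pool_b, and subtract gas.

    Each pool is (reserve_quote, reserve_base, fee). The result and gas_cost
    are both expressed in quote-asset units (an assumption of this sketch).
    """
    quote_a, base_a, fee_a = pool_a
    quote_b, base_b, fee_b = pool_b
    base_bought = swap_out(amount_in, quote_a, base_a, fee_a)
    quote_back = swap_out(base_bought, base_b, quote_b, fee_b)
    return quote_back - amount_in - gas_cost


if __name__ == "__main__":
    cheap = (1_000_000.0, 500.0, 0.003)   # implied price 2000 quote per base
    rich = (1_020_000.0, 500.0, 0.003)    # implied price 2040 quote per base
    print(round_trip_profit(1_000.0, cheap, rich, gas_cost=0.5))
```

Sweeping amount_in and gas_cost over a grid gives a rough view of the gas/priority fee sensitivity KPI before running a full simulation.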
2. Liquidation Monitoring
- Targets: Aave, Radiant, other Arbitrum lending markets.
- Signals: Health factor drift, oracle price updates, pending liquidation calls.
- KPIs: Post-liquidation slippage, competing bot density, execution latency.
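Health factor drift can be tracked with the Aave-style definition: collateral value weighted by each asset's liquidation threshold, divided by total debt, with positions below 1.0 eligible for liquidation. A minimal sketch under that assumption follows. The Collateral layout and the watchlist margin are hypothetical, and other lending markets may define health differently.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Collateral:
    value_usd: float              # current oracle value of the collateral
    liquidation_threshold: float  # e.g. 0.825 for 82.5%


def health_factor(collaterals: Iterable[Collateral], debt_usd: float) -> float:
    """Threshold-weighted collateral divided by debt; infinite when there is no debt."""
    if debt_usd <= 0:
        return float("inf")
    weighted = sum(c.value_usd * c.liquidation_threshold for c in collaterals)
    return weighted / debt_usd


def near_liquidation(collaterals: Iterable[Collateral], debt_usd: float, margin: float = 0.05) -> bool:
    """True when the health factor is below 1 + margin (hypothetical watchlist cutoff)."""
    return health_factor(collaterals, debt_usd) < 1.0 + margin
```

Re-evaluating near_liquidation on every oracle price update yields a watchlist that can be compared against observed liquidation calls.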
3. Cross-Domain / Cross-Chain Opportunities
- Scenarios: L1↔L2 basis gaps, bridge delays, stablecoin depegs.
- Signals: L1 oracle vs L2 pool divergence, bridge queue depth, sequencer backlog.
- KPIs: Net basis capture, transfer latency risk, capital lock-up duration.
4. Latency & Order-Flow Strategies (ethics review required)
- Includes sandwiching, back-running, private order flow analysis.
- Emphasise legal and policy review before experimentation.
External Research Snapshot (as of 2025-10-19)
- Timeboost express lane audit (Sep 2025): Analysis of ~11.5M auctions found over 90% won by two participants, 22% revert rates, weakening secondary markets, and declining DAO revenue—indicating current Timeboost design is centralising order flow and underperforming fairness objectives.
- Spam-based arbitrage on fast-finality rollups (Jun 2025): Shows splitting MEV into many micro transactions remains optimal post-Dencun; on Arbitrum, 80% of reverted swaps concentrate in USDC/WETH pairs and cluster at block tops, signalling a sustained latency race outside priority-fee auctions.
- Optimistic MEV measurement (Jun 2025): Quantifies "on-chain probe" strategies driving 7% of Arbitrum gas usage in Q1 2025 despite limited fee contribution—highlighting speculative load on sequencers and sensitivity to volatility and aggregator activity.
- Cross-chain arbitrage taxonomy (Jan 2025): Longitudinal study across nine chains attributes ~32% of observed events to bridge-based moves, yielding a conservative $9.5M profit lower bound; provides a baseline for assessing Arbitrum cross-domain MEV defences.
- Sequencer profit sustainability (Mar 2025): DAO-commissioned report decomposes sequencer revenues/costs (including blob and L1 settlement fees) and stresses integrating Timeboost and orderflow auctions into long-term economic planning.
- Community proposals and dashboards (Apr–Sep 2025): FairFlow proposal aims to adjust Timeboost parameters for broader participation; community analytics suggest Timeboost revenue is nearing parity with base fees (~$1M/month) with potential to reach $100M annually if adoption expands.
Actionable follow-up: Integrate the insights above into the experiment backlog, e.g., replicate the Timeboost revert analysis locally, extend spam-detection metrics in pkg/scanner, and simulate bridge-based arbitrage using the cross-chain taxonomy as a benchmark.
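As a starting point for replicating the revert analysis locally, the revert rate, the per-pair share of reverts, and the share of reverts landing near the top of a block can be computed from decoded swap records. This is a sketch under assumptions: the record fields (pair, reverted, tx_index) and the top_n cutoff are hypothetical and do not reflect the pkg/scanner schema.

```python
from collections import Counter


def revert_metrics(swaps, top_n=5):
    """Summarise reverted swaps.

    swaps: iterable of dicts with keys 'pair' (str), 'reverted' (bool),
    and 'tx_index' (position of the transaction within its block).
    """
    swaps = list(swaps)
    reverted = [s for s in swaps if s["reverted"]]
    if not swaps or not reverted:
        return {"revert_rate": 0.0, "pair_share": {}, "top_of_block_share": 0.0}
    n_rev = len(reverted)
    pair_counts = Counter(s["pair"] for s in reverted)
    top_hits = sum(1 for s in reverted if s["tx_index"] < top_n)
    return {
        "revert_rate": n_rev / len(swaps),
        "pair_share": {pair: count / n_rev for pair, count in pair_counts.items()},
        "top_of_block_share": top_hits / n_rev,
    }
```

Comparing pair_share and top_of_block_share against the published figures is one way to check whether a local capture is representative.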
Data Sources & Access Checklist
- On-chain RPC/archive: Document credentials (Alchemy, Infura, self-hosted nodes) and rate limits.
- Mempool / private relays: Track availability of Flashbots-style endpoints or sequencer feeds.
- Historical datasets: Record storage locations under data/ (Parquet/CSV), retention policies, refresh cadence.
- Off-chain signals: Centralised exchange order books, funding rates, oracle feeds.
Dataset Inventory (Initial)
| Path | Description | Refresh Cadence | Notes |
|---|---|---|---|
| data/pools.txt | Seed list of Arbitrum liquidity pool addresses (Uniswap v3, Sushi, Camelot). | Manual | Generated October 2025; extend with TVL, fee tier metadata before backtests. |
| data/raw_arbitrum_portal_projects.json | Raw Arbitrum Portal /api/projects export (all categories). | Pull ad hoc | Auto-fetched by make refresh-mev-datasets (or run curl -s https://portal-data.arbitrum.io/api/projects > data/raw_arbitrum_portal_projects.json). |
| datasets/arbitrum_llama_exchanges.csv | DeFiLlama snapshot of all Arbitrum Dex/Derivatives/Options protocols. | Pull ad hoc | Generated via pull_llama_exchange_snapshot.py (run automatically by make refresh-mev-datasets). |
| datasets/arbitrum_portal_exchanges.csv | Portal-derived exchange list filtered to DEX / Aggregator / Perps / Options / Derivatives. | Pull ad hoc | Generated 2025-10-19 via helper script (see below); retains project IDs, chains, URLs. |
| datasets/arbitrum_llama_exchange_subset.csv | DeFiLlama exchange slice limited to Dexs / DEX Aggregator / Derivatives / Options categories. | Pull ad hoc | Rebuilt 2025-10-19 from arbitrum_llama_exchanges.csv for easier joins (source CSV generated via API pull). |
| datasets/arbitrum_exchange_sources.csv | Combined view of Portal + DeFiLlama exchanges with sources flag. | Derived | Regenerate after refreshing either upstream dataset to track coverage gaps. |
| datasets/arbitrum_lending_markets.csv | Lending/CDP venues on Arbitrum with TVL + borrowed balances, audit coverage, and oracle support. | Pull ad hoc | Generated 2025-10-19 via update_market_datasets.py; derive liquidation watchlists and oracle dependencies. |
| datasets/arbitrum_bridges.csv | Bridge + cross-chain routing protocols exposing Arbitrum liquidity with share-of-TVL metrics. | Pull ad hoc | Generated 2025-10-19 via update_market_datasets.py; baseline for cross-domain arbitrage monitoring. |
| reports/simulation/latest/summary.md | Most recent profitability simulation output. | Per simulation run | Use as baseline for comparing new opportunity vectors. |
| reports/simulation/latest/summary.json | Machine-readable KPIs from latest simulation. | Per simulation run | Ingest into notebooks for longitudinal analysis. |
| reports/ci/ | CI pipeline logs (lint, gosec, etc.). | Per pipeline run | Useful when correlating security changes with profitability regressions. |
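
The combined view in datasets/arbitrum_exchange_sources.csv amounts to a full outer join of the Portal and DeFiLlama exchange lists, with a flag recording where each entry was seen. The sketch below illustrates that idea only. The join key (a normalised name column), the flag values, and the helper names are assumptions, and the actual update_exchange_datasets.py may work differently.

```python
import csv
import io


def normalise(name: str) -> str:
    """Lower-case and drop non-alphanumerics so 'Uniswap V3' matches 'uniswap-v3'."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def merge_sources(portal_rows, llama_rows, key="name"):
    """Outer-join two row lists on a normalised key and flag each entry's sources."""
    merged = {}
    for source, rows in (("portal", portal_rows), ("defillama", llama_rows)):
        for row in rows:
            entry = merged.setdefault(normalise(row[key]), {"name": row[key], "sources": []})
            if source not in entry["sources"]:
                entry["sources"].append(source)
    return [
        {"name": entry["name"], "sources": "+".join(entry["sources"])}
        for _, entry in sorted(merged.items())
    ]


def to_csv(rows) -> str:
    """Serialise merged rows to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["name", "sources"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

Rows whose sources flag names only one upstream are the coverage gaps to review after each refresh.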
Exchange Dataset Refresh Workflow
Run the following from repo root whenever Portal or DeFiLlama listings change:
# 1. Pull latest Portal catalogue
curl -s https://portal-data.arbitrum.io/api/projects > data/raw_arbitrum_portal_projects.json
# 2. Refresh all MEV research datasets (validates prerequisites automatically)
make refresh-mev-datasets
scripts/refresh-mev-datasets.sh orchestrates the Python regenerators, fetching the latest Portal catalogue and DeFiLlama snapshot before rebuilding downstream CSVs. Set SKIP_PORTAL_FETCH=1 if you already staged a customised Portal dump; direct invocation (pull_llama_exchange_snapshot.py, update_exchange_datasets.py, update_market_datasets.py) remains available for bespoke filters.
Methodology Template
- Define hypothesis & expected alpha source.
- Enumerate required datasets & tooling (ETL scripts, simulations, live hooks).
- Implement deterministic data extraction (commit scripts to tools/ or scripts/).
- Run analysis/backtests; save notebooks or summaries under reports/research/.
- Evaluate results (KPIs, risk, infrastructure requirements).
- Record follow-up tasks, blockers, and owners.
Experiment Log Format
YYYY-MM-DD – <experiment title>
Hypothesis:
Setup:
Datasets:
Results:
Risks/Assumptions:
Next Steps:
Artifacts: reports/research/YYYY-MM-DD_<slug>.md
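
To keep entries uniform, the skeleton above can be generated rather than typed by hand. A minimal sketch follows. The helper names and the hyphenated slug style are assumptions, since the template specifies only the YYYY-MM-DD_<slug>.md pattern.

```python
import re
from datetime import date

FIELDS = ["Hypothesis", "Setup", "Datasets", "Results", "Risks/Assumptions", "Next Steps"]


def slugify(title: str) -> str:
    """Lower-case the title and collapse anything that is not a-z or 0-9 into single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def log_entry(title: str, day: date) -> str:
    """Return an empty experiment log entry following the format above."""
    stamp = day.isoformat()
    lines = [f"{stamp} – {title}"]
    lines += [f"{field}:" for field in FIELDS]
    lines.append(f"Artifacts: reports/research/{stamp}_{slugify(title)}.md")
    return "\n".join(lines)


if __name__ == "__main__":
    print(log_entry("Uniswap ↔ Camelot price divergence", date(2025, 10, 19)))
```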
Repository Structure
- experiments/ – Checked-in summaries of completed experiments (one markdown per study).
- datasets/ – Documentation of raw/processed datasets leveraged during research.
- tooling/ – Notes on scripts, notebooks, and automation supporting experiments.
- reports/research/ (repo root) – Canonical location for detailed experiment artifacts referenced above.
Related datasets:
- datasets/arbitrum_exchanges.md – narrative breakdown of major Arbitrum exchanges with metrics and citations.
- datasets/arbitrum_exchanges.csv – structured CSV for ingesting exchange metadata (variant, category, key notes, source URL).
- datasets/arbitrum_llama_exchanges.csv – DeFiLlama snapshot of all Arbitrum Dex/Derivatives/Options protocols (re-generated automatically from the protocols API).
- datasets/arbitrum_portal_exchanges.csv – machine-readable Arbitrum Portal exchange list (DEX/Perps/Options/Derivatives).
- datasets/arbitrum_exchange_sources.csv – merged Portal + DeFiLlama source map with gap indicators.
- datasets/arbitrum_lending_markets.csv – liquidation/borrowing venue roster with Arbitrum TVL + borrowed metrics and oracle coverage.
- datasets/arbitrum_bridges.csv – cross-domain bridge inventory with Arbitrum share-of-liquidity statistics for basis/opportunity tracking.
- verification/arbitrum_pool_verifications.md – verification status tracker for high-priority pools/routers (link back to contract audits).
Tooling Inventory
- Collection: Extend pkg/scanner, pkg/events, and custom scripts under scripts/ to ingest new pools or lending data.
- Simulation: Use tools/simulation with new vector captures; document command variants.
- Analytics: Prefer reproducible notebooks or Go/Polars pipelines; store outputs under reports/research/.
- Security constraints: Align experiments with pkg/security (rate limiting, key usage); update TODO_AUDIT_FIX.md if additional permissions are required.
Compliance & Safety
- Respect RPC provider ToS and relevant regulations (front-running, market manipulation).
- Avoid storing private keys or sensitive order flow in shared logs; follow docs/6_operations/SECURITY.md.
- Coordinate with stakeholders before testing intrusive strategies (e.g., sandwiching live users).
Immediate Next Actions
- Inventory existing Arbitrum datasets and document access details here.
- Select an initial research question (e.g., Uniswap ↔ Camelot price divergence).
- Capture a baseline simulation run; archive outputs under reports/research/.
- Append checklist items within this document as work progresses.