Aggregate methodology, tooling, and findings related to identifying MEV and profit opportunities on Arbitrum.
Provide reproducible guidance so agents can extend experiments without duplicating work.

Current Capabilities Snapshot

Core services: cmd/mev-bot, pkg/arbitrage, pkg/transport, pkg/scanner, and pkg/profitcalc implement the live pipeline.
Monitoring & reporting: internal/monitoring, Prometheus dashboards, and docs/8_reports/ capture historic profitability metrics.
Simulation tooling: tools/simulation, make simulate-profit, and artifacts under reports/simulation/ enable backtesting.

Research Tracks

1. DEX Price Arbitrage

Targets: Uniswap v3, Camelot, Sushi, GMX spot pools.
Signals: Pool reserves, swap events, TWAP deltas, cross-pair spreads.
KPIs: Expected profit per block, win rate, gas/priority fee sensitivity.
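
The spread signal and the gas-sensitive profit KPI above can be sketched as follows. This is a minimal illustration, not the logic in pkg/arbitrage: it assumes constant-product pools (Uniswap v3 concentrated liquidity needs tick math that is not shown), and the `Pool` class, the 0.3% fee, and the reserve figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Pool:
    """Hypothetical constant-product pool holding token X and token Y."""
    reserve_x: float
    reserve_y: float
    fee: float = 0.003  # assumed 0.3% swap fee

    def price_y_per_x(self) -> float:
        """Marginal price of X quoted in Y, ignoring fees."""
        return self.reserve_y / self.reserve_x

    def swap_x_for_y(self, dx: float) -> float:
        """Amount of Y received for dx of X under x*y=k, after the fee."""
        dx_net = dx * (1.0 - self.fee)
        return self.reserve_y * dx_net / (self.reserve_x + dx_net)

    def swap_y_for_x(self, dy: float) -> float:
        """Amount of X received for dy of Y under x*y=k, after the fee."""
        dy_net = dy * (1.0 - self.fee)
        return self.reserve_x * dy_net / (self.reserve_y + dy_net)


def spread_bps(a: Pool, b: Pool) -> float:
    """Cross-pool spread of pool b's price over pool a's, in basis points."""
    pa, pb = a.price_y_per_x(), b.price_y_per_x()
    return (pb - pa) / pa * 10_000


def round_trip_profit(sell_pool: Pool, buy_pool: Pool, dx: float,
                      gas_cost_x: float) -> float:
    """Sell dx of X where X is dear, buy X back where it is cheap.

    Returns profit in units of X, net of an assumed gas cost (also in X).
    """
    dy = sell_pool.swap_x_for_y(dx)
    x_back = buy_pool.swap_y_for_x(dy)
    return x_back - dx - gas_cost_x
```

Sweeping `gas_cost_x` over a range of priority-fee assumptions gives a rough view of the gas sensitivity KPI for a given spread.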

2. Liquidation Monitoring

Targets: Aave, Radiant, other Arbitrum lending markets.
Signals: Health factor drift, oracle price updates, pending liquidation calls.
KPIs: Post-liquidation slippage, competing bot density, execution latency.
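
Health factor drift can be tracked with a small helper like the one below. It is a sketch that assumes the Aave-style definition (collateral value weighted by liquidation threshold, divided by debt value, with positions liquidatable below 1.0); the class and function names are illustrative and the inputs are hypothetical USD valuations.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class CollateralPosition:
    """One collateral asset in a borrower's account (hypothetical shape)."""
    value_usd: float
    liquidation_threshold: float  # e.g. 0.80 means 80%


def health_factor(collateral: List[CollateralPosition], debt_usd: float) -> float:
    """Aave-style health factor: threshold-weighted collateral / total debt."""
    if debt_usd <= 0:
        return math.inf  # no debt means the account cannot be liquidated
    weighted = sum(c.value_usd * c.liquidation_threshold for c in collateral)
    return weighted / debt_usd


def price_drop_to_liquidation(hf: float) -> float:
    """Fractional uniform drop in collateral value that brings HF to 1.0.

    Assumes the debt value stays constant while collateral reprices.
    """
    if hf <= 1.0:
        return 0.0  # already liquidatable
    return 1.0 - 1.0 / hf
```

Ranking accounts by `price_drop_to_liquidation` is one way to build a watchlist of positions closest to the liquidation boundary.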

3. Cross-Domain / Cross-Chain Opportunities

Scenarios: L1↔L2 basis gaps, bridge delays, stablecoin depegs.
Signals: L1 oracle vs L2 pool divergence, bridge queue depth, sequencer backlog.
KPIs: Net basis capture, transfer latency risk, capital lock-up duration.
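
A rough way to reason about net basis capture is to subtract bridge fees, gas, and the cost of locked capital from the gross price gap. The sketch below assumes a simple linear carry cost; the function name, parameter names, and the 5% annual capital cost are hypothetical placeholders.

```python
HOURS_PER_YEAR = 24 * 365


def net_basis_capture(l1_price: float, l2_price: float, notional_usd: float,
                      bridge_fee_usd: float, gas_usd: float,
                      lockup_hours: float,
                      annual_capital_cost: float = 0.05) -> float:
    """Net USD profit from closing an L1/L2 price gap on a given notional.

    gross = |L2 - L1| / L1 * notional
    carry = notional * annual_capital_cost * lockup_hours / hours_per_year
    """
    gross = abs(l2_price - l1_price) / l1_price * notional_usd
    carry = notional_usd * annual_capital_cost * lockup_hours / HOURS_PER_YEAR
    return gross - bridge_fee_usd - gas_usd - carry


# Example: a 50 bps stablecoin gap on $100k with capital locked for 7 days.
example = net_basis_capture(
    l1_price=1.000, l2_price=0.995, notional_usd=100_000,
    bridge_fee_usd=50.0, gas_usd=20.0, lockup_hours=168,
)
```

The lock-up term matters most for slow exits, where capital can be tied up for days and the carry cost erodes a large share of a small basis.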

4. Latency & Order-Flow Strategies (ethics review required)

Scope: Sandwiching, back-running, and private order-flow analysis.
Prerequisite: Legal and policy review must be completed before any experimentation.

External Research Snapshot (as of 2025-10-19)

Timeboost express lane audit (Sep 2025): An analysis of ~11.5M auctions found that two participants won over 90% of them, alongside a 22% revert rate, weakening secondary markets, and declining DAO revenue. The findings suggest the current Timeboost design is centralising order flow and underperforming on its fairness objectives.
Spam-based arbitrage on fast-finality rollups (Jun 2025): Shows splitting MEV into many micro transactions remains optimal post-Dencun; on Arbitrum, 80% of reverted swaps concentrate in USDC/WETH pairs and cluster at block tops, signalling a sustained latency race outside priority-fee auctions.
Optimistic MEV measurement (Jun 2025): Quantifies "on-chain probe" strategies driving 7% of Arbitrum gas usage in Q1 2025 despite limited fee contribution—highlighting speculative load on sequencers and sensitivity to volatility and aggregator activity.
Cross-chain arbitrage taxonomy (Jan 2025): Longitudinal study across nine chains attributes ~32% of observed events to bridge-based moves, yielding a conservative $9.5M profit lower bound; provides a baseline for assessing Arbitrum cross-domain MEV defences.
Sequencer profit sustainability (Mar 2025): DAO-commissioned report decomposes sequencer revenues/costs (including blob and L1 settlement fees) and stresses integrating Timeboost and orderflow auctions into long-term economic planning.
Community proposals and dashboards (Apr–Sep 2025): FairFlow proposal aims to adjust Timeboost parameters for broader participation; community analytics suggest Timeboost revenue is nearing parity with base fees (~$1M/month) with potential to reach $100M annually if adoption expands.

Actionable follow-up: Integrate the insights above into the experiment backlog. For example, replicate the Timeboost revert analysis locally, extend spam-detection metrics in pkg/scanner, and simulate bridge-based arbitrage using the cross-chain taxonomy as a benchmark.
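
As a starting point for the local revert analysis, the two headline metrics (winner concentration and revert rate) can be computed from any list of auction records. The record shape below, with `winner` and `reverted` keys, is an assumption for illustration and does not reflect an existing export format.

```python
from collections import Counter
from typing import Mapping, Sequence


def top_n_share(records: Sequence[Mapping], n: int = 2) -> float:
    """Fraction of auctions won by the n most frequent winners."""
    counts = Counter(r["winner"] for r in records)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(count for _, count in counts.most_common(n)) / total


def revert_rate(records: Sequence[Mapping]) -> float:
    """Fraction of records whose express-lane transaction reverted."""
    if not records:
        return 0.0
    return sum(1 for r in records if r["reverted"]) / len(records)


# Hypothetical sample: 10 auctions, three bidders, two reverts.
sample = (
    [{"winner": "A", "reverted": False}] * 5
    + [{"winner": "A", "reverted": True}]
    + [{"winner": "B", "reverted": False}] * 2
    + [{"winner": "B", "reverted": True}]
    + [{"winner": "C", "reverted": False}]
)
```

Running both functions over rolling windows would show whether concentration and revert rates are trending, which is what the audit's conclusions hinge on.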

Data Sources & Access Checklist

On-chain RPC/archive: Document credentials (Alchemy, Infura, self-hosted nodes) and rate limits.
Mempool / private relays: Track availability of Flashbots-style endpoints or sequencer feeds.
Historical datasets: Record storage locations under data/ (Parquet/CSV), retention policies, refresh cadence.
Off-chain signals: Centralised exchange order books, funding rates, oracle feeds.

Dataset Inventory (Initial)

Path	Description	Refresh Cadence	Notes
`data/pools.txt`	Seed list of Arbitrum liquidity pool addresses (Uniswap v3, Sushi, Camelot).	Manual	Generated October 2025; extend with TVL, fee tier metadata before backtests.
`data/raw_arbitrum_portal_projects.json`	Raw Arbitrum Portal `/api/projects` export (all categories).	Pull ad hoc	Auto-fetched by `make refresh-mev-datasets` (or run `curl -s https://portal-data.arbitrum.io/api/projects > data/raw_arbitrum_portal_projects.json`).
`datasets/arbitrum_llama_exchanges.csv`	DeFiLlama snapshot of all Arbitrum Dex/Derivatives/Options protocols.	Pull ad hoc	Generated via `pull_llama_exchange_snapshot.py` (run automatically by `make refresh-mev-datasets`).
`datasets/arbitrum_portal_exchanges.csv`	Portal-derived exchange list filtered to DEX / Aggregator / Perps / Options / Derivatives.	Pull ad hoc	Generated 2025-10-19 via helper script (see below); retains project IDs, chains, URLs.
`datasets/arbitrum_llama_exchange_subset.csv`	DeFiLlama exchange slice limited to Dexs / DEX Aggregator / Derivatives / Options categories.	Pull ad hoc	Rebuilt 2025-10-19 from `arbitrum_llama_exchanges.csv` for easier joins (source CSV generated via API pull).
`datasets/arbitrum_exchange_sources.csv`	Combined view of Portal + DeFiLlama exchanges with `sources` flag.	Derived	Regenerate after refreshing either upstream dataset to track coverage gaps.
`datasets/arbitrum_lending_markets.csv`	Lending/CDP venues on Arbitrum with TVL + borrowed balances, audit coverage, and oracle support.	Pull ad hoc	Generated 2025-10-19 via `update_market_datasets.py`; derive liquidation watchlists and oracle dependencies.
`datasets/arbitrum_bridges.csv`	Bridge + cross-chain routing protocols exposing Arbitrum liquidity with share-of-TVL metrics.	Pull ad hoc	Generated 2025-10-19 via `update_market_datasets.py`; baseline for cross-domain arbitrage monitoring.
`reports/simulation/latest/summary.md`	Most recent profitability simulation output.	Per simulation run	Use as baseline for comparing new opportunity vectors.
`reports/simulation/latest/summary.json`	Machine-readable KPIs from latest simulation.	Per simulation run	Ingest into notebooks for longitudinal analysis.
`reports/ci/`	CI pipeline logs (lint, gosec, etc.).	Per pipeline run	Useful when correlating security changes with profitability regressions.
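
The combined Portal + DeFiLlama view with a `sources` flag can be approximated by joining on a normalised project name. The sketch below is not the logic in `update_exchange_datasets.py`; the normalisation rule, the flag values (`portal`, `defillama`, `portal+defillama`), and the sample names are assumptions.

```python
from typing import Dict, Iterable, List


def normalise(name: str) -> str:
    """Lowercase and strip non-alphanumerics so 'Uniswap V3' == 'uniswap-v3'."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def merge_sources(portal_names: Iterable[str],
                  llama_names: Iterable[str]) -> List[Dict[str, str]]:
    """Merge two name lists and flag which upstream source lists each project."""
    merged: Dict[str, dict] = {}
    for source, names in (("portal", portal_names), ("defillama", llama_names)):
        for name in names:
            key = normalise(name)
            # Keep the first spelling seen as the display name.
            entry = merged.setdefault(key, {"name": name, "sources": []})
            if source not in entry["sources"]:
                entry["sources"].append(source)
    return [
        {"name": entry["name"], "sources": "+".join(entry["sources"])}
        for _, entry in sorted(merged.items())
    ]


rows = merge_sources(["Uniswap V3", "Camelot"], ["uniswap-v3", "GMX"])
```

Rows flagged with a single source are the coverage gaps worth reviewing after each refresh.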

Exchange Dataset Refresh Workflow

Run the following from repo root whenever Portal or DeFiLlama listings change:

# 1. Pull latest Portal catalogue (optional: step 2 also fetches it unless SKIP_PORTAL_FETCH=1)
curl -s https://portal-data.arbitrum.io/api/projects > data/raw_arbitrum_portal_projects.json

# 2. Refresh all MEV research datasets (validates prerequisites automatically)
make refresh-mev-datasets

scripts/refresh-mev-datasets.sh orchestrates the Python regenerators, fetching the latest Portal catalogue and DeFiLlama snapshot before rebuilding downstream CSVs. Set SKIP_PORTAL_FETCH=1 if you already staged a customised Portal dump; direct invocation (pull_llama_exchange_snapshot.py, update_exchange_datasets.py, update_market_datasets.py) remains available for bespoke filters.

Methodology Template

Define hypothesis & expected alpha source.
Enumerate required datasets & tooling (ETL scripts, simulations, live hooks).
Implement deterministic data extraction (commit scripts to tools/ or scripts/).
Run analysis/backtests; save notebooks or summaries under reports/research/.
Evaluate results (KPIs, risk, infrastructure requirements).
Record follow-up tasks, blockers, and owners.

Experiment Log Format

YYYY-MM-DD – <experiment title>
Hypothesis:
Setup:
Datasets:
Results:
Risks/Assumptions:
Next Steps:
Artifacts: reports/research/YYYY-MM-DD_<slug>.md
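
To keep entries consistent, a stub in the format above can be generated rather than typed by hand. This helper is a hypothetical convenience, not an existing script in the repo; the slug rule (lowercase, non-alphanumerics collapsed to hyphens) is an assumption.

```python
import re
from datetime import date

# Field labels copied from the experiment log format.
FIELDS = ("Hypothesis", "Setup", "Datasets", "Results",
          "Risks/Assumptions", "Next Steps")


def slugify(title: str) -> str:
    """Lowercase the title and collapse runs of other characters to '-'."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def log_entry(title: str, when: date) -> str:
    """Return an empty experiment log entry with the artifact path filled in."""
    stamp = when.isoformat()
    lines = [f"{stamp} – {title}"]
    lines += [f"{field}:" for field in FIELDS]
    lines.append(f"Artifacts: reports/research/{stamp}_{slugify(title)}.md")
    return "\n".join(lines)


entry = log_entry("Uniswap ↔ Camelot price divergence", date(2025, 10, 19))
```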

Repository Structure

experiments/ – Checked-in summaries of completed experiments (one markdown per study).
datasets/ – Documentation of raw/processed datasets leveraged during research.
tooling/ – Notes on scripts, notebooks, and automation supporting experiments.
reports/research/ (repo root) – Canonical location for detailed experiment artifacts referenced above.

Related datasets:

datasets/arbitrum_exchanges.md – narrative breakdown of major Arbitrum exchanges with metrics and citations.
datasets/arbitrum_exchanges.csv – structured CSV for ingesting exchange metadata (variant, category, key notes, source URL).
datasets/arbitrum_llama_exchanges.csv – DeFiLlama snapshot of all Arbitrum Dex/Derivatives/Options protocols (re-generated automatically from the protocols API).
datasets/arbitrum_portal_exchanges.csv – machine-readable Arbitrum Portal exchange list (DEX/Perps/Options/Derivatives).
datasets/arbitrum_exchange_sources.csv – merged Portal + DeFiLlama source map with gap indicators.
datasets/arbitrum_lending_markets.csv – liquidation/borrowing venue roster with Arbitrum TVL + borrowed metrics and oracle coverage.
datasets/arbitrum_bridges.csv – cross-domain bridge inventory with Arbitrum share-of-liquidity statistics for basis/opportunity tracking.
verification/arbitrum_pool_verifications.md – verification status tracker for high-priority pools/routers (link back to contract audits).

Tooling Inventory

Collection: Extend pkg/scanner, pkg/events, and custom scripts under scripts/ to ingest new pools or lending data.
Simulation: Use tools/simulation with new vector captures; document command variants.
Analytics: Prefer reproducible notebooks or Go/Polars pipelines; store outputs under reports/research/.
Security constraints: Align experiments with pkg/security (rate limiting, key usage); update TODO_AUDIT_FIX.md if additional permissions are required.

Compliance & Safety

Respect RPC provider ToS and relevant regulations (front-running, market manipulation).
Avoid storing private keys or sensitive order flow in shared logs; follow docs/6_operations/SECURITY.md.
Coordinate with stakeholders before testing intrusive strategies (e.g., sandwiching live users).

Immediate Next Actions

Inventory existing Arbitrum datasets and document access details here.
Select an initial research question (e.g., Uniswap ↔ Camelot price divergence).
Capture a baseline simulation run; archive outputs under reports/research/.
Append checklist items within this document as work progresses.

MEV & Profitability Research on Arbitrum

Purpose