# Pool Data Errors - Root Cause Analysis & Fix Plan

**Date**: November 3, 2025
**Status**: Active Investigation
**Impact**: High - Affecting opportunity executability validation

## Executive Summary

Pool data errors are preventing the system from validating opportunities as executable. Currently, **347+ opportunities have been detected but 0 are marked as executable**; all are rejected because pool reserve data cannot be fetched.

### Key Finding

**No opportunities are executable because pool validation is failing silently** rather than returning error messages that could be used for filtering.

---

## Root Cause Analysis

### 1. **Primary Blocker: RPC Connection Failures**

**Evidence:**
```
Post "https://arb1.arbitrum.io/rpc": dial tcp: lookup arb1.arbitrum.io: Temporary failure in name resolution
```

**Impact:**
- Primary RPC endpoint unreachable
- System drops to fallback mode (basic block polling)
- Pool data cannot be fetched

**Current Status:** INTERMITTENT (was happening 12:00-12:12, recovered by 14:11)

---

### 2. **Secondary: Batch Pool Data Fetch Timeouts**

**Evidence:**
```
[WARN] Batch fetch failed for 0x42FC852A750BA93D5bf772ecdc857e87a86403a9: no data returned for pool - recording failure
[WARN] Failed to fetch batch 0-1: batch fetch V3 data failed: Post "https://arb1.arbitrum.io/rpc": context deadline exceeded
```

**Root Cause:**
- Batch fetcher uses a 10-second timeout
- Network latency + RPC overload = frequent timeouts
- Pools are queried one at a time instead of in a true batch

**Affected Code:** `pkg/datafetcher/batch_fetcher.go`

**Impact:**
- Legitimate pools fail due to timeouts
- The same pools are retried repeatedly (inefficient)
- Pools are blacklisted prematurely

---

### 3. **Tertiary: Division by Zero in Smart Contracts**

**Evidence:**
```
[WARN] Failed to fetch batch 0-1: batch fetch V3 data failed: execution reverted: division or modulo by zero
```

**Root Causes:**
- Querying uninitialized/zero-liquidity pools
- Non-standard pool implementations (broken fee() function, etc.)
- Smart contract state inconsistencies on L2

**Affected Pools:** ~10-15 pools (from 2025-11-02 logs)

---

### 4. **Quaternary: Non-Standard Pool Implementations**

**Evidence:**
```
[ERROR] Error getting pool data for 0xC6962004f452bE9203591991D15f6b388e09E8D0: pool ...is blacklisted: failed to call token1() - non-standard pool contract
```

**Issue:**
- Some pools don't implement the standard Uniswap-style pool interface
- token0(), token1() calls fail
- No graceful fallback to skip these pools

**Current Handling:** Blacklisting (correct), but the error message suggests these pools could be filtered out earlier

---

## Why All Opportunities Show "Not Executable"

### Call Chain:

1. Swap event detected ✅
2. Opportunity analyzed ✅
3. **Pool validation triggered for executability check**
   - Attempts to fetch reserve data
   - RPC call fails or times out
   - **Executability marked as false (default)**
4. Opportunity logged with `isExecutable:false`

### The Critical Issue:

When pool data can't be fetched, the system **doesn't return proper error context** for intelligent filtering. Instead, it:
- Returns nil reserves
- Marks the opportunity as non-executable
- Doesn't distinguish between:
  - "Pool doesn't exist" (skip)
  - "RPC timeout" (retry)
  - "Non-standard pool" (blacklist)

---

## System Status

### Watch Script Output

- **Opportunities Detected**: 347+
- **Executable**: 0 (all failing pool validation)
- **Executions**: 0
- **Errors**: 0 (watch script filters out expected warnings)

### Logs Status

```
2025/11/03 14:16:11 - Present
✅ Watch script successfully reading logs
✅ Opportunity detection working
❌ Pool validation blocking all executions
```

---

## Solution Strategy

### Phase 1: Immediate (Next 30 minutes)

1. **Increase batch fetch timeout** from 10s to 30s
2. **Implement exponential backoff** for retry logic
3. **Add proper error context** to distinguish error types

### Phase 2: Short-term (Next hour)

1. **Fix RPC endpoint configuration** if primary is down
2. **Implement batch caching** to avoid repeated failures
3. **Add pool pre-validation** before RPC queries

### Phase 3: Medium-term (Today)

1. **Smart pool filtering** - skip known bad contracts early
2. **Improved monitoring** - track pool failure patterns
3. **Emergency fallback** - use backup RPC providers

---

## Affected Code Files

| File | Issue | Priority |
|------|-------|----------|
| `pkg/datafetcher/batch_fetcher.go` | 10s timeout, no backoff | HIGH |
| `pkg/scanner/market/scanner.go` | No error context in pool fetch | HIGH |
| `pkg/scanner/market/pool_validator.go` | Pre-validation could filter better | MEDIUM |
| `pkg/uniswap/multicall.go` | No fallback for failed calls | MEDIUM |

---

## Metrics to Track

- Pool fetch success rate (target: >95%)
- RPC timeout frequency (target: <1%)
- Pool blacklist size (current: ~10-15)
- Opportunity executability rate (current: 0%, target: >5%)

---

## Next Actions

1. Read batch fetcher timeout configuration
2. Implement improved error handling
3. Add retry logic with backoff
4. Test with current opportunity stream
5. Monitor for improvement in executability rate