mev-beta/docs/planning/07_MEDIUM-002_Input_Validation_Enhancement_Plan.md

MEDIUM-002: Input Validation Strengthening - Detailed Fix Plan

Issue ID: MEDIUM-002
Category: Security
Priority: Medium
Status: Not Started
Generated: October 9, 2025
Estimate: 4-5 hours

Overview

This plan strengthens input validation throughout the codebase to prevent injection attacks, buffer overflows, and other security vulnerabilities. It has three focus areas: stricter ABI decoding validation, bounds checking for all external data, and sanitization of values before they are written to logs.

Current Implementation Issues

  • Insufficient validation in ABI decoding and parsing modules
  • Missing bounds checking for external data
  • Potential for injection attacks through unvalidated inputs
  • Lack of comprehensive input sanitization for log messages

Implementation Tasks

1. Enhance ABI Decoding Validation Throughout Parsing Modules

Task ID: MEDIUM-002.1
Time Estimate: 1.5 hours
Dependencies: None

Strengthen ABI decoding validation with comprehensive checks:

  • Validate function signatures against known function selectors
  • Check input parameter types match expected schema
  • Validate length of dynamic parameters
  • Implement bounds checking for array parameters
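  • Add strict validation of encoded data

// ABIValidator checks calldata against an allow-list of function selectors.
type ABIValidator struct {
    knownFunctions map[string]bool // keyed by lowercase hex selector without "0x", e.g. "a9059cbb"
    maxLengths     map[string]int  // max length per parameter type, e.g. "string", "bytes"
}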

func (av *ABIValidator) ValidateFunctionCall(encodedData []byte) error {
    if len(encodedData) < 4 {
        return fmt.Errorf("encoded data too short for function selector")
    }
    
    // Extract function selector (first 4 bytes)
    selector := hex.EncodeToString(encodedData[:4])
    
    // Validate function selector against known functions
    if !av.knownFunctions[selector] {
        return fmt.Errorf("unknown function selector: %s", selector)
    }
    
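    // Validate remaining data length. Standard (non-packed) ABI encoding pads
    // arguments to 32-byte words, so the payload after the selector must be a
    // whole number of words. Zero words is valid for functions with no arguments.
    if (len(encodedData)-4)%32 != 0 {
        return fmt.Errorf("argument data length %d is not a multiple of 32 bytes", len(encodedData)-4)
    }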
    
    return nil
}

func (av *ABIValidator) ValidateParameter(param interface{}, paramType string) error {
    switch paramType {
    case "address":
        if addr, ok := param.(common.Address); ok {
            if addr == (common.Address{}) {
                return fmt.Errorf("invalid empty address")
            }
        } else {
            return fmt.Errorf("invalid address type")
        }
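    case "uint256":
        val, ok := param.(*big.Int)
        if !ok || val == nil {
            return fmt.Errorf("invalid uint256 type")
        }
        if val.Sign() < 0 {
            return fmt.Errorf("negative value for unsigned type")
        }
        // A uint256 occupies at most 256 bits
        if val.BitLen() > 256 {
            return fmt.Errorf("value exceeds uint256 maximum")
        }
    case "string":
        str, ok := param.(string)
        if !ok {
            return fmt.Errorf("invalid string type")
        }
        if len(str) > av.maxLengths["string"] {
            return fmt.Errorf("string parameter exceeds maximum length of %d", av.maxLengths["string"])
        }
    case "bytes":
        // Decoded ABI bytes arrive as []byte, not string. This assumes
        // maxLengths has a "bytes" entry; a missing key reads as 0 and
        // rejects every non-empty value.
        b, ok := param.([]byte)
        if !ok {
            return fmt.Errorf("invalid bytes type")
        }
        if len(b) > av.maxLengths["bytes"] {
            return fmt.Errorf("bytes parameter exceeds maximum length of %d", av.maxLengths["bytes"])
        }
    default:
        return fmt.Errorf("unsupported parameter type: %s", paramType)
    }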
    
    return nil
}
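
The fuzz test in task 4 calls `NewABIValidator()`, which this plan does not define. A minimal sketch of such a constructor follows, using only the standard library. The three ERC-20 selectors, the length limits, and the `KnownSelector` helper are illustrative assumptions, not the project's actual allow-list or API.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// ABIValidator mirrors the struct from the plan.
type ABIValidator struct {
	knownFunctions map[string]bool
	maxLengths     map[string]int
}

// NewABIValidator returns a validator seeded with an illustrative allow-list.
// A real deployment would load the selectors of every DEX function it parses.
func NewABIValidator() *ABIValidator {
	return &ABIValidator{
		knownFunctions: map[string]bool{
			"a9059cbb": true, // transfer(address,uint256)
			"095ea7b3": true, // approve(address,uint256)
			"23b872dd": true, // transferFrom(address,address,uint256)
		},
		maxLengths: map[string]int{
			"string": 1024, // assumed limit, tune per deployment
			"bytes":  4096, // assumed limit, tune per deployment
		},
	}
}

// KnownSelector (hypothetical helper) returns the hex selector of calldata
// and an error if the selector is missing or not on the allow-list.
func (av *ABIValidator) KnownSelector(calldata []byte) (string, error) {
	if len(calldata) < 4 {
		return "", fmt.Errorf("calldata too short for function selector: %d bytes", len(calldata))
	}
	sel := hex.EncodeToString(calldata[:4]) // lowercase, no "0x" prefix
	if !av.knownFunctions[sel] {
		return sel, fmt.Errorf("unknown function selector: %s", sel)
	}
	return sel, nil
}
```

Keying the map by the lowercase hex string matches what `hex.EncodeToString` produces in `ValidateFunctionCall`, so lookups need no further normalization.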

2. Add Comprehensive Bounds Checking for External Data

Task ID: MEDIUM-002.2
Time Estimate: 1.5 hours
Dependencies: MEDIUM-002.1

Implement bounds checking for all external data inputs:

  • Validate array lengths before processing
  • Check string lengths against maximum allowed values
  • Verify numeric ranges for expected parameters
  • Implement size limits for contract data
  • Add validation for transaction parameters

type BoundsChecker struct {
    maxArrayLength    int
    maxStringLength   int
    maxTransactionGas uint64
    maxBlockNumber    *big.Int
}

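// ValidateArrayBounds enforces maxArrayLength on []interface{} and []byte.
// Any other type, including typed slices such as []common.Address, passes
// unchecked, so callers must convert such values or extend the switch.
func (bc *BoundsChecker) ValidateArrayBounds(data interface{}) error {
    switch v := data.(type) {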
    case []interface{}:
        if len(v) > bc.maxArrayLength {
            return fmt.Errorf("array length %d exceeds maximum allowed %d", 
                len(v), bc.maxArrayLength)
        }
    case []byte:
        if len(v) > bc.maxArrayLength {
            return fmt.Errorf("byte array length %d exceeds maximum allowed %d", 
                len(v), bc.maxArrayLength)
        }
    }
    return nil
}

func (bc *BoundsChecker) ValidateTransactionLimits(tx *types.Transaction) error {
    // Validate gas limit
    if tx.Gas() > bc.maxTransactionGas {
        return fmt.Errorf("gas limit %d exceeds maximum allowed %d", 
            tx.Gas(), bc.maxTransactionGas)
    }
    
    // Validate gas price is reasonable
    gasPrice := tx.GasPrice()
    if gasPrice != nil && gasPrice.Cmp(big.NewInt(100000000000)) > 0 { // 100 gwei
        return fmt.Errorf("gas price %s exceeds reasonable maximum", gasPrice.String())
    }
    
    // Validate value is not excessive
    value := tx.Value()
    maxEth := new(big.Int).Exp(big.NewInt(10), big.NewInt(21), nil) // 1000 ETH in wei
    if value != nil && value.Cmp(maxEth) > 0 {
        return fmt.Errorf("transaction value %s exceeds reasonable maximum", value.String())
    }
    
    return nil
}
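
Bounds checks matter most where a length or offset comes straight from calldata, because slicing with an attacker-chosen value panics in Go. The sketch below reads one dynamic `bytes` argument from standard ABI-encoded arguments and validates the offset, the declared length, and the remaining data before slicing. `readWord`, `ReadDynamicBytes`, and the `maxLen` parameter are hypothetical names for illustration, not part of the existing parser.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// readWord reads the 32-byte big-endian word at byte offset off and returns it
// as a uint64. It fails if the word lies outside data or exceeds 64 bits.
func readWord(data []byte, off uint64) (uint64, error) {
	n := uint64(len(data))
	if off > n || n-off < 32 {
		return 0, fmt.Errorf("word at offset %d is outside %d bytes of data", off, n)
	}
	word := data[off : off+32]
	for _, b := range word[:24] {
		if b != 0 {
			return 0, errors.New("value does not fit in 64 bits")
		}
	}
	return binary.BigEndian.Uint64(word[24:]), nil
}

// ReadDynamicBytes returns the dynamic `bytes` argument whose offset is stored
// in head slot headIndex. args is the calldata without the 4-byte selector.
func ReadDynamicBytes(args []byte, headIndex int, maxLen uint64) ([]byte, error) {
	if headIndex < 0 || headIndex >= len(args)/32 {
		return nil, fmt.Errorf("head slot %d is out of range", headIndex)
	}
	offset, err := readWord(args, uint64(headIndex)*32)
	if err != nil {
		return nil, err
	}
	length, err := readWord(args, offset)
	if err != nil {
		return nil, err
	}
	if length > maxLen {
		return nil, fmt.Errorf("declared length %d exceeds limit %d", length, maxLen)
	}
	// readWord succeeded, so offset+32 <= len(args) and cannot overflow.
	start := offset + 32
	remaining := uint64(len(args)) - start
	if remaining < length {
		return nil, fmt.Errorf("declared length %d overruns the remaining %d bytes", length, remaining)
	}
	return args[start : start+length], nil
}
```

Each comparison is written as a subtraction from the known data length rather than an addition to the untrusted value, which avoids integer overflow when the declared offset or length is close to the 64-bit maximum.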

3. Implement Input Sanitization for Log Messages

Task ID: MEDIUM-002.3
Time Estimate: 0.5 hours
Dependencies: MEDIUM-002.1

Add sanitization for potentially unsafe data in log messages:

  • Sanitize addresses, private keys, and other sensitive data
  • Remove or mask potentially harmful content
  • Implement safe logging functions
  • Prevent log injection attacks

// sensitiveHex matches 32-byte hex values (possible private keys or hashes)
// and 20-byte addresses. The longer alternative must come first: Go's regexp
// prefers the leftmost alternative, so the reverse order would match only the
// first 40 digits of a key and leave the remaining 24 in the log.
// Compiled once at package level instead of on every call.
var sensitiveHex = regexp.MustCompile(`0x[a-fA-F0-9]{64}|0x[a-fA-F0-9]{40}`)

func SanitizeForLog(data string) string {
    // Escape line breaks so untrusted input cannot forge extra log lines
    data = strings.ReplaceAll(data, "\n", "\\n")
    data = strings.ReplaceAll(data, "\r", "\\r")

    // Mask the middle of addresses and 32-byte values, keeping only the
    // first and last four hex digits. This is a simplified example; consider
    // redacting 32-byte values entirely if they may be private keys.
    return sensitiveHex.ReplaceAllStringFunc(data, func(match string) string {
        return "0x" + match[2:6] + "..." + match[len(match)-4:]
    })
}

// Safe structured logging function
func SafeLog(l *Logger, level string, msg string, keyvals ...interface{}) {
    safeKeyvals := make([]interface{}, len(keyvals))
    for i := 0; i < len(keyvals); i++ {
        if i%2 == 0 {
            // Key is even-indexed, expect string
            safeKeyvals[i] = keyvals[i]
        } else {
            // Value at odd index, sanitize if string
            if str, ok := keyvals[i].(string); ok {
                safeKeyvals[i] = SanitizeForLog(str)
            } else {
                safeKeyvals[i] = keyvals[i]
            }
        }
    }
    
    l.Log(level, msg, safeKeyvals...)
}
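
`SanitizeForLog` escapes only `\n` and `\r`, yet other control characters (ANSI escape sequences, backspace, NUL) can also corrupt terminal output or confuse log parsers. A minimal sketch of a broader escaper, assuming it would run before address masking; `EscapeControl` is a hypothetical name:

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// EscapeControl replaces every Unicode control character with a visible
// escape sequence so that untrusted input cannot alter log structure.
func EscapeControl(s string) string {
	var b strings.Builder
	for _, r := range s {
		switch {
		case r == '\n':
			b.WriteString(`\n`)
		case r == '\r':
			b.WriteString(`\r`)
		case r == '\t':
			b.WriteString(`\t`)
		case unicode.IsControl(r):
			// Remaining C0/C1 control characters and DEL
			fmt.Fprintf(&b, `\x%02x`, r)
		default:
			b.WriteRune(r)
		}
	}
	return b.String()
}
```

This sketch covers only the Unicode control category. Characters such as U+2028 (line separator) are not in that category and would need an explicit case if the log consumer treats them as line breaks.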

4. Create Fuzzing Test Suite for All Input Validation Functions

Task ID: MEDIUM-002.4
Time Estimate: 1 hour
Dependencies: MEDIUM-002.1, MEDIUM-002.2, MEDIUM-002.3

Develop comprehensive fuzzing tests for all input validation functions:

  • Fuzz ABI decoding functions with random inputs
  • Test bounds checking with extreme values
  • Validate sanitization functions against malicious inputs
  • Implement property-based tests for validation logic

func FuzzABIValidation(f *testing.F) {
    // Seed the corpus with empty and selector-only inputs
    f.Add([]byte{})                       // Empty calldata
    f.Add([]byte{0x00, 0x00, 0x00, 0x00}) // All-zero selector, no arguments
    f.Add([]byte{0x12, 0x34, 0x56, 0x78}) // Arbitrary selector
    f.Add([]byte{0x60, 0xFE, 0xED, 0xDE}) // Another arbitrary selector
    
    f.Fuzz(func(t *testing.T, data []byte) {
        // Test that validation doesn't panic with random data
        validator := NewABIValidator()
        _ = validator.ValidateFunctionCall(data)
    })
}

func FuzzSanitization(f *testing.F) {
    f.Add("normal string")
    f.Add("string\nwith\nnewlines")
    f.Add("0x1234567890123456789012345678901234567890") // Address format
    f.Add("0x1234567890123456789012345678901234567890123456789012345678901234") // Key format
    
    f.Fuzz(func(t *testing.T, input string) {
        // Test that sanitization doesn't panic
        result := SanitizeForLog(input)
        
        // Validate that result doesn't contain dangerous characters
        if strings.Contains(result, "\n") || strings.Contains(result, "\r") {
            t.Errorf("Sanitization failed to remove newlines from: %s", input)
        }
    })
}
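
The task list also asks for property-based tests. Go's standard library provides `testing/quick` for this, which generates random inputs and checks that a predicate holds for all of them. The sketch below checks two properties of a newline escaper: the output never contains a raw line break, and escaping is idempotent. `escapeNewlines` is a stand-in for the line-break step of `SanitizeForLog`, and `CheckSanitizerProperties` is a hypothetical helper name.

```go
package main

import (
	"strings"
	"testing/quick"
)

// escapeNewlines stands in for the line-break escaping in SanitizeForLog.
func escapeNewlines(s string) string {
	s = strings.ReplaceAll(s, "\n", `\n`)
	return strings.ReplaceAll(s, "\r", `\r`)
}

// CheckSanitizerProperties runs both properties against random strings and
// returns the first counterexample found, or nil if none was found.
func CheckSanitizerProperties() error {
	// Property 1: output never contains a raw CR or LF.
	noRawLineBreaks := func(s string) bool {
		return !strings.ContainsAny(escapeNewlines(s), "\r\n")
	}
	if err := quick.Check(noRawLineBreaks, nil); err != nil {
		return err
	}

	// Property 2: sanitizing already sanitized text changes nothing.
	idempotent := func(s string) bool {
		once := escapeNewlines(s)
		return escapeNewlines(once) == once
	}
	return quick.Check(idempotent, nil)
}
```

In a test file the same predicates can be passed to `quick.Check` directly inside a `TestXxx` function, reporting any returned error with `t.Error`.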

5. Implement Centralized Validation Framework

Task ID: MEDIUM-002.5
Time Estimate: 0.5 hours
Dependencies: MEDIUM-002.1, MEDIUM-002.2, MEDIUM-002.3

Create a centralized validation framework for consistent input validation:

  • Standardized validation interface
  • Reusable validation functions
  • Consistent error handling
  • Configuration for validation parameters

type Validator interface {
    Validate(data interface{}) error
    Sanitize(data interface{}) (interface{}, error)
}

type ValidatorChain struct {
    validators []Validator
}

func (vc *ValidatorChain) Validate(data interface{}) error {
    for _, v := range vc.validators {
        if err := v.Validate(data); err != nil {
            return fmt.Errorf("validation failed with validator %T: %w", v, err)
        }
    }
    return nil
}

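// Usage example. For this to compile, ABIValidator, BoundsChecker, and
// Sanitizer must each implement Validate and Sanitize to satisfy Validator.
// Those adapter methods and the Sanitizer type are not shown in this plan.
func ValidateTransactionInput(txData map[string]interface{}) error {
    validator := &ValidatorChain{
        validators: []Validator{
            NewABIValidator(), // a zero-value ABIValidator has no known selectors
            &BoundsChecker{
                maxArrayLength:  100,
                maxStringLength: 10000,
            },
            &Sanitizer{},
        },
    }

    return validator.Validate(txData)
}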
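
One way to plug individual checks into the chain without writing a struct per check is a function adapter, in the style of `http.HandlerFunc`. The sketch below is self-contained and simplifies the interface to `Validate` only. `ValidatorFunc`, `ByteLengthBetween`, `WordAligned`, and `NewCalldataChain` are hypothetical names, and the limits are assumptions.

```go
package main

import "fmt"

// Validator is reduced to its Validate method for this sketch.
type Validator interface {
	Validate(data interface{}) error
}

// ValidatorFunc adapts a plain function to the Validator interface.
type ValidatorFunc func(data interface{}) error

func (f ValidatorFunc) Validate(data interface{}) error { return f(data) }

// ValidatorChain runs validators in order and stops at the first failure.
type ValidatorChain struct {
	validators []Validator
}

func (vc *ValidatorChain) Validate(data interface{}) error {
	for i, v := range vc.validators {
		if err := v.Validate(data); err != nil {
			return fmt.Errorf("validator %d failed: %w", i, err)
		}
	}
	return nil
}

// ByteLengthBetween accepts []byte values whose length lies in [lo, hi].
func ByteLengthBetween(lo, hi int) Validator {
	return ValidatorFunc(func(data interface{}) error {
		b, ok := data.([]byte)
		if !ok {
			return fmt.Errorf("expected []byte, got %T", data)
		}
		if len(b) < lo || len(b) > hi {
			return fmt.Errorf("length %d outside [%d, %d]", len(b), lo, hi)
		}
		return nil
	})
}

// WordAligned accepts calldata whose arguments fill whole 32-byte words.
// It relies on an earlier validator having enforced a minimum of 4 bytes.
func WordAligned() Validator {
	return ValidatorFunc(func(data interface{}) error {
		b, ok := data.([]byte)
		if !ok {
			return fmt.Errorf("expected []byte, got %T", data)
		}
		if (len(b)-4)%32 != 0 {
			return fmt.Errorf("argument data is not a multiple of 32 bytes")
		}
		return nil
	})
}

// NewCalldataChain combines both checks; the 4096-byte cap is an assumption.
func NewCalldataChain() *ValidatorChain {
	return &ValidatorChain{validators: []Validator{
		ByteLengthBetween(4, 4096),
		WordAligned(),
	}}
}
```

Order matters here: the length check runs first, so `WordAligned` never sees fewer than four bytes.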

Implementation Details

Security Focus Areas

  • ABI decoding validation prevents malicious contract interactions
  • Bounds checking prevents buffer overflows and resource exhaustion
  • Log sanitization prevents log injection attacks
  • Comprehensive input validation prevents injection attacks

Performance Considerations

  • Validation should have minimal performance impact
  • Caching for frequently validated patterns
  • Asynchronous validation for non-critical paths
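
Caching for frequently validated patterns can be sketched as a small memoizing wrapper. The example assumes validation verdicts are pure functions of a string key (for example a selector or address) and uses a crude size bound instead of true LRU eviction. `ResultCache` and its methods are hypothetical names.

```go
package main

import "sync"

// ResultCache memoizes validation verdicts for repeated inputs.
type ResultCache struct {
	mu      sync.Mutex
	max     int
	results map[string]bool
}

// NewResultCache returns a cache that holds at most max verdicts.
func NewResultCache(max int) *ResultCache {
	return &ResultCache{max: max, results: make(map[string]bool)}
}

// Check returns the cached verdict for key, calling validate only on a miss.
func (c *ResultCache) Check(key string, validate func(string) bool) bool {
	c.mu.Lock()
	defer c.mu.Unlock()

	if verdict, hit := c.results[key]; hit {
		return verdict
	}
	if len(c.results) >= c.max {
		// Crude eviction: drop everything rather than track recency.
		c.results = make(map[string]bool)
	}
	verdict := validate(key)
	c.results[key] = verdict
	return verdict
}
```

The lock is held while `validate` runs, which keeps the sketch simple but serializes slow validations. If that shows up in profiles, a sharded map or a per-key lock is a common next step.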

Testing Strategy

  • Unit tests for each validation function
  • Integration tests with real contract data
  • Fuzzing tests for robustness
  • Property-based testing for validation logic
  • Negative tests with malicious inputs

Code Review Checklist

  • All external inputs are validated before processing
  • Bounds checking implemented for arrays and strings
  • ABI decoding validation prevents malicious inputs
  • Log sanitization prevents injection attacks
  • Fuzzing tests implemented for all validation functions
  • Error handling is consistent and informative
  • Performance impact is measured and acceptable

Rollback Strategy

If issues arise after deployment:

  1. Temporarily disable enhanced validation
  2. Revert to basic validation mechanisms
  3. Monitor for any processing failures
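
Step 1 is easiest when enhanced validation sits behind a runtime switch, so it can be turned off without a redeploy. A minimal sketch, assuming Go 1.19 or later for `atomic.Bool`; the names `SetEnhancedValidation` and `ValidateCalldata`, and the split between basic and enhanced checks, are illustrative assumptions:

```go
package main

import (
	"errors"
	"sync/atomic"
)

// enhancedValidation gates the stricter checks. It defaults to on.
var enhancedValidation atomic.Bool

func init() { enhancedValidation.Store(true) }

// SetEnhancedValidation toggles the stricter checks at runtime, for example
// from an admin endpoint or a configuration reload.
func SetEnhancedValidation(on bool) { enhancedValidation.Store(on) }

// ValidateCalldata always applies the basic check and applies the enhanced
// check only while the switch is on.
func ValidateCalldata(data []byte) error {
	// Basic validation: never disabled.
	if len(data) < 4 {
		return errors.New("calldata too short for function selector")
	}
	if !enhancedValidation.Load() {
		return nil
	}
	// Enhanced validation: arguments must fill whole 32-byte words.
	if (len(data)-4)%32 != 0 {
		return errors.New("argument data is not a whole number of 32-byte words")
	}
	return nil
}
```

Keeping the basic check outside the switch means a rollback relaxes validation without removing it.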

Success Metrics

  • Zero successful injection attacks through validated inputs
  • All input validation tests pass consistently
  • No performance degradation beyond acceptable thresholds
  • Proper error handling for all validation failures
  • Successful detection of malicious inputs