# MEDIUM-001: Rate Limiting Enhancement - Detailed Fix Plan

**Issue ID:** MEDIUM-001
**Category:** Security
**Priority:** Medium
**Status:** Not Started
**Generated:** October 9, 2025
**Estimate:** 3-4 hours

## Overview

This plan enhances the rate limiting mechanisms to prevent abuse and ensure fair resource usage. The implementation will include sliding window rate limiting, distributed rate limiting support, adaptive rate limiting, and bypass detection capabilities.

## Current Implementation Issues

- Basic rate limiting in `pkg/security/keymanager.go:781-823`
- No distributed rate limiting for multiple instances
- Static rate limits that don't adapt to system load
- No detection mechanism for rate limiting bypass attempts

## Implementation Tasks

### 1. Implement Sliding Window Rate Limiting

**Task ID:** MEDIUM-001.1
**Time Estimate:** 1.5 hours
**Dependencies:** None

Replace the basic rate limiting with a sliding window implementation in `pkg/security/keymanager.go:781-823`:

- Implement the sliding window algorithm for more accurate rate limiting
- Track request timestamps per key within the sliding window
- Calculate requests per time unit dynamically
- Maintain accuracy across time boundaries

```go
import (
	"sync"
	"time"
)

// SlidingWindowRateLimiter tracks request timestamps per key and allows
// at most maxRequests within any windowSize interval.
type SlidingWindowRateLimiter struct {
	mu          sync.RWMutex
	windowSize  time.Duration
	maxRequests int
	requests    map[string][]time.Time // per-key request timestamps
}

func NewSlidingWindowRateLimiter(windowSize time.Duration, maxRequests int) *SlidingWindowRateLimiter {
	return &SlidingWindowRateLimiter{
		windowSize:  windowSize,
		maxRequests: maxRequests,
		requests:    make(map[string][]time.Time),
	}
}

func (rl *SlidingWindowRateLimiter) Allow(key string) bool {
	rl.mu.Lock()
	defer rl.mu.Unlock()

	now := time.Now()
	windowStart := now.Add(-rl.windowSize)

	// Drop this key's timestamps that have fallen outside the window.
	filtered := make([]time.Time, 0, len(rl.requests[key]))
	for _, reqTime := range rl.requests[key] {
		if reqTime.After(windowStart) {
			filtered = append(filtered, reqTime)
		}
	}

	// Record the request only if the key is still under its limit.
	if len(filtered) < rl.maxRequests {
		rl.requests[key] = append(filtered, now)
		return true
	}
	rl.requests[key] = filtered
	return false
}

// GetRemaining reports how many requests the key may still make in the
// current window.
func (rl *SlidingWindowRateLimiter) GetRemaining(key string) int {
	rl.mu.RLock()
	defer rl.mu.RUnlock()

	windowStart := time.Now().Add(-rl.windowSize)
	count := 0
	for _, reqTime := range rl.requests[key] {
		if reqTime.After(windowStart) {
			count++
		}
	}
	return rl.maxRequests - count
}
```
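A minimal usage sketch for the limiter above (the `main` wrapper and key value are illustrative, and assume the limiter code is in scope):

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	// Allow at most 3 requests per key within any 1-second window.
	limiter := NewSlidingWindowRateLimiter(time.Second, 3)

	for i := 1; i <= 5; i++ {
		fmt.Printf("request %d allowed: %v (remaining: %d)\n",
			i, limiter.Allow("key-123"), limiter.GetRemaining("key-123"))
	}
	// The first 3 calls return true; the last 2 return false until the
	// window slides past the earlier timestamps.
}
```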
### 2. Add Distributed Rate Limiting Support

**Task ID:** MEDIUM-001.2
**Time Estimate:** 1 hour
**Dependencies:** MEDIUM-001.1

Implement distributed rate limiting for multiple instances:

- Use Redis or similar for shared rate limit state
- Implement a distributed sliding window algorithm
- Handle Redis connection failures gracefully
- Provide a fallback to in-memory limiting if Redis is unavailable

```go
import (
	"fmt"
	"time"

	"github.com/go-redis/redis/v7" // assumes go-redis v7's context-free API
)

type DistributedRateLimiter struct {
	localLimiter *SlidingWindowRateLimiter
	redisClient  *redis.Client
	windowSize   time.Duration
	maxRequests  int
}

func (drl *DistributedRateLimiter) Allow(key string) bool {
	// Prefer the shared Redis-backed window when a client is configured.
	if drl.redisClient != nil {
		return drl.allowDistributed(key)
	}
	// Fall back to local, per-instance rate limiting.
	return drl.localLimiter.Allow(key)
}

func (drl *DistributedRateLimiter) allowDistributed(key string) bool {
	now := time.Now().UnixNano()
	windowStart := now - drl.windowSize.Nanoseconds()

	// A sorted set per key holds one member per request, scored by
	// timestamp, so all instances share the same sliding window.
	pipe := drl.redisClient.Pipeline()

	// Remove entries that have fallen outside the window.
	pipe.ZRemRangeByScore("rate_limit:"+key, "0", fmt.Sprintf("%d", windowStart))

	// Record the current request.
	pipe.ZAdd("rate_limit:"+key, &redis.Z{
		Score:  float64(now),
		Member: fmt.Sprintf("%d", now),
	})

	// Count the requests in the window (including this one).
	countCmd := pipe.ZCard("rate_limit:" + key)

	// Expire the set so idle keys do not accumulate in Redis.
	pipe.Expire("rate_limit:"+key, drl.windowSize)

	if _, err := pipe.Exec(); err != nil {
		// Degrade to the local limiter on any Redis error.
		return drl.localLimiter.Allow(key)
	}

	count, err := countCmd.Result()
	if err != nil {
		return drl.localLimiter.Allow(key)
	}

	return int(count) <= drl.maxRequests
}
```

### 3. Implement Adaptive Rate Limiting

**Task ID:** MEDIUM-001.3
**Time Estimate:** 1 hour
**Dependencies:** MEDIUM-001.1, MEDIUM-001.2

Create adaptive rate limiting based on system load:

- Monitor system resources (CPU, memory, network)
- Adjust rate limits based on current load
- Implement different limits for different user tiers
- Provide configurable load thresholds

```go
type AdaptiveRateLimiter struct {
	baseLimiter    *SlidingWindowRateLimiter
	systemMonitor  *SystemMonitor
	loadThresholds LoadThresholds
}

type LoadThresholds struct {
	lowLoad  int // requests per window when system load is low
	highLoad int // requests per window when system load is high
	cpuHigh  int // CPU percentage considered high
	memHigh  int // memory percentage considered high
}

func (arl *AdaptiveRateLimiter) Allow(key string) bool {
	systemLoad := arl.systemMonitor.GetSystemLoad()

	// Lower the effective limit while the system is under pressure.
	adjustedMaxRequests := arl.calculateAdjustedLimit(systemLoad)

	// Apply the adjusted limit against the limiter's persistent request
	// history; a throwaway limiter would start with an empty window and
	// always allow the request.
	return arl.baseLimiter.AllowWithLimit(key, adjustedMaxRequests)
}

func (arl *AdaptiveRateLimiter) calculateAdjustedLimit(load *SystemLoad) int {
	// Under high CPU or memory load, apply the stricter limit.
	if load.CPU > arl.loadThresholds.cpuHigh || load.Memory > arl.loadThresholds.memHigh {
		return arl.loadThresholds.highLoad
	}
	return arl.loadThresholds.lowLoad
}

// AllowWithLimit extends SlidingWindowRateLimiter (MEDIUM-001.1) with a
// per-call limit that overrides the configured maxRequests.
func (rl *SlidingWindowRateLimiter) AllowWithLimit(key string, maxRequests int) bool {
	rl.mu.Lock()
	defer rl.mu.Unlock()

	now := time.Now()
	windowStart := now.Add(-rl.windowSize)

	// Keep only timestamps still inside the window.
	filtered := make([]time.Time, 0, len(rl.requests[key]))
	for _, reqTime := range rl.requests[key] {
		if reqTime.After(windowStart) {
			filtered = append(filtered, reqTime)
		}
	}

	if len(filtered) < maxRequests {
		rl.requests[key] = append(filtered, now)
		return true
	}
	rl.requests[key] = filtered
	return false
}
```
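The adaptive limiter above depends on a `SystemMonitor` with a `GetSystemLoad` method that this plan does not define. One possible sketch, using the gopsutil library (the type and its fields are assumptions, not existing code):

```go
import (
	"github.com/shirou/gopsutil/v3/cpu"
	"github.com/shirou/gopsutil/v3/mem"
)

// SystemLoad carries the resource readings consumed by the adaptive limiter.
type SystemLoad struct {
	CPU    int // CPU utilization percentage (0-100)
	Memory int // memory utilization percentage (0-100)
}

type SystemMonitor struct{}

// GetSystemLoad samples current CPU and memory utilization. On sampling
// errors it returns zero readings, which the adaptive limiter treats as
// low load (i.e., the permissive limit applies).
func (m *SystemMonitor) GetSystemLoad() *SystemLoad {
	load := &SystemLoad{}

	if percents, err := cpu.Percent(0, false); err == nil && len(percents) > 0 {
		load.CPU = int(percents[0])
	}
	if vm, err := mem.VirtualMemory(); err == nil {
		load.Memory = int(vm.UsedPercent)
	}
	return load
}
```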
### 4. Add Rate Limiting Bypass Detection and Alerting

**Task ID:** MEDIUM-001.4
**Time Estimate:** 0.5 hours
**Dependencies:** MEDIUM-001.1, MEDIUM-001.2, MEDIUM-001.3

Implement monitoring for rate limiting bypass attempts:

- Detect unusual patterns that might indicate bypass attempts
- Log suspicious activity for analysis
- Send alerts for potential bypass attempts
- Track statistics on bypass detection

```go
// detectBypassAttempts assumes the AdaptiveRateLimiter also carries
// metrics, logger, and alertSystem fields, wired up during integration.
func (arl *AdaptiveRateLimiter) detectBypassAttempts(key string, allowed bool) {
	if allowed {
		return
	}

	// The request was blocked; record it for analysis.
	arl.metrics.IncRateLimitExceeded(key)

	// Check for a pattern of rapid consecutive blocked requests.
	if arl.isBypassPattern(key) {
		arl.logger.Warn("Potential rate limit bypass attempt detected",
			"key", key,
			"timestamp", time.Now().Unix(),
		)
		arl.alertSystem.SendAlert("Rate Limit Bypass Attempt", map[string]interface{}{
			"key":       key,
			"timestamp": time.Now().Unix(),
		})
	}
}

func (arl *AdaptiveRateLimiter) isBypassPattern(key string) bool {
	// Pattern detection could also consider:
	// - Rapid consecutive blocked requests
	// - Requests from multiple IPs using the same key
	// - Requests with unusual timing patterns
	return arl.metrics.GetBlockedRequestsPerMinute(key) > 50
}
```

## Integration with Key Manager

### Enhanced Key Manager with Rate Limiting

```go
type KeyManager struct {
	// ... existing fields
	rateLimiter *DistributedRateLimiter
	// ... other fields
}

func (km *KeyManager) SignTransaction(keyID string, tx *types.Transaction) (*types.Transaction, error) {
	// Check the rate limit before signing.
	if allowed := km.rateLimiter.Allow(keyID); !allowed {
		km.logger.Warn("Rate limit exceeded for key", "keyID", keyID)
		return nil, fmt.Errorf("rate limit exceeded for key %s", keyID)
	}

	// Perform the signing operation
	// ... existing signing logic
}
```

## Testing Strategy

- Unit tests for the sliding window algorithm (a minimal sketch appears at the end of this plan)
- Integration tests for distributed rate limiting
- Load testing to verify adaptive behavior
- Negative tests for bypass detection

## Code Review Checklist

- [ ] Sliding window algorithm implemented correctly
- [ ] Distributed rate limiting supports multiple instances
- [ ] Adaptive rate limiting responds to system load
- [ ] Bypass detection and alerting implemented
- [ ] Fallback mechanisms for Redis failures
- [ ] Performance impact is acceptable
- [ ] Tests cover all scenarios

## Rollback Strategy

If issues arise after deployment:

1. Disable distributed rate limiting (use local only)
2. Revert to the basic rate limiting implementation
3. Monitor performance and request patterns

## Success Metrics

- Accurate rate limiting with the sliding window
- Distributed rate limiting working across instances
- Adaptive rate limiting responding to system load
- Rate limit bypass attempts detected and logged
- No performance degradation beyond acceptable limits
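As referenced in the testing strategy, a minimal unit test sketch for the sliding window algorithm (test and key names are illustrative):

```go
import (
	"testing"
	"time"
)

func TestSlidingWindowRateLimiter(t *testing.T) {
	limiter := NewSlidingWindowRateLimiter(100*time.Millisecond, 2)

	// The first two requests within the window should be allowed.
	if !limiter.Allow("k") || !limiter.Allow("k") {
		t.Fatal("expected first two requests to be allowed")
	}
	// A third request inside the same window should be blocked.
	if limiter.Allow("k") {
		t.Fatal("expected third request to be blocked")
	}
	// A different key must not share the same budget.
	if !limiter.Allow("other") {
		t.Fatal("expected independent limit per key")
	}
	// Once the window slides past the earlier timestamps,
	// requests should be allowed again.
	time.Sleep(150 * time.Millisecond)
	if !limiter.Allow("k") {
		t.Fatal("expected request to be allowed after the window expires")
	}
}
```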