# MEDIUM-001: Rate Limiting Enhancement - Detailed Fix Plan

**Issue ID:** MEDIUM-001
**Category:** Security
**Priority:** Medium
**Status:** Not Started
**Generated:** October 9, 2025
**Estimate:** 3-4 hours

## Overview

This plan enhances the rate limiting mechanisms to prevent abuse and ensure fair resource usage. The implementation will include sliding window rate limiting, distributed rate limiting support, adaptive rate limiting, and bypass detection capabilities.

## Current Implementation Issues

- Basic rate limiting in `pkg/security/keymanager.go:781-823`
- No distributed rate limiting for multiple instances
- Static rate limits that don't adapt to system load
- No detection mechanism for rate limiting bypass attempts

## Implementation Tasks

### 1. Implement Sliding Window Rate Limiting

**Task ID:** MEDIUM-001.1
**Time Estimate:** 1.5 hours
**Dependencies:** None

Replace the basic rate limiting with a sliding window implementation in `pkg/security/keymanager.go:781-823`:

- Implement the sliding window algorithm for more accurate rate limiting
- Track request timestamps per key within the sliding window
- Calculate requests per time unit dynamically
- Maintain accuracy across time boundaries

```go
import (
	"sync"
	"time"
)

// SlidingWindowRateLimiter tracks request timestamps per key and allows
// at most maxRequests within any windowSize interval.
type SlidingWindowRateLimiter struct {
	mu          sync.RWMutex
	windowSize  time.Duration
	maxRequests int
	requests    map[string][]time.Time // per-key request timestamps
}

func NewSlidingWindowRateLimiter(windowSize time.Duration, maxRequests int) *SlidingWindowRateLimiter {
	return &SlidingWindowRateLimiter{
		windowSize:  windowSize,
		maxRequests: maxRequests,
		requests:    make(map[string][]time.Time),
	}
}

func (rl *SlidingWindowRateLimiter) Allow(key string) bool {
	rl.mu.Lock()
	defer rl.mu.Unlock()

	now := time.Now()
	windowStart := now.Add(-rl.windowSize)

	// Drop this key's timestamps that have fallen outside the window.
	filtered := make([]time.Time, 0, len(rl.requests[key]))
	for _, reqTime := range rl.requests[key] {
		if reqTime.After(windowStart) {
			filtered = append(filtered, reqTime)
		}
	}

	// Record the request only if the key is still under its limit.
	if len(filtered) < rl.maxRequests {
		rl.requests[key] = append(filtered, now)
		return true
	}
	rl.requests[key] = filtered
	return false
}

// GetRemaining reports how many requests the key may still make in the
// current window.
func (rl *SlidingWindowRateLimiter) GetRemaining(key string) int {
	rl.mu.RLock()
	defer rl.mu.RUnlock()

	windowStart := time.Now().Add(-rl.windowSize)
	count := 0
	for _, reqTime := range rl.requests[key] {
		if reqTime.After(windowStart) {
			count++
		}
	}
	return rl.maxRequests - count
}
```
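A minimal usage sketch for the limiter above (the `main` wrapper and key value are illustrative, and assume the limiter code is in scope):

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	// Allow at most 3 requests per key within any 1-second window.
	limiter := NewSlidingWindowRateLimiter(time.Second, 3)

	for i := 1; i <= 5; i++ {
		fmt.Printf("request %d allowed: %v (remaining: %d)\n",
			i, limiter.Allow("key-123"), limiter.GetRemaining("key-123"))
	}
	// The first 3 calls return true; the last 2 return false until the
	// window slides past the earlier timestamps.
}
```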
### 2. Add Distributed Rate Limiting Support

**Task ID:** MEDIUM-001.2
**Time Estimate:** 1 hour
**Dependencies:** MEDIUM-001.1

Implement distributed rate limiting for multiple instances:

- Use Redis or similar for shared rate limit state
- Implement a distributed sliding window algorithm
- Handle Redis connection failures gracefully
- Provide a fallback to in-memory limiting if Redis is unavailable

```go
import (
	"fmt"
	"time"

	"github.com/go-redis/redis/v7" // assumes go-redis v7's context-free API
)

type DistributedRateLimiter struct {
	localLimiter *SlidingWindowRateLimiter
	redisClient  *redis.Client
	windowSize   time.Duration
	maxRequests  int
}

func (drl *DistributedRateLimiter) Allow(key string) bool {
	// Prefer the shared Redis-backed window when a client is configured.
	if drl.redisClient != nil {
		return drl.allowDistributed(key)
	}
	// Fall back to local, per-instance rate limiting.
	return drl.localLimiter.Allow(key)
}

func (drl *DistributedRateLimiter) allowDistributed(key string) bool {
	now := time.Now().UnixNano()
	windowStart := now - drl.windowSize.Nanoseconds()

	// A sorted set per key holds one member per request, scored by
	// timestamp, so all instances share the same sliding window.
	pipe := drl.redisClient.Pipeline()

	// Remove entries that have fallen outside the window.
	pipe.ZRemRangeByScore("rate_limit:"+key, "0", fmt.Sprintf("%d", windowStart))

	// Record the current request.
	pipe.ZAdd("rate_limit:"+key, &redis.Z{
		Score:  float64(now),
		Member: fmt.Sprintf("%d", now),
	})

	// Count the requests in the window (including this one).
	countCmd := pipe.ZCard("rate_limit:" + key)

	// Expire the set so idle keys do not accumulate in Redis.
	pipe.Expire("rate_limit:"+key, drl.windowSize)

	if _, err := pipe.Exec(); err != nil {
		// Degrade to the local limiter on any Redis error.
		return drl.localLimiter.Allow(key)
	}

	count, err := countCmd.Result()
	if err != nil {
		return drl.localLimiter.Allow(key)
	}

	return int(count) <= drl.maxRequests
}
```

### 3. Implement Adaptive Rate Limiting

**Task ID:** MEDIUM-001.3
**Time Estimate:** 1 hour
**Dependencies:** MEDIUM-001.1, MEDIUM-001.2

Create adaptive rate limiting based on system load:

- Monitor system resources (CPU, memory, network)
- Adjust rate limits based on current load
- Implement different limits for different user tiers
- Provide configurable load thresholds

```go
type AdaptiveRateLimiter struct {
	baseLimiter    *SlidingWindowRateLimiter
	systemMonitor  *SystemMonitor
	loadThresholds LoadThresholds
}

type LoadThresholds struct {
	lowLoad  int // requests per window when system load is low
	highLoad int // requests per window when system load is high
	cpuHigh  int // CPU percentage considered high
	memHigh  int // memory percentage considered high
}

func (arl *AdaptiveRateLimiter) Allow(key string) bool {
	systemLoad := arl.systemMonitor.GetSystemLoad()

	// Lower the effective limit while the system is under pressure.
	adjustedMaxRequests := arl.calculateAdjustedLimit(systemLoad)

	// Apply the adjusted limit against the limiter's persistent request
	// history; a throwaway limiter would start with an empty window and
	// always allow the request.
	return arl.baseLimiter.AllowWithLimit(key, adjustedMaxRequests)
}

func (arl *AdaptiveRateLimiter) calculateAdjustedLimit(load *SystemLoad) int {
	// Under high CPU or memory load, apply the stricter limit.
	if load.CPU > arl.loadThresholds.cpuHigh || load.Memory > arl.loadThresholds.memHigh {
		return arl.loadThresholds.highLoad
	}
	return arl.loadThresholds.lowLoad
}

// AllowWithLimit extends SlidingWindowRateLimiter (MEDIUM-001.1) with a
// per-call limit that overrides the configured maxRequests.
func (rl *SlidingWindowRateLimiter) AllowWithLimit(key string, maxRequests int) bool {
	rl.mu.Lock()
	defer rl.mu.Unlock()

	now := time.Now()
	windowStart := now.Add(-rl.windowSize)

	// Keep only timestamps still inside the window.
	filtered := make([]time.Time, 0, len(rl.requests[key]))
	for _, reqTime := range rl.requests[key] {
		if reqTime.After(windowStart) {
			filtered = append(filtered, reqTime)
		}
	}

	if len(filtered) < maxRequests {
		rl.requests[key] = append(filtered, now)
		return true
	}
	rl.requests[key] = filtered
	return false
}
```
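The adaptive limiter above depends on a `SystemMonitor` with a `GetSystemLoad` method that this plan does not define. One possible sketch, using the gopsutil library (the type and its fields are assumptions, not existing code):

```go
import (
	"github.com/shirou/gopsutil/v3/cpu"
	"github.com/shirou/gopsutil/v3/mem"
)

// SystemLoad carries the resource readings consumed by the adaptive limiter.
type SystemLoad struct {
	CPU    int // CPU utilization percentage (0-100)
	Memory int // memory utilization percentage (0-100)
}

type SystemMonitor struct{}

// GetSystemLoad samples current CPU and memory utilization. On sampling
// errors it returns zero readings, which the adaptive limiter treats as
// low load (i.e., the permissive limit applies).
func (m *SystemMonitor) GetSystemLoad() *SystemLoad {
	load := &SystemLoad{}

	if percents, err := cpu.Percent(0, false); err == nil && len(percents) > 0 {
		load.CPU = int(percents[0])
	}
	if vm, err := mem.VirtualMemory(); err == nil {
		load.Memory = int(vm.UsedPercent)
	}
	return load
}
```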
### 4. Add Rate Limiting Bypass Detection and Alerting

**Task ID:** MEDIUM-001.4
**Time Estimate:** 0.5 hours
**Dependencies:** MEDIUM-001.1, MEDIUM-001.2, MEDIUM-001.3

Implement monitoring for rate limiting bypass attempts:

- Detect unusual patterns that might indicate bypass attempts
- Log suspicious activity for analysis
- Send alerts for potential bypass attempts
- Track statistics on bypass detection

```go
// detectBypassAttempts assumes the AdaptiveRateLimiter also carries
// metrics, logger, and alertSystem fields, wired up during integration.
func (arl *AdaptiveRateLimiter) detectBypassAttempts(key string, allowed bool) {
	if allowed {
		return
	}

	// The request was blocked; record it for analysis.
	arl.metrics.IncRateLimitExceeded(key)

	// Check for a pattern of rapid consecutive blocked requests.
	if arl.isBypassPattern(key) {
		arl.logger.Warn("Potential rate limit bypass attempt detected",
			"key", key,
			"timestamp", time.Now().Unix(),
		)
		arl.alertSystem.SendAlert("Rate Limit Bypass Attempt", map[string]interface{}{
			"key":       key,
			"timestamp": time.Now().Unix(),
		})
	}
}

func (arl *AdaptiveRateLimiter) isBypassPattern(key string) bool {
	// Pattern detection could also consider:
	// - Rapid consecutive blocked requests
	// - Requests from multiple IPs using the same key
	// - Requests with unusual timing patterns
	return arl.metrics.GetBlockedRequestsPerMinute(key) > 50
}
```

## Integration with Key Manager

### Enhanced Key Manager with Rate Limiting

```go
type KeyManager struct {
	// ... existing fields
	rateLimiter *DistributedRateLimiter
	// ... other fields
}

func (km *KeyManager) SignTransaction(keyID string, tx *types.Transaction) (*types.Transaction, error) {
	// Check the rate limit before signing.
	if allowed := km.rateLimiter.Allow(keyID); !allowed {
		km.logger.Warn("Rate limit exceeded for key", "keyID", keyID)
		return nil, fmt.Errorf("rate limit exceeded for key %s", keyID)
	}

	// Perform the signing operation
	// ... existing signing logic
}
```

## Testing Strategy

- Unit tests for the sliding window algorithm (a minimal sketch appears at the end of this plan)
- Integration tests for distributed rate limiting
- Load testing to verify adaptive behavior
- Negative tests for bypass detection

## Code Review Checklist

- [ ] Sliding window algorithm implemented correctly
- [ ] Distributed rate limiting supports multiple instances
- [ ] Adaptive rate limiting responds to system load
- [ ] Bypass detection and alerting implemented
- [ ] Fallback mechanisms for Redis failures
- [ ] Performance impact is acceptable
- [ ] Tests cover all scenarios

## Rollback Strategy

If issues arise after deployment:

1. Disable distributed rate limiting (use local only)
2. Revert to the basic rate limiting implementation
3. Monitor performance and request patterns

## Success Metrics

- Accurate rate limiting with the sliding window
- Distributed rate limiting working across instances
- Adaptive rate limiting responding to system load
- Rate limit bypass attempts detected and logged
- No performance degradation beyond acceptable limits
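As referenced in the testing strategy, a minimal unit test sketch for the sliding window algorithm (test and key names are illustrative):

```go
import (
	"testing"
	"time"
)

func TestSlidingWindowRateLimiter(t *testing.T) {
	limiter := NewSlidingWindowRateLimiter(100*time.Millisecond, 2)

	// The first two requests within the window should be allowed.
	if !limiter.Allow("k") || !limiter.Allow("k") {
		t.Fatal("expected first two requests to be allowed")
	}
	// A third request inside the same window should be blocked.
	if limiter.Allow("k") {
		t.Fatal("expected third request to be blocked")
	}
	// A different key must not share the same budget.
	if !limiter.Allow("other") {
		t.Fatal("expected independent limit per key")
	}
	// Once the window slides past the earlier timestamps,
	// requests should be allowed again.
	time.Sleep(150 * time.Millisecond)
	if !limiter.Allow("k") {
		t.Fatal("expected request to be allowed after the window expires")
	}
}
```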