Rate Limiting
Limits how many requests a client may send to a service in a given time span. Common algorithms: Token Bucket, Leaky Bucket, Fixed Window, and Sliding Window.
Why Rate Limiting?
Protects services from abuse, DoS attacks, and runaway clients. It comes up in virtually every API gateway design question.
Algorithms
Token Bucket
A bucket holds up to N tokens. Each request consumes one token; requests that find the bucket empty are rejected. Tokens replenish at a fixed rate, so sustained throughput is capped at the refill rate while bursts up to the bucket capacity are allowed.
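A minimal in-process sketch of the idea (class and parameter names are my own; a production limiter would also need locking or per-key buckets):

```python
import time

class TokenBucket:
    """Token bucket: up to `capacity` tokens, refilled at `refill_rate` tokens/sec."""

    def __init__(self, capacity: int, refill_rate: float):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.tokens = float(capacity)   # start full, so an initial burst is allowed
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Add tokens accrued since the last call, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

With `capacity=3`, three back-to-back requests pass (the burst), and the fourth is rejected until the refill rate supplies another token.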
Leaky Bucket
Requests enter a queue and are processed at a fixed rate. Smooths out bursts but introduces latency.
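The queue-based variant above needs a worker draining it; the logic is easier to show as the equivalent "leaky bucket as meter," which drops rather than delays excess requests (this sketch and its names are my own):

```python
import time

class LeakyBucket:
    """Leaky bucket (meter variant): a virtual queue of at most `capacity`
    requests, drained at a constant `leak_rate` requests/sec."""

    def __init__(self, capacity: int, leak_rate: float):
        self.capacity = capacity
        self.leak_rate = leak_rate
        self.water = 0.0                # requests currently "in the bucket"
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Leak at the fixed rate; the level can never go below empty.
        self.water = max(0.0, self.water - (now - self.last) * self.leak_rate)
        self.last = now
        if self.water < self.capacity:
            self.water += 1.0
            return True
        return False                    # bucket full: request is rejected
```

Unlike the token bucket, output never exceeds the leak rate, which is exactly the smoothing property described above.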
Fixed Window Counter
Count requests per time window (e.g., 100 req/min). Simple, but it has a boundary spike problem: a client can send the full limit at the very end of one window and again at the start of the next, so up to twice the limit (200 requests at 100 req/min) can slip through in a short span straddling the boundary.
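A sketch with an injectable clock (my own names), which also makes the boundary spike easy to demonstrate: with a limit of 2 per 60 s, two requests at t=59 and two more at t=61 all pass, because they land in different windows:

```python
import time

class FixedWindowCounter:
    """Fixed-window limiter: count requests per aligned window, reset on rollover."""

    def __init__(self, limit: int, window_secs: float, clock=time.monotonic):
        self.limit = limit
        self.window = window_secs
        self.clock = clock              # injectable for testing
        self.window_start = None
        self.count = 0

    def allow(self) -> bool:
        now = self.clock()
        start = now - (now % self.window)   # align windows to multiples of window
        if start != self.window_start:
            self.window_start = start       # new window: reset the counter
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False
```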
Sliding Window Log
Store a timestamp for each request. Count requests in the last N seconds. Accurate but memory-intensive at scale.
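A sketch of the log approach (names are my own). Memory cost is visible in the code: one deque entry per in-window request, per client:

```python
import collections
import time

class SlidingWindowLog:
    """Exact sliding window: keep one timestamp per request, count the
    timestamps that fall within the last `window_secs` seconds."""

    def __init__(self, limit: int, window_secs: float, clock=time.monotonic):
        self.limit = limit
        self.window = window_secs
        self.clock = clock                  # injectable for testing
        self.log = collections.deque()      # timestamps, oldest first

    def allow(self) -> bool:
        now = self.clock()
        # Evict timestamps that have aged out of the window.
        while self.log and self.log[0] <= now - self.window:
            self.log.popleft()
        if len(self.log) < self.limit:
            self.log.append(now)
            return True
        return False
```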
Sliding Window Counter
Hybrid: fixed window + weighted previous window. Best accuracy/memory trade-off for production systems.
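A sketch of the weighted estimate (my own names): the previous window's count is scaled by how much of it still overlaps the trailing window, then added to the current count. Only two counters per client are stored, versus one timestamp per request for the log:

```python
import time

class SlidingWindowCounter:
    """Approximate sliding window: current window count plus the previous
    window count weighted by its remaining overlap with the trailing window."""

    def __init__(self, limit: int, window_secs: float, clock=time.monotonic):
        self.limit = limit
        self.window = window_secs
        self.clock = clock              # injectable for testing
        self.cur_start = 0.0
        self.cur = 0
        self.prev = 0

    def allow(self) -> bool:
        now = self.clock()
        start = now - (now % self.window)
        if start != self.cur_start:
            # Roll windows; if more than one window elapsed, previous is empty.
            self.prev = self.cur if start - self.cur_start == self.window else 0
            self.cur = 0
            self.cur_start = start
        # Fraction of the previous window still inside the trailing window.
        overlap = 1.0 - (now - start) / self.window
        if self.prev * overlap + self.cur < self.limit:
            self.cur += 1
            return True
        return False
```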
Distributed Rate Limiting
For multi-node APIs, keep counters in a shared store such as Redis: INCR on a per-client, per-window key with a TTL gives atomic counters, and a Lua script makes the check-and-increment a single atomic operation, preventing race conditions between nodes.
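The pattern can be sketched as follows. To keep the example self-contained and runnable, a tiny in-memory class stands in for Redis; a real deployment would issue the same INCR-plus-EXPIRE pair as one Lua script (via redis-py's `register_script`) against a shared Redis instance. The key format and function names here are illustrative assumptions:

```python
import time

class FakeRedis:
    """Minimal in-memory stand-in for Redis INCR + EXPIRE (illustration only)."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.store = {}                 # key -> [count, expires_at]

    def incr_with_ttl(self, key: str, ttl: float) -> int:
        # Mirrors the usual Lua script: INCR the key, setting its TTL on
        # first increment. On real Redis, doing both in one script is atomic.
        now = self.clock()
        entry = self.store.get(key)
        if entry is None or now >= entry[1]:
            entry = [0, now + ttl]      # expired or absent: start a new window
            self.store[key] = entry
        entry[0] += 1
        return entry[0]

def allow(redis, client_id: str, limit: int = 100, window_secs: float = 60.0) -> bool:
    """Fixed-window distributed limiter: one shared counter per client per window."""
    count = redis.incr_with_ttl(f"rl:{client_id}", window_secs)
    return count <= limit
```

Because the counter lives in the shared store, every API node sees the same count, so the limit holds across the whole fleet rather than per node.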