Rate Limiting
Controlling the number of requests a client can make to an API within a given time window to prevent abuse and protect resources.
**Rate limiting** protects your system from being overwhelmed by too many requests.
**Algorithms:**
- **Token Bucket**: Tokens refill at fixed rate. Request consumes token. Allows bursts up to bucket size. Most popular.
- **Leaky Bucket**: Requests are queued and processed at a fixed rate. Smooth output, strict rate enforcement.
- **Fixed Window**: Count requests in fixed time windows (e.g., per minute). Simple, but allows bursts at window boundaries (up to 2× the limit across a boundary).
- **Sliding Window Log**: Track timestamp of each request. Accurate but memory-heavy.
- **Sliding Window Counter**: Weighted average of current and previous window. Good balance of accuracy and memory.
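The token bucket above can be sketched in a few lines. This is a minimal single-process illustration, not a distributed limiter; the class name, parameters, and the injectable clock are assumptions for demonstration.

```python
import time

class TokenBucket:
    """Token bucket sketch: tokens refill at `rate` per second, up to
    `capacity`. Each request consumes one token, so bursts up to
    `capacity` are allowed while the long-run rate stays at `rate`."""

    def __init__(self, capacity: float, rate: float, clock=time.monotonic):
        self.capacity = capacity  # max tokens (burst size)
        self.rate = rate          # tokens refilled per second
        self.tokens = capacity    # start with a full bucket
        self.clock = clock        # injectable clock, handy for testing
        self.last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self.clock()
        # Refill tokens for the elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

With `capacity=3, rate=1.0`, three back-to-back requests pass, the fourth is rejected, and one more slot opens up each second.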
**Implementation:**
- Use Redis for distributed rate limiting: run INCR + EXPIRE inside a Lua script so the check is atomic.
- Key format: `rate:user_id:window` or `rate:ip:window`
- Return HTTP 429 Too Many Requests with Retry-After header.
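The Redis pattern above can be illustrated with a fixed-window sketch. Here an in-memory dict stands in for Redis (in production the INCR + EXPIRE pair runs atomically in a Lua script); the class name and `check` signature are assumptions, and the key format follows the `rate:user_id:window` convention from the text.

```python
import time

class FixedWindowLimiter:
    """Fixed-window sketch of the Redis INCR + EXPIRE pattern.
    A dict stands in for the Redis store; each window gets its own key,
    and incrementing past the limit rejects the request."""

    def __init__(self, limit: int, window_secs: int, clock=time.time):
        self.limit = limit
        self.window_secs = window_secs
        self.clock = clock
        self.counters = {}  # key -> request count (Redis INCR equivalent)

    def check(self, client_id: str):
        window = int(self.clock()) // self.window_secs
        key = f"rate:{client_id}:{window}"       # e.g. rate:user42:28901234
        count = self.counters.get(key, 0) + 1    # INCR
        self.counters[key] = count
        if count > self.limit:
            # Caller should respond 429 Too Many Requests with this
            # value in the Retry-After header.
            retry_after = self.window_secs - int(self.clock()) % self.window_secs
            return False, retry_after
        return True, 0
```

A rejected request returns the seconds remaining in the current window, which maps directly onto the `Retry-After` header. Note the boundary-burst caveat from the algorithm list: a client can spend its full limit at the end of one window and again at the start of the next.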
**Where to apply:** API Gateway, per-service, per-user, per-IP, per-endpoint.
Common Use Cases
- Protecting APIs from abuse and DDoS
- Enforcing API usage quotas per client/plan
- Preventing brute-force login attempts
- Controlling resource usage in multi-tenant systems
Advantages
- +Protects backend services from overload
- +Fair resource distribution across clients
- +Prevents abuse and scraping
- +Can implement tiered pricing (different limits per plan)
Disadvantages
- -Adds complexity and slight latency
- -Distributed rate limiting requires shared state (Redis)
- -Can affect legitimate users during traffic spikes
- -Fixed windows have burst issues at boundaries