Architecture

Rate Limiting

Controlling the number of requests a client can make to an API within a given time window to prevent abuse and protect resources.

**Rate limiting** protects your system from being overwhelmed by too many requests.

**Algorithms:**

  • **Token Bucket**: Tokens refill at a fixed rate; each request consumes one token. Allows bursts up to the bucket size. The most popular choice.
  • **Leaky Bucket**: Requests are queued and processed at a fixed rate. Smooth output and strict rate enforcement.
  • **Fixed Window**: Counts requests in fixed time windows (e.g., per minute). Simple, but allows bursts at window boundaries.
  • **Sliding Window Log**: Tracks the timestamp of every request. Accurate but memory-heavy.
  • **Sliding Window Counter**: Weighted average of the current and previous windows. A good balance of accuracy and memory.

**Implementation:**

  • Use Redis for distributed rate limiting: INCR + EXPIRE in a Lua script for atomicity.
  • Key format: `rate:user_id:window` or `rate:ip:window`.
  • Return HTTP 429 Too Many Requests with a Retry-After header.

**Where to apply:** API Gateway, per-service, per-user, per-IP, per-endpoint.
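The token bucket can be sketched in a few lines of Python. This is a single-process illustration, not a distributed limiter; the class and parameter names are illustrative:

```python
import time

class TokenBucket:
    """Token bucket: tokens refill at a fixed rate, each request consumes one.
    Bursts up to `capacity` tokens are allowed."""

    def __init__(self, capacity: int, refill_rate: float):
        self.capacity = capacity        # maximum tokens (burst size)
        self.refill_rate = refill_rate  # tokens added per second
        self.tokens = float(capacity)   # start full
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A bucket of capacity 3 admits a burst of 3 requests immediately, then roughly one per `1 / refill_rate` seconds thereafter.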

Common Use Cases

  • Protecting APIs from abuse and DDoS
  • Enforcing API usage quotas per client/plan
  • Preventing brute-force login attempts
  • Controlling resource usage in multi-tenant systems
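As a sketch of the brute-force-login use case above, a fixed-window counter keyed by IP and window number is often enough. This in-process version uses a plain dict; a real deployment would keep the counters in shared state such as Redis, and all names here are hypothetical:

```python
import time
from collections import defaultdict

WINDOW_SECONDS = 60   # window length (illustrative)
MAX_ATTEMPTS = 5      # attempts allowed per IP per window (illustrative)

# (ip, window index) -> attempt count
_counters: dict = defaultdict(int)

def allow_login_attempt(ip: str, now: float = None) -> bool:
    """Fixed-window limiter: at most MAX_ATTEMPTS per IP per window."""
    now = time.time() if now is None else now
    window = int(now // WINDOW_SECONDS)   # key by the window the request falls in
    _counters[(ip, window)] += 1
    return _counters[(ip, window)] <= MAX_ATTEMPTS
```

When the limit is hit, the endpoint would respond with HTTP 429 and a Retry-After header pointing at the start of the next window.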

Advantages

  • Protects backend services from overload
  • Fair resource distribution across clients
  • Prevents abuse and scraping
  • Enables tiered pricing (different limits per plan)

Disadvantages

  • Adds complexity and slight latency
  • Distributed rate limiting requires shared state (Redis)
  • Can affect legitimate users during traffic spikes
  • Fixed windows have burst issues at boundaries
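The boundary-burst issue in the last point is what the sliding window counter mitigates: it estimates the rate as a weighted average of the previous and current fixed windows. A minimal sketch, with illustrative names:

```python
import time

class SlidingWindowCounter:
    """Estimates the request rate as prev_count * (fraction of the previous
    window still inside the sliding window) + curr_count, smoothing the
    double-burst a plain fixed window allows at boundaries."""

    def __init__(self, limit: int, window: float = 60.0):
        self.limit = limit
        self.window = window
        self.curr_idx = 0     # index of the current fixed window
        self.curr_count = 0
        self.prev_count = 0

    def allow(self, now: float = None) -> bool:
        now = time.time() if now is None else now
        idx = int(now // self.window)
        if idx != self.curr_idx:
            # Roll forward; if more than one window elapsed, previous is empty.
            self.prev_count = self.curr_count if idx == self.curr_idx + 1 else 0
            self.curr_count = 0
            self.curr_idx = idx
        elapsed = (now % self.window) / self.window  # fraction of window elapsed
        estimated = self.prev_count * (1 - elapsed) + self.curr_count
        if estimated < self.limit:
            self.curr_count += 1
            return True
        return False
```

A burst that fills the limit just before a boundary still counts (with decaying weight) just after it, so a client cannot get 2x the limit by straddling the boundary.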

Related Concepts

  • API Gateway
  • Caching
  • Distributed Systems