Resilience

Circuit Breaker Pattern

Prevents a service from repeatedly calling a failing downstream dependency. Fails fast instead of waiting for timeouts, protecting the system from cascading failures.

A **circuit breaker** monitors calls to an external service and stops making them when failures exceed a threshold.

**Three states:**

1. **Closed** (normal): Requests pass through while the breaker monitors the failure rate.
2. **Open** (tripped): Requests fail immediately (fast failure). No calls reach the downstream service.
3. **Half-Open** (testing): A limited number of requests are allowed through. If they succeed, the breaker moves to Closed. If they fail, it moves back to Open.

**Configuration:**

- **Failure threshold**: Number or percentage of failures that trips the breaker (e.g., 50% in 10 seconds)
- **Open duration**: How long to stay open before trying half-open (e.g., 30 seconds)
- **Half-open limit**: Number of test requests allowed in the half-open state (e.g., 3)

**What to do when the circuit is open:**

- Return cached data (stale but available)
- Return a default/fallback value
- Return an error immediately (fail fast)
- Queue the request for a later retry

**Key insight:** Without a circuit breaker, every caller of a failing service waits for its timeout. 10 threads each blocked on a 30-second timeout tie up 300 thread-seconds of capacity. With the circuit open, the same calls fail in milliseconds instead.

**Tools**: Resilience4j (Java), Polly (.NET), Hystrix (Java, in maintenance mode), custom implementations.
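The three states and the configuration values described above can be sketched as a small state machine. This is a minimal, single-threaded illustration rather than how any of the listed libraries work internally. The class name `CircuitBreaker`, the `min_calls` guard (which avoids tripping on a single early failure), and the injectable `clock` are assumptions introduced for the example.

```python
import time
from collections import deque
from enum import Enum


class State(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half_open"


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    # Hypothetical sketch: not thread-safe, and all names are illustrative.
    def __init__(self, failure_rate=0.5, window_seconds=10.0, min_calls=5,
                 open_seconds=30.0, half_open_limit=3, clock=time.monotonic):
        self.failure_rate = failure_rate        # e.g. 50% ...
        self.window_seconds = window_seconds    # ... within 10 seconds
        self.min_calls = min_calls              # assumption: minimum sample size
        self.open_seconds = open_seconds        # open duration
        self.half_open_limit = half_open_limit  # successful trials needed to close
        self.clock = clock
        self.state = State.CLOSED
        self._results = deque()   # (timestamp, succeeded) pairs in the window
        self._opened_at = 0.0
        self._trial_successes = 0

    def call(self, func, *args, fallback=None, **kwargs):
        now = self.clock()
        if self.state is State.OPEN:
            if now - self._opened_at < self.open_seconds:
                # Fail fast: use the fallback if given, otherwise raise.
                if fallback is not None:
                    return fallback()
                raise CircuitOpenError("circuit is open")
            # Open duration elapsed: let trial requests through.
            self.state = State.HALF_OPEN
            self._trial_successes = 0
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._on_failure(self.clock())
            raise
        self._on_success(self.clock())
        return result

    def _record(self, now, succeeded):
        self._results.append((now, succeeded))
        # Drop results that have aged out of the sliding window.
        while self._results and now - self._results[0][0] > self.window_seconds:
            self._results.popleft()

    def _trip(self, now):
        self.state = State.OPEN
        self._opened_at = now
        self._results.clear()

    def _on_failure(self, now):
        if self.state is State.HALF_OPEN:
            self._trip(now)  # a failed trial reopens the circuit
            return
        self._record(now, False)
        total = len(self._results)
        failures = sum(1 for _, ok in self._results if not ok)
        if total >= self.min_calls and failures / total >= self.failure_rate:
            self._trip(now)

    def _on_success(self, now):
        if self.state is State.HALF_OPEN:
            self._trial_successes += 1
            if self._trial_successes >= self.half_open_limit:
                self.state = State.CLOSED
                self._results.clear()
            return
        self._record(now, True)
```

A caller might wrap a dependency as `breaker.call(fetch_price, item_id, fallback=lambda: cached_price)`, where `fetch_price` and `cached_price` are hypothetical. In this sketch the fallback is used only when the circuit rejects the call; errors from the dependency itself still propagate to the caller. A production breaker would also need locking around the state changes.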

Common Use Cases

  • Protecting against slow or failing external APIs
  • Preventing cascading failures in microservices
  • Handling database connection pool exhaustion
  • Managing third-party service dependencies (payment, email)

Advantages

  • Prevents cascading failures across services
  • Fails fast instead of blocking on timeouts
  • Auto-recovers when the downstream service heals
  • Reduces load on struggling services

Disadvantages

  • Requires tuning thresholds per service
  • Can mask real issues if fallbacks hide failures
  • Adds complexity to the request path
  • Needs monitoring to know when circuits are open