Resilience
Circuit Breaker Pattern
Prevents a service from repeatedly calling a failing downstream dependency. Fails fast instead of waiting for timeouts, protecting the system from cascading failures.
A **circuit breaker** monitors calls to an external service and stops making them once failures exceed a configured threshold.
**Three states:**
1. **Closed** (normal): Requests pass through. Monitor failure rate.
2. **Open** (tripped): Requests immediately fail (fast failure). No calls to downstream.
3. **Half-Open** (testing): Allow limited requests through. If they succeed, move to Closed. If they fail, move back to Open.
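The three states can be sketched as a small state machine. This is a minimal, single-threaded illustration, not a production implementation: the class and parameter names (`CircuitBreaker`, `failure_threshold`, `open_duration`, `half_open_limit`) are hypothetical, the breaker trips on a count of consecutive failures rather than a failure rate, and `half_open_limit` is interpreted as the number of successful test calls needed before closing.

```python
import time
from enum import Enum


class State(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half_open"


class CircuitOpenError(Exception):
    """Raised when a call is rejected without reaching the downstream service."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, open_duration=30.0,
                 half_open_limit=3, clock=time.monotonic):
        # Assumption: trip after N consecutive failures (simpler than a rate).
        self.failure_threshold = failure_threshold
        self.open_duration = open_duration      # seconds to stay open
        self.half_open_limit = half_open_limit  # successful trials needed to close
        self.clock = clock                      # injectable so tests can fake time
        self.state = State.CLOSED
        self.failures = 0
        self.opened_at = 0.0
        self.trial_successes = 0

    def call(self, func, *args, **kwargs):
        if self.state is State.OPEN:
            if self.clock() - self.opened_at < self.open_duration:
                # Fail fast: do not touch the downstream service.
                raise CircuitOpenError("circuit is open")
            # Open duration elapsed: let test requests through.
            self.state = State.HALF_OPEN
            self.trial_successes = 0
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._on_failure()
            raise
        self._on_success()
        return result

    def _on_success(self):
        if self.state is State.HALF_OPEN:
            self.trial_successes += 1
            if self.trial_successes >= self.half_open_limit:
                self.state = State.CLOSED
                self.failures = 0
        else:
            self.failures = 0  # a success resets the consecutive-failure count

    def _on_failure(self):
        self.failures += 1
        # Any failure while half-open reopens the circuit immediately.
        if self.state is State.HALF_OPEN or self.failures >= self.failure_threshold:
            self.state = State.OPEN
            self.opened_at = self.clock()
```

A caller would wrap each downstream request as `breaker.call(fetch_profile, user_id)` (where `fetch_profile` stands in for any function that talks to the dependency) and handle `CircuitOpenError` separately from ordinary failures. A real implementation would also need locking for concurrent callers.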
**Configuration:**
- **Failure threshold**: Number or percentage of failed calls that trips the circuit (e.g., 50% of calls failing within a 10-second window)
- **Open duration**: How long to stay open before trying half-open (e.g., 30 seconds)
- **Half-open limit**: Number of test requests in half-open state (e.g., 3)
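A percentage threshold such as "50% in 10 seconds" implies tracking outcomes over a sliding time window. The sketch below shows one way to do that. `FailureRateWindow` and its parameters are hypothetical names, and `min_calls` is an added assumption (not part of the configuration listed above) that stops a single failed call from counting as a 100% failure rate.

```python
import time
from collections import deque


class FailureRateWindow:
    """Tracks call outcomes over a sliding time window."""

    def __init__(self, threshold=0.5, window_seconds=10.0,
                 min_calls=5, clock=time.monotonic):
        self.threshold = threshold            # e.g. 0.5 means 50% failures
        self.window_seconds = window_seconds  # e.g. the last 10 seconds
        self.min_calls = min_calls            # assumed guard against tiny samples
        self.clock = clock
        self._outcomes = deque()              # (timestamp, succeeded) pairs

    def record(self, succeeded):
        self._outcomes.append((self.clock(), succeeded))

    def _evict_expired(self):
        # Drop outcomes that are older than the window.
        cutoff = self.clock() - self.window_seconds
        while self._outcomes and self._outcomes[0][0] < cutoff:
            self._outcomes.popleft()

    def failure_rate(self):
        self._evict_expired()
        if not self._outcomes:
            return 0.0
        failures = sum(1 for _, ok in self._outcomes if not ok)
        return failures / len(self._outcomes)

    def should_trip(self):
        self._evict_expired()
        if len(self._outcomes) < self.min_calls:
            return False
        return self.failure_rate() >= self.threshold
```

The breaker would call `record(...)` after every request in the closed state and move to open when `should_trip()` returns true. Count-based windows (the last N calls) are a common alternative to time-based ones.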
**What to do when circuit is open:**
- Return cached data (stale but available)
- Return a default/fallback value
- Return an error immediately (fail fast)
- Queue the request for later retry
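The four options above can be combined in a thin wrapper around the protected call. The following is a sketch under stated assumptions: `CircuitOpenError` is a hypothetical exception that the breaker raises when it rejects a call, and `read_with_fallback` and `write_or_queue` are illustrative helper names, not part of any library.

```python
from collections import deque


class CircuitOpenError(Exception):
    """Hypothetical error raised by a breaker that rejects a call."""


_MISSING = object()  # sentinel so that None can be a valid default value


def read_with_fallback(fetch, cache, key, default=_MISSING):
    """Return (value, source) for a read, degrading gracefully when the circuit is open."""
    try:
        value = fetch(key)
    except CircuitOpenError:
        if key in cache:
            return cache[key], "cached"   # stale but available
        if default is not _MISSING:
            return default, "default"     # default/fallback value
        raise                             # nothing to fall back on: fail fast
    cache[key] = value                    # refresh the cache on every live success
    return value, "live"


def write_or_queue(send, request, retry_queue):
    """Send a request, or queue it for a later retry when the circuit is open."""
    try:
        send(request)
        return "sent"
    except CircuitOpenError:
        retry_queue.append(request)
        return "queued"
```

Returning the source alongside the value ("live", "cached", "default") lets callers and metrics distinguish degraded responses from healthy ones, so fallbacks do not silently hide an outage. Queuing generally suits writes that can tolerate delay; something still has to drain `retry_queue` once the circuit closes.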
**Key insight:** Without a circuit breaker, every caller of a failing service waits for its full timeout. 10 threads each blocked on a 30-second timeout tie up 300 thread-seconds of capacity for a single round of calls. With the circuit open, those same calls fail in milliseconds instead.
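The arithmetic can be made concrete with a short calculation. The 10 callers and 30-second timeout come from the example above. The 5 ms fast-fail latency is an assumed figure chosen only for illustration.

```python
def blocked_ms(callers, wait_ms_per_call):
    """Total caller time tied up while waiting, in thread-milliseconds."""
    return callers * wait_ms_per_call


CALLERS = 10
TIMEOUT_MS = 30_000  # 30-second timeout from the example
FAST_FAIL_MS = 5     # assumed rejection latency when the circuit is open

without_breaker = blocked_ms(CALLERS, TIMEOUT_MS)  # 300_000 thread-ms (300 thread-seconds)
with_breaker = blocked_ms(CALLERS, FAST_FAIL_MS)   # 50 thread-ms
reduction = without_breaker // with_breaker        # 6000x less blocked time
```

Under these assumptions, each round of calls to the failing dependency blocks callers for 50 thread-milliseconds instead of 300 thread-seconds.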
**Tools**: Resilience4j (Java), Polly (.NET), Hystrix (Java; in maintenance mode, no longer actively developed), custom implementations.
Common Use Cases
- Protecting against slow or failing external APIs
- Preventing cascading failures in microservices
- Handling database connection pool exhaustion
- Managing third-party service dependencies (payment, email)
Advantages
- Prevents cascading failures across services
- Fails fast instead of blocking on timeouts
- Auto-recovers when downstream service heals
- Reduces load on struggling services
Disadvantages
- Requires tuning thresholds per service
- Can mask real issues if fallbacks hide failures
- Added complexity in request path
- Needs monitoring to know when circuits are open