Circuit Breakers—Theory

Get introduced to the concept of circuit breakers and learn why they are extremely useful in building defensive checks for scalable microservices.

What is a circuit breaker?

Circuit breakers are a pattern used in software development to improve the stability and resilience of systems. The basic idea is to place a connector between two systems that can be opened or closed depending on the health of the downstream system, i.e., the system our API calls. If the downstream system is experiencing problems or is down, the circuit breaker is opened and requests are redirected to a fallback mechanism. The fallback can vary widely depending on the desired outcome. Because it is essentially an error-handling mechanism, we could return a simple error message or, if the original operation was fetching data from the source, a stale version from the cache. This approach reduces the load on the downstream system and helps prevent cascading failures.
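
As a rough illustration of the fallback idea, here is a minimal Python sketch. It assumes a hypothetical `fetch` callable that talks to the downstream, a plain dictionary standing in for the cache, and a `circuit_open` flag supplied by the caller. None of these names come from a particular library. When the circuit is open or the call fails, it returns the stale cached value if one exists and raises a simple error otherwise.

```python
from typing import Callable, Dict


class DownstreamUnavailable(Exception):
    """Hypothetical error returned to callers when no fallback value exists."""


def fetch_with_fallback(
    key: str,
    fetch: Callable[[str], str],
    cache: Dict[str, str],
    circuit_open: bool,
) -> str:
    # Only call the downstream while the circuit is closed.
    if not circuit_open:
        try:
            value = fetch(key)
            cache[key] = value  # remember the latest good value
            return value
        except Exception:
            pass  # the call failed, so fall through to the fallback

    # Fallback 1: serve a stale value from the cache if we have one.
    if key in cache:
        return cache[key]

    # Fallback 2: nothing cached, so report a simple error.
    raise DownstreamUnavailable(f"no data available for {key!r}")
```

Which fallback is appropriate depends on the operation: stale data may be acceptable for reads, while writes usually need an explicit error.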

Let’s say our service interacts with another service that is currently struggling due to overload, infrastructure failure, or some other issue. That service’s owners are presumably trying to revive it by mitigating whatever is weighing it down. If we keep sending traffic to it in the meantime, which is pointless because the requests will fail anyway, we only make matters worse for them. It is best to give them some breathing room instead.

States of a circuit breaker

A circuit breaker essentially goes through the following states:

  1. Closed: This is the default state when we start. The connector is closed and joins us to our downstream, so all requests are passed through.

  2. Half-open (optional): We transition to this state when the error count from our downstream exceeds a certain threshold. We let a certain percentage of incoming requests through, and the rest go to our fallback mechanism. This gives our downstream a chance to stabilize in case it cannot handle the full load.

  3. Open: If errors persist or worsen even in the half-open state, we open the connector, breaking the connection between us and our downstream and preventing any calls from going through. All incoming requests are handled by our fallback mechanism.

  4. Half-open: After a timeout in the open state, we return to the half-open state and attempt recovery by letting some requests through again.

  5. Closed (success) / open (failure): Depending on the error rate in the half-open state, we move back to the closed state if requests succeed or to the open state if they keep failing.
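
The transitions above can be sketched as a small state machine. The following Python example is a simplified illustration, not a production implementation. It covers the closed, open, and half-open-after-timeout states and omits the optional throttled half-open step (state 2). The class name, the `failure_threshold` and `recovery_timeout` parameters, and the injectable `clock` are all assumptions made for this sketch.

```python
import time
from enum import Enum
from typing import Callable, TypeVar

T = TypeVar("T")


class State(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half-open"


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 3,      # assumed: consecutive failures before opening
        recovery_timeout: float = 30.0,  # assumed: seconds to stay open
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.clock = clock
        self.state = State.CLOSED
        self.failures = 0
        self.opened_at = 0.0

    def call(self, operation: Callable[[], T], fallback: Callable[[], T]) -> T:
        if self.state is State.OPEN:
            if self.clock() - self.opened_at < self.recovery_timeout:
                return fallback()  # still open: skip the downstream entirely
            self.state = State.HALF_OPEN  # timeout elapsed: allow a trial call

        try:
            result = operation()
        except Exception:
            self._on_failure()
            return fallback()
        self._on_success()
        return result

    def _on_failure(self) -> None:
        self.failures += 1
        # A failed trial call, or too many failures while closed, opens the circuit.
        if self.state is State.HALF_OPEN or self.failures >= self.failure_threshold:
            self.state = State.OPEN
            self.opened_at = self.clock()

    def _on_success(self) -> None:
        # Any success resets the counter and closes the circuit.
        self.failures = 0
        self.state = State.CLOSED
```

A caller would wrap each downstream request, for example `breaker.call(lambda: fetch_profile(user_id), lambda: cached_profile)`, where `fetch_profile` and `cached_profile` are hypothetical. This sketch counts consecutive failures; an error rate over a sliding window is a common alternative.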
