Using Overload Prevention Mechanisms
Learn mechanisms to prevent overwhelming networks and services.
When we have a small set of services, misbehaving applications generally cause small problems. A data center usually has more than enough spare network capacity to absorb a badly behaving application, and with only a few services, it is usually easy to work out which one is causing an issue.
When we run a large number of applications, our network and machines are usually oversubscribed, meaning they cannot handle every application running at 100% at the same time. Oversubscription is common in networks and clusters because it controls costs. It works because most applications' use of network bandwidth, central processing unit (CPU), and memory ebbs and flows, so they rarely all peak at once.
An application that hits a bug can go into a retry loop that quickly overwhelms a service. In addition, if a catastrophic event takes a service offline, bringing it back online can knock it down again as it is overwhelmed by the requests that have queued up on all of its clients.
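To make the retry problem concrete, here is a minimal Python sketch of one way a client can avoid hammering a struggling service: it waits longer after each failure, up to a cap, and randomizes each wait so that clients that failed together do not all retry at the same moment. The names `backoff_delays` and `call_with_backoff`, and all default values, are illustrative assumptions rather than part of any particular library.

```python
import random
import time


def backoff_delays(retries=4, base=0.1, factor=2.0, cap=10.0, rng=random.random):
    """Yield one wait time (in seconds) per retry: capped exponential growth with jitter."""
    for attempt in range(retries):
        ceiling = min(cap, base * factor ** attempt)
        # Jitter: pick a random point below the ceiling so that many
        # clients do not retry in lockstep.
        yield rng() * ceiling


def call_with_backoff(op, retries=4, base=0.1, factor=2.0, cap=10.0,
                      sleep=time.sleep, rng=random.random):
    """Call op() and retry on failure, waiting a growing, jittered delay between tries."""
    delays = backoff_delays(retries, base, factor, cap, rng)
    while True:
        try:
            return op()
        except Exception:  # assumption: a real client would retry only transient errors
            delay = next(delays, None)
            if delay is None:
                raise  # retry budget exhausted: surface the last error
            sleep(delay)
```

With the defaults, a failing call is attempted at most five times (one initial try plus four retries), and the waits are drawn from ranges that top out at 0.1, 0.2, 0.4, and 0.8 seconds. The `sleep` and `rng` parameters exist only so the behavior can be checked without real waiting.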
Worse is what can happen to the network. If the network becomes overwhelmed, or if cloud devices have their queries per second (QPS) limits exceeded, traffic from other applications can be adversely affected. This can mask the true cause of our problems.
There are several ways of preventing these types of problems, with the two most common being the following:
Circuit breakers
Backoff implementations
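As a rough illustration of the first mechanism, the following Python sketch shows the general shape of a circuit breaker: after a run of consecutive failures it stops sending requests for a cool-down period, then lets a single probe call through to see whether the service has recovered. The class name `CircuitBreaker`, the `CircuitOpenError` exception, and the thresholds are assumptions chosen for this example. It is not thread-safe and is not a production implementation.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without contacting the service."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # cool-down in seconds before a probe
        self.clock = clock
        self.failures = 0
        self.opened_at = None             # None means the circuit is closed

    def call(self, op):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Open: fail fast so the service gets no extra load.
                raise CircuitOpenError("circuit open; request not sent")
            # Cool-down elapsed (half-open): let this call through as a probe.
        try:
            result = op()
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.max_failures:
                self.opened_at = self.clock()  # open, or re-open after a failed probe
            raise
        # Success: close the circuit and reset the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

A client would wrap each outbound request, for example `breaker.call(lambda: send_request())`, where `send_request` is a hypothetical function standing in for the real call. While the circuit is open, callers get an immediate `CircuitOpenError` instead of adding load to a service that is already struggling.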
...