Using Rate Limiters To Prevent Runaway Workflows
Learn how rate limiters can be used to prevent runaway workflows.
DevOps engineers can be responsible for a service made up of dozens of microservices, which in turn may run as anywhere from dozens to tens of thousands of instances in data centers around the globe. Once a service consists of more than a couple of instances, some form of rate control needs to exist to prevent a bad rollout or configuration change from causing widespread damage.
Some type of rate limiter for work with forced pause intervals is critical to prevent runaway infrastructure changes.
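As a minimal sketch of this idea, the helper below applies changes in small batches with a forced pause between batches, giving monitoring time to surface a problem before the next batch runs. The function name, batch size, and pause length are all illustrative, not from the original text:

```python
import time

def run_with_pauses(actions, batch_size=2, pause_s=60.0, sleep=time.sleep):
    """Execute zero-argument callables in small batches, pausing
    between batches so a bad change can be caught before it spreads.

    `sleep` is injectable so the pacing can be tested without waiting.
    """
    results = []
    for i, act in enumerate(actions):
        if i and i % batch_size == 0:
            sleep(pause_s)  # forced pause between batches of work
        results.append(act())
    return results
```

Injecting `sleep` is a small design choice that keeps the pacing policy testable; in production the default `time.sleep` (or a workflow engine's own wait primitive) would be used.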
Rate limiting is easy to implement, but the scope of the rate limiter depends on what our workflows are doing. For services, we may want only one type of change to happen at a time, or to limit how many instances are affected at a time.
The first type of rate limiting prevents multiple instances of a workflow type from running at the same time; for example, we might allow only one satellite disk erasure to occur at a time.
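One way to sketch this first type is a limiter that tracks which workflow types are currently running and refuses to start a second instance of the same type. The class and method names here are assumptions for illustration:

```python
import threading

class WorkflowTypeLimiter:
    """Allow at most one running workflow per workflow type.

    try_start() is non-blocking: it returns False if a workflow of
    that type is already running, so callers can reject or queue.
    """
    def __init__(self):
        self._guard = threading.Lock()   # protects the running set
        self._running = set()

    def try_start(self, workflow_type):
        with self._guard:
            if workflow_type in self._running:
                return False
            self._running.add(workflow_type)
            return True

    def finish(self, workflow_type):
        with self._guard:
            self._running.discard(workflow_type)
```

A caller would wrap each workflow run in `try_start`/`finish`; a `try_start("disk_erasure")` while another disk erasure is in flight returns `False` and the new workflow is rejected.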
The second limits the number of devices, services, and so on that can be affected concurrently; for example, we might allow only two routers in a region to be taken out of service for a firmware upgrade.
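This second type maps naturally onto a counting semaphore: each device taken out of service holds one slot, and a workflow that cannot get a slot must wait or abort. A minimal sketch, with the limit of two routers per region as an assumed example value:

```python
import threading

class ConcurrencyLimiter:
    """Cap how many devices can be affected at once, e.g., at most
    two routers in a region out for a firmware upgrade.

    acquire() is non-blocking and returns False when the limit is
    reached, so the caller can decide whether to wait or fail fast.
    """
    def __init__(self, limit):
        self._sem = threading.Semaphore(limit)

    def acquire(self):
        return self._sem.acquire(blocking=False)

    def release(self):
        self._sem.release()
```

In practice there would be one limiter per scope (per region, per cluster), typically held in a map keyed by the scope name.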
For rate limiters to be effective, it helps to have a single system that executes actions for a set of services, since that allows centralized enforcement of policies such as rate limiting.
Let's look at the simplest ...