Preventing Disaster
Learn about load shedding, services checking their own response time, queuing time, and processing time.
Load shedding
We can see that the best thing to do under high load is turn away work we can’t complete in time. This is called “load shedding,” and it’s the most important way to control incoming demand. Load shedding happens very quickly when a socket’s listen queue is full, and a quick rejection is better than a slow timeout.
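The idea of a quick rejection can be sketched at the application level as well. The example below is a minimal illustration, not a prescribed implementation: the `LoadShedder` class, its `max_in_flight` limit, and the `(status, body)` return shape are hypothetical names chosen for this sketch. It caps the number of requests in progress and turns away anything beyond that cap immediately instead of letting it wait.

```python
import threading


class LoadShedder:
    """Hypothetical helper: admit at most `max_in_flight` requests at once."""

    def __init__(self, max_in_flight):
        # Each in-flight request holds one slot until it finishes.
        self._slots = threading.BoundedSemaphore(max_in_flight)

    def handle(self, work):
        # Non-blocking acquire: if no slot is free, reject right away
        # rather than queuing the caller until it times out.
        if not self._slots.acquire(blocking=False):
            return 503, "too busy, try later"
        try:
            return 200, work()
        finally:
            # Free the slot whether the work succeeded or raised.
            self._slots.release()
```

A caller would wrap each request, for example `status, body = shedder.handle(lambda: process(request))`, where `process` stands in for the real work. The limit itself is an assumption that has to be tuned per service; setting it too high brings back the slow timeouts that load shedding is meant to avoid.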
More generally, we want to shed load as early as possible so we can avoid tying up resources at several tiers before rejecting the request. Load balancers near the network edge are the ideal place. A good health check on the first tier of services can inform the load balancer when response times are too high, meaning higher than the service’s SLA. The load balancer also needs to be configured to send back an HTTP 503 response code when all instances fail their health checks. That’s a quick response to the caller that says “too busy, try later.”
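One way a first-tier service might expose such a health check is sketched below. Everything here is an assumption made for illustration: the `ResponseTimeHealthCheck` class, the sliding window of 100 samples, the use of a simple average (a real service might prefer a high percentile), and the choice to report an unhealthy instance with a 503 status. The point is only that the service compares its own recent response times against its SLA and reports the result to the load balancer.

```python
from collections import deque


class ResponseTimeHealthCheck:
    """Hypothetical health check based on recent response times."""

    def __init__(self, sla_seconds, window=100):
        self.sla_seconds = sla_seconds
        # Keep only the most recent `window` samples.
        self._samples = deque(maxlen=window)

    def record(self, seconds):
        # Call this after each request with its measured duration.
        self._samples.append(seconds)

    def status(self):
        # With no traffic yet, assume the instance is healthy.
        if not self._samples:
            return 200
        average = sum(self._samples) / len(self._samples)
        # Healthy while the recent average stays within the SLA.
        return 200 if average <= self.sla_seconds else 503
```

The load balancer would poll an endpoint that returns `status()`. When every instance reports failure, the load balancer itself answers callers with the 503 described above.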