Demand Control

Learn how a system can crash, how sockets become a limit under heavy request load, what Ethernet has to do with it, and how slow responses lead users to retry.

We'll cover the following...

System crash
How systems fail
Sockets limitation
Ethernet
Firing a retry

System crash

In the old days of mainframes in glass houses, we could predict what the workload looked like from day to day. Operators would measure how many MIPS (millions of instructions per second) a given job needed. Those days are long gone. Most of our services are either directly or indirectly exposed to the entire world’s population.

Our daily reality is this: the world can crush our systems at any time. There’s no natural protection. We have to build it. There are two basic strategies: either refuse work or scale out. For the moment, we’ll consider when, where, and how to refuse work.

How systems fail

Every failing system starts with a queue backing up somewhere. When thinking about a request/reply workload, we need to consider the resources being consumed and the queues that form for access to those resources. That lets us decide where to cut off new requests. Each request consumes a socket on each tier it passes through. While the request is active on an instance, that instance has one fewer ephemeral socket available for new requests. In ...
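To make the idea of cutting off new requests concrete, here is a minimal sketch of refusing work at a single instance once a fixed number of requests are in flight. The names `LoadShedder`, `try_admit`, and `handle`, the limit value, and the 503-style reply are all assumptions made for illustration; they are not taken from the lesson, and a real limit would be derived from whichever resource (sockets, threads, connections) runs out first.

```python
import threading


class LoadShedder:
    """Admit at most `max_in_flight` concurrent requests and refuse the rest."""

    def __init__(self, max_in_flight: int) -> None:
        # Hypothetical limit: one slot per request we are willing to serve at once.
        self._slots = threading.BoundedSemaphore(max_in_flight)

    def try_admit(self) -> bool:
        # Non-blocking acquire: the caller is never queued, it gets an answer now.
        return self._slots.acquire(blocking=False)

    def release(self) -> None:
        # Give the slot back once the request has finished.
        self._slots.release()


def handle(shedder: LoadShedder, work):
    """Run `work` if a slot is free; otherwise return a fast refusal."""
    if not shedder.try_admit():
        # Refusing quickly avoids adding to a queue that is already backing up.
        return 503, "busy, try again later"
    try:
        return 200, work()
    finally:
        shedder.release()
```

The design choice in this sketch is to refuse immediately rather than wait: a blocked caller would still hold its socket, which is the very resource the paragraph above says each active request takes away from new ones.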
