Stopping Crack Propagation

Learn the causes of the airline incident failure and some of the solutions that could have stopped the crack from propagating.

Failure modes of the airline incident

Let’s see how the design of failure modes applies to the grounded airline from before. The airline’s Core Facilities project had not planned out its failure modes. The crack started with the improper handling of the SQLException, but it could have been stopped at many other points. Let’s look at some examples, from low-level detail to high-level architecture. Because the connection pool was configured to block requesting threads when no resources were available, it eventually tied up all request-handling threads. This happened independently in each application server instance.
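To make the blocking behavior concrete, here is a minimal sketch of a pool whose checkout can either wait forever or give up after a bounded wait. The BoundedPool class and its method names are hypothetical and written only for illustration; they are not the airline's pool or any real library's API. A Semaphore stands in for the set of pooled connections.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical pool for illustration only; permits stand in for pooled connections.
final class BoundedPool {
    private final Semaphore permits;

    BoundedPool(int size) {
        this.permits = new Semaphore(size);
    }

    // Blocks the calling thread until a resource is free.
    // If resources leak and are never checked in, the caller waits forever.
    void checkoutBlocking() {
        permits.acquireUninterruptibly();
    }

    // Waits at most timeoutMillis, then returns false instead of hanging.
    // The caller can fail fast and free its request-handling thread.
    boolean checkoutWithTimeout(long timeoutMillis) {
        try {
            return permits.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            // Restore the interrupt flag and report that no resource was obtained.
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Returns a resource to the pool.
    void checkin() {
        permits.release();
    }

    // Number of resources currently free.
    int available() {
        return permits.availablePermits();
    }
}
```

With a pool of size one, a second call to checkoutWithTimeout returns false after the timeout, while a second call to checkoutBlocking would hold its thread until someone calls checkin. A bounded wait does not fix the underlying leak, but it keeps one exhausted pool from consuming every request-handling thread.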

The pool could have been configured to create more connections if it was exhausted. It also could have ...