Fail Fast

Learn about system failure responses, the benefits and issues of failing fast, and input validation.

Fail fast

If slow responses are worse than no response, the worst must surely be a slow failure response. Can there be any bigger waste of system resources than burning cycles and clock time only to throw away the result? If the system can determine in advance that it will fail at an operation, it’s always better to fail fast. That way, the caller doesn’t have to tie up any of its capacity waiting and can get on with other work. How can the system tell whether it will fail? Do we need Deep Learning? Don’t worry, you won’t need to hire a cadre of data scientists.

What violates a fail fast pattern

It’s actually much more mundane than that. There’s a large class of “resource unavailable” failures. For example, when a load balancer receives a connection request but not one of the servers in its service pool is functioning, it should immediately refuse the connection. Some configurations have the load balancer queue the connection request for a while in the hope that a server will become available shortly. This violates the Fail Fast pattern. The application or service can tell from the incoming request or message roughly what database connections and ...
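The load balancer case described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any real load balancer is implemented: the names `Server`, `LoadBalancer`, `route`, and `NoHealthyServerError` are hypothetical, and health is modeled as a simple boolean flag. The point is the order of operations: check whether any server can take the request first, and refuse immediately if none can, rather than queueing the request and hoping.

```python
import itertools
from dataclasses import dataclass, field


class NoHealthyServerError(Exception):
    """Raised immediately when no server in the pool can take the request."""


@dataclass
class Server:
    # Hypothetical model of a pool member; a real pool would use health checks.
    name: str
    healthy: bool = True


@dataclass
class LoadBalancer:
    servers: list[Server]
    _counter: itertools.count = field(default_factory=itertools.count, repr=False)

    def route(self, request: str) -> str:
        # Check availability before doing any work or queueing anything.
        healthy = [s for s in self.servers if s.healthy]
        if not healthy:
            # Fail fast: refuse now instead of holding the caller in a queue.
            raise NoHealthyServerError("no healthy servers in pool")
        # Simple round-robin over the currently healthy servers.
        server = healthy[next(self._counter) % len(healthy)]
        return f"{server.name} handled {request}"


if __name__ == "__main__":
    pool = LoadBalancer([Server("a", healthy=False), Server("b", healthy=False)])
    try:
        pool.route("GET /")
    except NoHealthyServerError as exc:
        # The caller learns about the failure right away and can move on.
        print(f"refused immediately: {exc}")
```

Because the refusal is raised before any waiting happens, the caller does not tie up its own capacity and can retry elsewhere or report the error.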