Testing Resiliency
Learn how to test resiliency, an Elixir and Erlang feature that doesn’t lend itself to testing, and how to test cleanup after crashes.
Testing process crashes
Why does a process crash? Most of the time, it’s because of an unexpected error. In some cases, the best thing to do might be to raise an exception or exit from a process for an error condition that we know can happen (an expected error). However, we find that this tends to be the exception rather than the rule: if the error is expected, we likely want to handle it gracefully (think of a user input error). In cases where we’re raising or exiting on purpose, it might make sense to test that behavior.
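As a minimal sketch of testing an intentional exit, consider the ExUnit test below. The `JobRunner` module, its `run/1` function, and the `:malformed_job` exit reason are hypothetical names made up for illustration; they are not part of the lesson's codebase. The first test checks the exit reason when the function is called directly, and the second checks that a separate process goes down with that same reason.

```elixir
ExUnit.start()

# Hypothetical module (an assumption for illustration): it exits on purpose
# when it receives an input we know can be invalid.
defmodule JobRunner do
  def run(:malformed), do: exit(:malformed_job)
  def run(job), do: {:ok, job}
end

defmodule JobRunnerTest do
  use ExUnit.Case, async: true

  test "exits with the documented reason when called directly" do
    # catch_exit/1 evaluates the expression and returns the exit reason.
    assert catch_exit(JobRunner.run(:malformed)) == :malformed_job
  end

  test "a separate process goes down with the documented reason" do
    # spawn_monitor/1 monitors without linking, so the exit does not
    # propagate to the test process.
    {pid, ref} = spawn_monitor(fn -> JobRunner.run(:malformed) end)

    assert_receive {:DOWN, ^ref, :process, ^pid, :malformed_job}
  end
end
```

Monitoring rather than linking is a deliberate choice in this sketch: a linked process that exits with a non-normal reason would also bring down the test process unless the test traps exits.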
Regardless, one of the most powerful features of the OTP architecture is that if a process bumps into an unexpected error and crashes, there will likely be a supervisor bringing it back up. That’s not behavior we want to test: supervisors work, and they have been tested with both automated and in-the-field techniques for decades now.
We don’t really want to test whether processes can crash. But if they crash because of an unexpected error, how do we test that when the error itself is unexpected?
So, what do we test?
We don’t want to test that processes are restarted if they crash and we don’t want to test that processes can crash because of unexpected ...