Symptoms and causes

A crash happens when a program encounters a condition, usually an erroneous one, that it has no code to handle. The program stops abruptly, leaving a backtrace in the logs and, depending on the platform and programming language, sometimes a core file. The backtrace immediately tells us which code path was executing when the crash occurred. This code path is a victim of the erroneous condition, not necessarily its cause.

Debugging a crash means identifying the unhandled condition and working out what led to it. A program has to handle many internal and external conditions, e.g., when reading a file from disk, dereferencing a pointer, or sharing a resource among threads. Any of these can go wrong in various ways and result in a crash. Our final goal is to work back from the symptom, which is the crash, to whatever caused the condition.
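To make the idea of an unhandled condition concrete, here is a minimal Python sketch built on the file-reading case mentioned above. The function names and the `app.conf` file name are hypothetical, chosen only for illustration. The first function has no code for the "file is missing" condition, so the exception would end the program with a traceback if nothing caught it. The second function handles the same condition.

```python
import os
import tempfile

def read_text_unhandled(path):
    # No code here handles a missing file: open() raises FileNotFoundError,
    # and if nothing up the call chain catches it, the program stops
    # with a traceback.
    with open(path) as f:
        return f.read()

def read_text_handled(path, default=""):
    # Same operation, but the missing-file condition now has code to handle it.
    try:
        with open(path) as f:
            return f.read()
    except FileNotFoundError:
        return default

# A path that does not exist: a file name inside a fresh, empty temp directory.
missing_path = os.path.join(tempfile.mkdtemp(), "app.conf")

print(read_text_handled(missing_path, default="fallback"))

try:
    read_text_unhandled(missing_path)
except FileNotFoundError as exc:
    # Caught here only so the demo keeps running; uncaught, this is the crash.
    print("would have crashed with:", type(exc).__name__)
```

Handling the condition prevents the crash, but it does not explain why the file was missing in the first place. That question is what the rest of the debugging effort has to answer.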

The backtrace in the crash is a code path affected by the bug. Returning to our analogy, debugging a crash is like solving a murder mystery with just the victim’s last words. Those last words could be a name, which would be a good starting point, but more evidence is needed to secure a conviction.

The backtrace shows only where the bug surfaced, so we have to work from it back to the bug itself. The bug could be in any of the functions listed in the backtrace or somewhere else entirely. Our objective is to go after the cause of this backtrace.
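The following Python sketch illustrates how the function that introduced the bug can be absent from the backtrace. All names here (`load_settings`, `compute_deadline`, the `timeout` key) are hypothetical and exist only for this example. The bad value is produced in one function, which returns normally, and the crash happens later in another function that merely uses the value.

```python
import traceback

def load_settings():
    # Hypothetical bug: "timeout" is silently set to None instead of a number.
    return {"timeout": None}

def compute_deadline(settings, now):
    # Victim: the crash happens on this line, but the bad value
    # was produced earlier, in load_settings().
    return now + settings["timeout"]

def frames_in_crash():
    settings = load_settings()  # has already returned by the time of the crash
    try:
        compute_deadline(settings, now=100)
    except TypeError as exc:
        # Collect the function names that appear in the traceback.
        return [frame.name for frame in traceback.extract_tb(exc.__traceback__)]
    return []

crash_frames = frames_in_crash()
print(crash_frames)
```

The last frame is `compute_deadline`, the victim, while `load_settings` does not appear at all because it finished before the crash. The backtrace is the starting point of the investigation, and the question to ask is where the value that broke the last frame came from.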

Pattern to debug a crash

In this section, we’ll present a general pattern: an ordered sequence of steps to follow for debugging crashes. The goal is to identify ...
