Data Races

Learn about data races, including how to avoid them using a mutex and how to prevent deadlocks in synchronous and asynchronous tasks.

A data race happens when two threads access the same memory location at the same time without synchronization, and at least one of them is mutating the data. A program with a data race has undefined behavior. The compiler and optimizer assume that our code contains no data races and optimize it under that assumption, which may result in crashes or other completely surprising behavior. In other words, we can under no circumstances allow data races in our program. The compiler usually doesn’t warn us about data races, since they are hard to detect at compile time.
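As a minimal sketch of the definition above, the function below lets two threads increment a shared counter and guards every increment with a std::mutex, so the two accesses can never happen at the same time. The names count_with_mutex, counter_mutex, and work are hypothetical and chosen only for illustration. Removing the lock_guard line would turn this into exactly the kind of data race described above.

```cpp
#include <mutex>
#include <thread>

// Hypothetical helper: two threads each increment a shared counter
// `iterations` times; returns the final counter value.
int count_with_mutex(int iterations) {
    int counter = 0;
    std::mutex counter_mutex;  // protects `counter`

    auto work = [&] {
        for (int i = 0; i < iterations; ++i) {
            // Only one thread at a time can hold the lock, so the
            // read-modify-write of `counter` is never interleaved.
            std::lock_guard<std::mutex> lock(counter_mutex);
            ++counter;
        }
    };

    std::thread t1(work);
    std::thread t2(work);
    t1.join();
    t2.join();
    return counter;  // safe to read: both threads have finished
}
```

Calling count_with_mutex(100000) returns 200000 on every run, because no increment can be lost while the mutex is held.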

Note: Debugging data races can be a real challenge and sometimes requires tools such as ThreadSanitizer (from Clang) or Concurrency Visualizer (a Visual Studio extension). These tools typically instrument the code so a runtime library can detect, warn about, or visualize potential data races while running the program we are debugging.

Example: A data race

The diagram below shows two threads that are going to update an integer called counter. Imagine that these threads are both incrementing a global counter variable with the instruction ++counter. It turns out that incrementing an int might involve multiple CPU instructions. This can be done in different ways on different CPUs, but let’s pretend that ++counter generates the following made-up machine instructions:

  • R: Read counter from memory
  • +1: Increment counter
  • W: Write new counter value to memory

Now, suppose two threads each increment the counter value, which initially is 42. We expect it to become 44 after both threads have run. However, as we can see in the following figure, nothing guarantees that one thread's three instructions complete before the other thread's begin, so the two increments can interleave and one update can be lost.
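The effect of such an interleaving can be reproduced deterministically, without any real threads, by simulating the made-up R, +1, and W instructions by hand. The sketch below is a single-threaded model only; SimThread, run_sequential, and run_interleaved are hypothetical names, and the chosen interleaving is just one of several possible orderings.

```cpp
// Single-threaded model of a "thread" executing the made-up
// instructions R, +1, and W on a shared counter.
struct SimThread {
    int reg = 0;  // private register holding the value that was read

    void read(const int& counter) { reg = counter; }   // R
    void increment() { ++reg; }                        // +1
    void write(int& counter) const { counter = reg; }  // W
};

// Each thread finishes all three instructions before the other starts.
int run_sequential(int counter) {
    SimThread a, b;
    a.read(counter); a.increment(); a.write(counter);
    b.read(counter); b.increment(); b.write(counter);
    return counter;
}

// Both threads read before either one writes, so both start from the
// same old value and the second write overwrites the first.
int run_interleaved(int counter) {
    SimThread a, b;
    a.read(counter);
    b.read(counter);
    a.increment();
    b.increment();
    a.write(counter);
    b.write(counter);
    return counter;
}
```

Starting from 42, run_sequential returns 44, while run_interleaved returns 43: one of the two increments is lost.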
