Dealing with Data Inconsistencies in GFS

Learn how applications and GFS deal with data inconsistencies.

Inconsistencies dealt with by applications

In the previous lesson, we saw the states of file regions after data mutations (random writes and record appends). In some cases, a file region becomes undefined or inconsistent. Let’s look at how applications using GFS deal with these cases.

Undefined regions

Undefined regions are produced when random writes execute concurrently on overlapping regions. GFS doesn't serialize concurrent writes, so applications that use random writes need to be wary of conflicting regions. There are several ways to avoid undefined regions at the application level. One is to serialize concurrent random writes to the same offset and always write fixed-length records, so that each write completely overwrites the previous one and data from multiple writes is never mixed. Concurrent writes to nonoverlapping regions, by contrast, can proceed without an issue.
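The serialization approach described above can be sketched in a few lines. This is a minimal illustration, not a GFS client: the class `FixedRecordFile`, the constant `RECORD_SIZE`, and the helper `run_demo` are hypothetical names, and an in-memory byte array stands in for a file region. It assumes all writers are threads of one process, so a single lock is enough to order them.

```python
import threading

RECORD_SIZE = 16  # assumed fixed record length in bytes (illustrative value)


class FixedRecordFile:
    """In-memory stand-in for a file region; hypothetical, not a GFS API."""

    def __init__(self, num_records: int) -> None:
        self._data = bytearray(RECORD_SIZE * num_records)
        self._lock = threading.Lock()  # orders all writes in this sketch

    def write_record(self, index: int, payload: bytes) -> None:
        if len(payload) > RECORD_SIZE:
            raise ValueError("payload exceeds the fixed record size")
        # Pad to the full slot so each write replaces the whole previous record.
        record = payload.ljust(RECORD_SIZE, b"\x00")
        offset = index * RECORD_SIZE
        with self._lock:  # only one writer touches the region at a time
            self._data[offset:offset + RECORD_SIZE] = record

    def read_record(self, index: int) -> bytes:
        offset = index * RECORD_SIZE
        with self._lock:
            return bytes(self._data[offset:offset + RECORD_SIZE])


def run_demo() -> bytes:
    """Eight threads write to the same slot; return what the slot holds afterwards."""
    region = FixedRecordFile(num_records=4)
    payloads = [f"writer-{i}".encode() for i in range(8)]
    threads = [
        threading.Thread(target=region.write_record, args=(0, p)) for p in payloads
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return region.read_record(0)
```

Whichever thread acquires the lock last wins, so the slot always holds exactly one writer's complete record rather than a mix of fragments. A lock inside one process does not coordinate separate client processes; those would need some external form of coordination, which this sketch does not attempt.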

The application using the GFS needs to somehow control the concurrent writes to the same regions by its multiple threads if it can't ...