Data Storage
This lesson explains how Kafka can be configured to determine when to delete data and the format in which data is stored on disk. Later, the lesson discusses compressing Kafka messages.
Data retention
Kafka doesn’t hold data in perpetuity. The admin can configure Kafka to delete the messages for a topic in two ways:
- Specify a retention time after which messages are deleted.
- Specify the data size to be reached before messages are deleted.
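To make the two options concrete, here is a minimal sketch in Python of a simplified retention pass. It is a toy model, not Kafka's implementation: the `Message` class and the `apply_retention` function are hypothetical names made up for this illustration, and real Kafka removes data in larger units rather than message by message. The parameter names echo Kafka's topic-level settings `retention.ms` and `retention.bytes`.

```python
from dataclasses import dataclass


@dataclass
class Message:
    # Hypothetical stand-in for a stored record; not a Kafka client class.
    offset: int
    timestamp_ms: int
    size_bytes: int


def apply_retention(messages, now_ms, retention_ms=None, retention_bytes=None):
    """Return the messages that survive a simplified retention pass.

    `messages` must be ordered oldest first. A limit of None is treated
    as disabled in this sketch.
    """
    kept = list(messages)

    # Time-based rule: drop everything older than the retention window.
    if retention_ms is not None:
        kept = [m for m in kept if now_ms - m.timestamp_ms <= retention_ms]

    # Size-based rule: drop the oldest messages until the total fits the limit.
    if retention_bytes is not None:
        total = sum(m.size_bytes for m in kept)
        while kept and total > retention_bytes:
            total -= kept.pop(0).size_bytes

    return kept


log = [
    Message(offset=0, timestamp_ms=1_000, size_bytes=400),
    Message(offset=1, timestamp_ms=5_000, size_bytes=400),
    Message(offset=2, timestamp_ms=9_000, size_bytes=400),
]

by_time = apply_retention(log, now_ms=10_000, retention_ms=6_000)
by_size = apply_retention(log, now_ms=10_000, retention_bytes=500)
print([m.offset for m in by_time])  # offsets kept under the time limit
print([m.offset for m in by_size])  # offsets kept under the size limit
```

Note that neither rule checks whether a consumer has read a message; only age and total size decide what is kept.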
In either scenario, Kafka does not wait for consumers to read messages; it deletes them once the deletion criterion is met.
Data for a partition isn’t stored as one contiguous file. Rather, it is broken into chunks of files called segments. By default, a segment holds at most 1GB of data or a week’s worth of data, whichever limit is reached first. The segment currently being written to ...