Kafka Consumer
This lesson explains the Kafka consumer and its programmatic equivalent, the KafkaConsumer class..
We'll cover the following
Kafka consumers are instances of the class KafkaConsumer
, which is a parallel of the KafkaProducer
class used to instantiate producers. The KafkaConsumer
class requires three compulsory properties to be provided: the location of the servers bootstrap.servers
, the key deserializer key.deserializer
, and the value deserializer value.deserializer
. The deserializers convert byte arrays into Java objects.
When creating a KafkaConsumer
we can also specify a consumer group id using the property group.id
. We can create Kafka consumers that don’t belong to any consumer group, but this practice is uncommon. When subscribing to topics, the consumer has the choice to subscribe to a single topic or use a regular expression to match multiple topics.
Poll loop
The general pattern for Kafka consumers is to poll for new messages and process them in a perpetual loop, often referred to as the poll loop. Within the poll loop, the poll()
method takes a timeout interval that the consumer blocks for when there’s no data in the consumer buffer. If the interval value is set to 0
, the method returns immediately. It is essential that the consumer keeps polling the broker, as the “I am alive” heartbeats are sent as part of the poll method.
In the newer versions of Kafka, the heartbeat can be configured to send messages in-between consumer application data polling requests. In the older versions of Kafka, where the poll()
method solely sent out the heartbeats, a consumer may be presumed to be dead because of a long interval between consecutive polls.
The long interval can be caused by several factors, including a prolonged time to process the received data from the previous poll call, a garbage collection pause in the JVM, or simply an OS running on hardware with limited capabilities having a hard time scheduling consumer applications for processing.
The first time poll()
is invoked by a new consumer, the invocation is responsible for finding the GroupCoordinator
, joining the consumer group, and receiving a partition assignment.
The following code widget demonstrates the code written for a Kafka consumer that polls for events from all topics matching the regular expression datajek-*
. The poll loop appears highlighted in the widget.
Get hands-on with 1400+ tech skills courses.