Apache Kafka Client Libraries
Learn about the popular clients available for Kafka.
Kafka client libraries
Kafka clients talk to brokers over a binary protocol on top of TCP that defines all client interactions as request-response messages. Instead of having client applications use this protocol directly, client libraries in multiple programming languages abstract away the protocol details and provide a high-level API for interacting with the Kafka cluster. These are made available in the form of the Producer, Consumer, and Admin APIs. The Producer and Consumer APIs allow client applications to write data to and read data from Kafka topics, respectively, while the Admin API allows client applications to manage topics, brokers, and other Kafka cluster resources.
Only the Java client is part of the core Kafka project. Clients for other languages are maintained by the community or by vendors such as Confluent. Some of the most popular ones are for C/C++, Python, Go, Node.js, and .NET.
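To make the request-response framing concrete, here is a minimal Python sketch of the kind of encoding a client library performs on an application's behalf. It assumes the layout of the protocol's non-flexible request header (API key, API version, correlation ID, and client ID, preceded by a 4-byte size) and uses an ApiVersions version 0 request, whose body is empty. The names encode_request and decode_request_header are hypothetical helpers for illustration, not part of any Kafka library.

```python
import struct

# API key of the ApiVersions request in the Kafka protocol.
API_VERSIONS_KEY = 18


def encode_request(api_key: int, api_version: int, correlation_id: int,
                   client_id: str, body: bytes = b"") -> bytes:
    """Build one size-prefixed request frame (hypothetical helper).

    Assumed header layout, all big-endian:
    int16 api_key, int16 api_version, int32 correlation_id,
    then client_id as an int16 length followed by UTF-8 bytes.
    """
    client = client_id.encode("utf-8")
    header = struct.pack(">hhih", api_key, api_version,
                         correlation_id, len(client)) + client
    payload = header + body
    # Every request is preceded by its total size as an int32.
    return struct.pack(">i", len(payload)) + payload


def decode_request_header(frame: bytes) -> dict:
    """Parse the size prefix and header fields back out of a frame."""
    (size,) = struct.unpack_from(">i", frame, 0)
    api_key, api_version, correlation_id, id_len = struct.unpack_from(
        ">hhih", frame, 4)
    # Fixed fields occupy bytes 4..13, so the client ID starts at offset 14.
    client_id = frame[14:14 + id_len].decode("utf-8")
    return {
        "size": size,
        "api_key": api_key,
        "api_version": api_version,
        "correlation_id": correlation_id,
        "client_id": client_id,
    }


if __name__ == "__main__":
    frame = encode_request(API_VERSIONS_KEY, 0, 1, "demo")
    print(frame.hex())
    print(decode_request_header(frame))
```

A client library repeats this kind of work for every request and response, which is why applications rarely need to deal with the wire format themselves.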
C/C++ client
librdkafka is a C/C++ client library for Kafka. This client is quite important because several clients for other languages, such as the Confluent clients for .NET, Go, and Python, are built on top of it. Since these libraries are wrappers around librdkafka, they benefit from its performance and stability. Changes to the protocol or the Kafka cluster are implemented once in librdkafka and then become available to all the clients that use it.
This brings us to the concept of native and wrapper clients.
Native vs. wrapper clients
Native clients implement the Kafka protocol directly in the language they are intended for, without relying on a shared library written in another language (for example, the Java client). Wrapper clients expose an API in one language but delegate the protocol work to a client library written in another. For example, confluent-kafka-python exposes a Python API and wraps librdkafka.
In the Kafka ecosystem, it's common to see a mix of native and wrapper clients, even for the same language. Each approach has trade-offs. Wrapper clients carry an added dependency (on librdkafka) that can make them harder to install and use, while native clients are usually easier to install. Wrapper clients tend to keep up with the latest protocol changes and are more likely to support newer features, because a feature implemented once in librdkafka becomes available to every wrapper. Performance is less clear-cut: a wrapper benefits from librdkafka's optimized C implementation but pays some overhead for each call that crosses the language boundary, so the outcome depends on the language and the workload.
Here is a table highlighting these differences:
Native vs. Wrapper Client Libraries
Attribute | Native library | Wrapper library |
Ease of use | Easier to install and use | Harder to install and use because of the added librdkafka dependency |
Latest features | Takes more time, since each new client-side Kafka feature needs its own native implementation | More likely to support newer features, since they are implemented once in librdkafka |
Performance | Avoids the overhead of calling into another library, but depends on the quality of the implementation in that language | Benefits from librdkafka's optimized implementation, with some overhead for invoking it |
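The distinction can be illustrated with a small Python sketch that has nothing Kafka-specific in it. One function is implemented entirely in Python (the "native" approach), while the other is a thin wrapper that delegates to the C standard library's strlen through ctypes (the "wrapper" approach). This is only an analogy under stated assumptions: it expects a POSIX system where the C library can be loaded, the function names are hypothetical, and real wrapper clients bind to librdkafka through their own mechanisms rather than necessarily through ctypes.

```python
import ctypes
import ctypes.util


def native_length(data: bytes) -> int:
    """'Native' approach: the logic lives entirely in Python."""
    count = 0
    for byte in data:
        if byte == 0:  # stop at the first NUL byte, like C's strlen
            break
        count += 1
    return count


# 'Wrapper' approach: load a compiled C library and delegate to it.
# If find_library returns None, CDLL(None) falls back to the symbols
# already loaded into the current process (POSIX only).
_libc = ctypes.CDLL(ctypes.util.find_library("c"))
_libc.strlen.argtypes = [ctypes.c_char_p]
_libc.strlen.restype = ctypes.c_size_t


def wrapper_length(data: bytes) -> int:
    """Thin Python function that crosses into C for the real work."""
    return _libc.strlen(data)


if __name__ == "__main__":
    print(native_length(b"kafka"), wrapper_length(b"kafka"))
```

The wrapper depends on a shared library being present on the machine, which mirrors the installation trade-off in the table, while the native function runs anywhere Python does.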
Java
The Java client is part of the core Kafka project. It will be covered in detail throughout the course.
Python
The Python ecosystem has the following two popular clients for Kafka.
kafka-python: This is a native Python client for Kafka. It is a pure Python implementation that does not depend on other libraries. It is a mature client with a large community and is actively maintained. Its design mirrors that of the Java client while adding Pythonic interfaces such as consumer iterators.
confluent-kafka-python: This is a wrapper around librdkafka with high-level Producer, Consumer, and AdminClient API implementations. It is supported and actively maintained by Confluent.
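The two Python clients differ in style as well as implementation. The following sketch contrasts how a consumer might be set up with each one: kafka-python takes keyword arguments and lets you iterate over the consumer, while confluent-kafka-python takes a librdkafka-style configuration dictionary and is driven by an explicit poll loop. The broker addresses, topic, and group names are placeholders, and the helper function names are hypothetical. The client libraries are imported inside the functions so that the configuration helpers can be used without either package installed.

```python
from typing import Dict, List, Sequence


def kafka_python_kwargs(servers: Sequence[str], group_id: str) -> Dict[str, object]:
    """Keyword arguments in the style kafka-python's KafkaConsumer expects."""
    return {
        "bootstrap_servers": list(servers),
        "group_id": group_id,
        "auto_offset_reset": "earliest",
    }


def confluent_config(servers: Sequence[str], group_id: str) -> Dict[str, str]:
    """librdkafka-style configuration dictionary for confluent-kafka-python."""
    return {
        "bootstrap.servers": ",".join(servers),
        "group.id": group_id,
        "auto.offset.reset": "earliest",
    }


def consume_with_kafka_python(topic: str, servers: List[str], group_id: str) -> None:
    from kafka import KafkaConsumer  # requires the kafka-python package

    consumer = KafkaConsumer(topic, **kafka_python_kwargs(servers, group_id))
    for record in consumer:  # the consumer itself is an iterator
        print(record.value)


def consume_with_confluent(topic: str, servers: List[str], group_id: str) -> None:
    from confluent_kafka import Consumer  # requires confluent-kafka

    consumer = Consumer(confluent_config(servers, group_id))
    consumer.subscribe([topic])
    try:
        while True:
            msg = consumer.poll(1.0)  # wait up to one second for a message
            if msg is None:
                continue
            if msg.error():
                print("consumer error:", msg.error())
                continue
            print(msg.value())
    finally:
        consumer.close()
```

Both consumer functions run until interrupted and need a reachable Kafka cluster, so treat them as a starting point rather than a complete application.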
Go
Go is a popular language for working with Kafka. There are many Kafka client library options when it comes to Go, such as:
sarama: This is a native Go implementation of the Kafka protocol. It is a mature client with a large community and is actively maintained.
confluent-kafka-go: This is a lightweight wrapper around librdkafka and is supported and actively maintained by Confluent.
segmentio/kafka-go: This is a native client that provides both low-level and high-level APIs for interacting with Kafka. It mirrors concepts and implements interfaces of the Go standard library, which makes it easy to use and integrate with existing software.
franz-go: This is another native client. It supports transactions, regex topic consuming, the latest partitioning strategies, data loss detection, closest replica fetching, and more. The interesting part about the franz-go client is that it uses code generation for its protocol implementation, which makes it easier to support Kafka protocol-level additions or modifications.
.NET
confluent-kafka-dotnet: This is a wrapper around librdkafka and is supported and actively maintained by Confluent. It is derived from the rdkafka-dotnet library. This client provides the following five packages (available via the NuGet package manager):
Confluent.Kafka: This is the core client.
Confluent.SchemaRegistry.Serdes.Avro: This provides a serializer and deserializer for working with Avro-serialized data, with Confluent Schema Registry integration.
Confluent.SchemaRegistry.Serdes.Protobuf: This provides a serializer and deserializer for working with Protobuf-serialized data, with Confluent Schema Registry integration.
Confluent.SchemaRegistry.Serdes.Json: This provides a serializer and deserializer for working with JSON-serialized data, with Confluent Schema Registry integration.
Confluent.SchemaRegistry: This is a Confluent Schema Registry client (a dependency of the Confluent.SchemaRegistry.Serdes packages).
Node.js
There are a couple of clients for Node.js as well: kafka-node and kafkajs. However, the kafka-node client is outdated and no longer maintained. The kafkajs client, on the other hand, is a widely used, actively maintained native client. It supports producers, consumer groups (with pause, resume, and seek), and transactions, as well as AWS IAM-based authentication.
Conclusion
In this lesson, we explored the different types of Kafka clients (native and wrapper) and some widely used libraries across popular programming languages.