What is Apache Kafka?

Apache Kafka is an open-source messaging system designed to be fast, scalable, and durable. At its core is a commit log. Like other messaging systems (e.g. RabbitMQ), Kafka delivers published data to consumers, but because messages are persisted in the log they can be consumed right away or later, hence its durability. The log can also be partitioned so that messages are consumed in parallel, which supports high throughput.
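The idea of a partitioned, append-only log can be illustrated with a small in-memory sketch. This is a toy model, not the Kafka API: the class PartitionedLog, its methods, and the use of a CRC32 hash to pick a partition are all assumptions made for illustration.

```python
import zlib


class PartitionedLog:
    """Toy in-memory model of one topic split into append-only partitions."""

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, key, value):
        # Assumption: a CRC32 hash of the key selects the partition.
        # The same key always maps to the same partition, so the order
        # of messages for that key is preserved.
        index = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        partition = self.partitions[index]
        partition.append(value)
        return index, len(partition) - 1  # (partition index, offset)

    def read(self, partition, offset):
        # Reading never removes messages; a reader only tracks an offset,
        # so the same data can be read immediately or again later.
        return self.partitions[partition][offset:]


log = PartitionedLog(num_partitions=3)
p1, o1 = log.append("user-1", "login")
p2, o2 = log.append("user-1", "logout")

first_read = log.read(p1, o1)
second_read = log.read(p1, o1)  # a later read sees the same messages
```

Because each partition is an independent list, separate readers could work through different partitions at the same time, which is the intuition behind parallel consumption.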

Apache Kafka also includes a producer client called Kafka Producer that can send data into the system and a consumer client called Kafka Consumer that allows clients to subscribe to topics inside of Kafka. Using these clients, or any other tool that supports the Kafka protocol, makes it easy to integrate Kafka with applications written in many programming languages.


[Figure: Kafka queue]

A producer sends messages to a specific topic in Kafka. Consumers can then subscribe to one or more topics and receive a stream of data from Kafka.

Once subscribed, a consumer receives the messages published to those topics until it unsubscribes. This ability to subscribe to topics enables message-driven development, in which systems are decoupled from one another.
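The publish/subscribe flow described above can be sketched with a minimal in-memory model. The Broker and Consumer classes here are hypothetical stand-ins, not the real Kafka clients, and the sketch assumes that a new subscriber starts reading from the beginning of each topic.

```python
from collections import defaultdict


class Broker:
    """Toy broker: keeps an append-only list of messages per topic."""

    def __init__(self):
        self.topics = defaultdict(list)

    def publish(self, topic, message):
        # The producer only names a topic; it knows nothing about consumers.
        self.topics[topic].append(message)


class Consumer:
    """Toy consumer: tracks its own read position for each subscribed topic."""

    def __init__(self, broker):
        self.broker = broker
        self.positions = {}  # topic name -> next offset to read

    def subscribe(self, *topics):
        for topic in topics:
            # Assumption: start from the first message in the topic.
            self.positions.setdefault(topic, 0)

    def unsubscribe(self, topic):
        self.positions.pop(topic, None)

    def poll(self):
        received = []
        for topic, offset in list(self.positions.items()):
            messages = self.broker.topics[topic][offset:]
            received.extend((topic, message) for message in messages)
            self.positions[topic] = offset + len(messages)
        return received


broker = Broker()
billing = Consumer(broker)
billing.subscribe("orders")
audit = Consumer(broker)
audit.subscribe("orders", "payments")

broker.publish("orders", "order-1")
broker.publish("payments", "payment-1")

billing_batch = billing.poll()
audit_batch = audit.poll()
```

Both consumers see "order-1" because each keeps its own position in the topic. Neither the producer nor the other consumer needs to know that a given subscriber exists.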

The need for Apache Kafka

Apache Kafka was created to solve a common problem for companies that deal with large-scale message processing. Before Kafka, each application used its own messaging system to send messages to other systems. This resulted in a proliferation of messaging systems, which had to be managed independently or tied together with ad hoc integration code.

It also meant that the message queues were often not geographically close to the sender and receiver, which could slow communication.

In comparison, Kafka completely decouples producer and consumer applications. Rather than acting as a simple point-to-point queue between two systems, it extends that model: producers publish messages to a topic (or to several topics) without needing to know which applications will read them.

This lets applications subscribe to topics, which makes integrations between systems easier to build, since any message written to a topic can be consumed by one or many consumers.

Benefits of Apache Kafka

Apache Kafka has many benefits over using multiple message queues. Here are some of its top benefits:

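  • Provides at-least-once delivery.
  • Persists messages to disk and replicates them across brokers, so acknowledged messages are not lost when a single broker fails (given suitable replication settings).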
  • Enables data integration between systems.
  • Enables message-driven development, where application components can be decoupled from one another.
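At-least-once delivery is worth a short illustration, because it implies that a message can occasionally be delivered more than once. The sketch below is a simplified in-memory simulation, not the Kafka consumer API: the AtLeastOnceConsumer class, the handler function, and the simulated crash are assumptions for illustration. The consumer processes a message first and records (commits) its position only afterwards, so a failure between the two steps causes the message to be processed again after a restart.

```python
class AtLeastOnceConsumer:
    """Toy consumer that commits its offset only after processing succeeds."""

    def __init__(self, log):
        self.log = log
        self.committed = 0  # offset of the next message not yet acknowledged

    def run(self, handler):
        offset = self.committed  # resume from the last committed position
        while offset < len(self.log):
            handler(self.log[offset])  # 1. process the message
            offset += 1
            self.committed = offset    # 2. then commit the new position


log = ["a", "b", "c"]
processed = []
crash_once = {"armed": True}


def handler(message):
    processed.append(message)  # the side effect happens first
    if message == "b" and crash_once["armed"]:
        crash_once["armed"] = False
        raise RuntimeError("simulated crash before the offset was committed")


consumer = AtLeastOnceConsumer(log)
try:
    consumer.run(handler)
except RuntimeError:
    pass  # the consumer "restarts" below

consumer.run(handler)  # resumes from the last committed offset
```

After the restart, message "b" is handled a second time: nothing is skipped, but duplicates are possible. Handlers in an at-least-once system are therefore usually written to be idempotent.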

Supported programming languages

Kafka has client libraries for several programming languages that allow data to be written to and read from the server. These clients are interoperable because they all speak the same Kafka protocol, so a message produced by a client in one language can be consumed by a client in another. Supported languages include Java, JavaScript, C/C++, Python, Ruby, .NET, Scala, Go, and many more.
