Evaluation of Kafka

Let's recap how Kafka fulfills its promised functionalities.

We'll cover the following...

Performance improvements
- Producer throughput
  - Batch processing
- Consumer throughput
Scalability and distribution support
Data retention
Conclusion
- System design wisdom in Kafka

Kafka promised to collect data efficiently from multiple producers in parallel, retain that data, and deliver it to multiple consumers simultaneously. Moreover, it promised to deliver large volumes of data in real time. Let's look at the evidence for these claims by comparing Kafka's performance with that of Apache ActiveMQ, a popular open-source implementation of the Java Message Service (JMS), and RabbitMQ, a messaging system known for its performance.

All the computational results and timings stated in the text below were obtained on two Linux machines, each with eight 2 GHz cores, 16 GB of memory, and six disks in RAID 10. (Many of the performance numbers and graphs in this lesson are inspired by the paper: Kreps, Jay, Neha Narkhede, and Jun Rao. "Kafka: A distributed messaging system for log processing." In Proceedings of the NetDB, vol. 11, pp. 1-7. 2011.) The two machines are connected through a 1 Gb (gigabit) network link. One is deployed as a broker, and the other acts as either the producer or the consumer, depending on the experiment. Although this experimental setup might seem small, Kafka extracts high throughput from it. Because Kafka is horizontally scalable, it is reasonable to extrapolate these numbers to a larger setup (for example, for back-of-the-envelope calculations).
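As a rough illustration of such a back-of-the-envelope calculation, the sketch below computes how many messages per second a network link could carry at most, and then scales a per-broker figure to a larger cluster. The function names, the 200-byte message size, and the `efficiency` factor are assumptions made for illustration; they are not taken from this lesson, and real clusters rarely scale perfectly linearly.

```python
def link_bound_msgs_per_sec(link_gbps: float, msg_bytes: int) -> float:
    """Upper bound on messages/sec a link can carry, ignoring protocol overhead."""
    bytes_per_sec = link_gbps * 1e9 / 8  # convert gigabits/s to bytes/s
    return bytes_per_sec / msg_bytes


def extrapolate(per_broker_msgs_per_sec: float, brokers: int,
                efficiency: float = 1.0) -> float:
    """Scale a single-broker figure to a cluster.

    `efficiency` (hypothetical, between 0 and 1) discounts for coordination
    and replication overhead; 1.0 means ideal linear scaling.
    """
    return per_broker_msgs_per_sec * brokers * efficiency


# Assumed message size of 200 bytes on a 1 Gb link.
single_link_limit = link_bound_msgs_per_sec(1.0, 200)

# Hypothetical 10-broker cluster at 80% scaling efficiency.
cluster_estimate = extrapolate(single_link_limit, brokers=10, efficiency=0.8)

print(f"single link limit: {single_link_limit:,.0f} msgs/s")
print(f"cluster estimate:  {cluster_estimate:,.0f} msgs/s")
```

The link limit is only a ceiling: measured throughput on the setup above will be lower, and the measured value is the one to plug into `extrapolate` when it is available.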

Performance improvements

To evaluate Kafka's performance, we'll analyze the messages going from producers to brokers and from brokers to consumers.

Producer throughput

ActiveMQ and RabbitMQ don't have any simple way to send batched messages, so only 1 message is sent to the broker at any given time. However, if ...
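To make the cost of sending one message at a time concrete, here is a minimal sketch of producer-side batching. It is not Kafka's producer API: `BatchingSender`, `count_requests`, and the batch size of 50 are hypothetical names and values chosen for illustration. The sketch only counts how many requests reach the broker for a given batch size.

```python
from typing import Callable, List


class BatchingSender:
    """Buffers messages and hands them to `send` in groups of `batch_size`."""

    def __init__(self, send: Callable[[List[bytes]], None], batch_size: int):
        if batch_size < 1:
            raise ValueError("batch_size must be at least 1")
        self._send = send
        self._batch_size = batch_size
        self._buffer: List[bytes] = []

    def publish(self, message: bytes) -> None:
        # Accumulate locally; only contact the broker once the batch is full.
        self._buffer.append(message)
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        # Send whatever is buffered as a single request, if anything.
        if self._buffer:
            self._send(self._buffer)
            self._buffer = []


def count_requests(num_messages: int, batch_size: int) -> int:
    """Return how many requests are issued to publish `num_messages`."""
    requests: List[List[bytes]] = []
    sender = BatchingSender(requests.append, batch_size)
    for _ in range(num_messages):
        sender.publish(b"x")
    sender.flush()  # send the final, possibly partial, batch
    return len(requests)


print(count_requests(1000, 1))   # one request per message
print(count_requests(1000, 50))  # one request per 50 messages
```

With a batch size of 1, every message costs a full round trip to the broker; with a larger batch, the per-request overhead is amortized over many messages, at the price of a small delay while the batch fills.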
