Conclusion
Review the topics covered in this course and learn about the next steps.
Congratulations! You have successfully completed the Kafka Streams for Software Developers course.
What we learned
We started by learning that the need for a real-time stream processing API on top of Kafka’s Consumer and Producer APIs led to the creation of Kafka Streams. We covered some of Kafka Streams’ strengths, such as being a lightweight library, its scalability, and fault tolerance. We also briefly covered some basic Kafka Streams concepts, such as processors and topologies.
We then focused on reinforcing our understanding of Kafka concepts and tools, which are essential for working with Kafka Streams. The tools included important Kafka CLI commands:

- Consuming messages using kafka-console-consumer
- Producing messages using kafka-console-producer
- Creating and inspecting topics using kafka-topics
- Inspecting consumer groups using kafka-consumer-groups
We learned about the internal data structure of a Kafka topic, a log divided into partitions, and how partitions, together with consumer groups, enable horizontal scaling and parallel consumption of Kafka data.
With a good foundation of Kafka concepts and tools, we learned about stateless processing with Kafka Streams by building an actual Kafka Streams application. Our application used a variety of stateless Kafka Streams operations; a short sketch combining a few of them follows this list:

- The filter operator decides whether a message should be further processed in a topology.
- The split and branch operators split a stream into multiple streams.
- The map and mapValues operators perform a one-to-one mapping of values.
- The flatMap and flatMapValues operators perform a one-to-many mapping of values.
- The merge operator merges multiple streams into one.
- The to operator publishes messages back to Kafka.
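For reference, here is a minimal sketch of such a topology. It is not the course's order processing application; the topic names, predicates, and mappings are made up for illustration, and it assumes a recent Kafka Streams version (2.8 or later, for the split/branch API):

```java
import java.util.Arrays;
import java.util.Map;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Branched;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Named;
import org.apache.kafka.streams.kstream.Produced;

public class StatelessOperatorsExample {

    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();

        KStream<String, String> messages =
                builder.stream("messages", Consumed.with(Serdes.String(), Serdes.String()));

        // filter: drop empty messages from further processing.
        KStream<String, String> nonEmpty = messages.filter((key, value) -> !value.isEmpty());

        // split/branch: route messages into two separate streams.
        Map<String, KStream<String, String>> branches = nonEmpty
                .split(Named.as("msg-"))
                .branch((key, value) -> value.startsWith("urgent"), Branched.as("urgent"))
                .defaultBranch(Branched.as("normal"));

        // mapValues: one-to-one transformation of each value.
        KStream<String, String> urgentUpper =
                branches.get("msg-urgent").mapValues(value -> value.toUpperCase());

        // flatMapValues: one-to-many transformation, one output record per word.
        KStream<String, String> normalWords =
                branches.get("msg-normal").flatMapValues(value -> Arrays.asList(value.split("\\s+")));

        // merge + to: combine both streams and publish the result back to Kafka.
        urgentUpper.merge(normalWords)
                   .to("processed-messages", Produced.with(Serdes.String(), Serdes.String()));

        // Print the resulting topology description.
        System.out.println(builder.build().describe());
    }
}
```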
We also learned about serialization and deserialization of messages using Serde classes and how to create unit tests for our topologies using the kafka-streams-test-utils library. Then, we learned about the three types of errors in a Kafka Streams application: entry, processor, and exit errors, and how they can be handled using configuration properties and custom handlers.
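As a reminder of how these pieces fit together, here is a minimal, self-contained sketch that exercises a small topology with TopologyTestDriver from kafka-streams-test-utils and configures the built-in LogAndContinueExceptionHandler for entry (deserialization) errors. The topology and topic names are made up for illustration:

```java
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.TestInputTopic;
import org.apache.kafka.streams.TestOutputTopic;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.TopologyTestDriver;
import org.apache.kafka.streams.errors.LogAndContinueExceptionHandler;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Produced;

public class UppercaseTopologyTest {

    public static void main(String[] args) {
        // Build a tiny topology: read strings, uppercase them, write them out.
        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("input-topic", Consumed.with(Serdes.String(), Serdes.String()))
               .mapValues(value -> value.toUpperCase())
               .to("output-topic", Produced.with(Serdes.String(), Serdes.String()));
        Topology topology = builder.build();

        // Configuration, including a handler that logs and skips deserialization (entry) errors.
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "uppercase-test");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "dummy:1234"); // not used by the test driver
        props.put(StreamsConfig.DEFAULT_DESERIALIZATION_EXCEPTION_HANDLER_CLASS_CONFIG,
                  LogAndContinueExceptionHandler.class);

        // Drive the topology without a real Kafka cluster.
        try (TopologyTestDriver driver = new TopologyTestDriver(topology, props)) {
            TestInputTopic<String, String> input =
                    driver.createInputTopic("input-topic",
                            Serdes.String().serializer(), Serdes.String().serializer());
            TestOutputTopic<String, String> output =
                    driver.createOutputTopic("output-topic",
                            Serdes.String().deserializer(), Serdes.String().deserializer());

            input.pipeInput("order-1", "hello");
            System.out.println(output.readValue()); // prints "HELLO"
        }
    }
}
```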
Then, we proceeded to learn about Kafka Streams stateful processing, first by introducing the concepts of State Store and KTable and then by introducing the reduce, aggregate, count, groupBy, and groupByKey operations. We continued with the more advanced topic of windowing, covering the tumbling, hopping, sliding, and session windows. Finally, we learned how interactive queries allow us to easily expose the data stored in state stores and how and why we should implement remote interactive queries in distributed applications.
We finished by learning how to build a Kafka Streams application with the Spring Boot framework, covering two different ways to implement the integration. We also covered how Spring Boot’s Actuator library facilitates the monitoring of Kafka Streams applications.
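As one illustrative example (a sketch assuming the spring-kafka integration path with Spring Boot auto-configuration; the class, bean, and topic names are made up, and properties such as the application ID and broker address are expected under spring.kafka.streams.* in the application configuration), a topology can be defined as a Spring bean once @EnableKafkaStreams is in place:

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Produced;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.annotation.EnableKafkaStreams;

// Assumes the spring-kafka dependency; @EnableKafkaStreams lets Spring Boot
// create and manage the KafkaStreams instance and inject a StreamsBuilder.
@Configuration
@EnableKafkaStreams
public class OrderTopologyConfig {

    @Bean
    public KStream<String, String> orderTopology(StreamsBuilder builder) {
        // Hypothetical topology: read orders, transform them, and write them back to Kafka.
        KStream<String, String> orders =
                builder.stream("orders", Consumed.with(Serdes.String(), Serdes.String()));
        orders.mapValues(String::toUpperCase)
              .to("processed-orders", Produced.with(Serdes.String(), Serdes.String()));
        return orders;
    }
}
```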
With all the knowledge and practice we gained throughout the course, we successfully built our own Kafka Streams order processing application, including stateless and stateful operations and a state store exposed using an interactive query, an impressive feat!
Next steps
Implementing a Kafka Streams application in a real-life environment deployed in production always requires extra configuration and fine-tuning. There are many configuration properties that might affect the way an application using Kafka (and Kafka Streams) behaves. A good reference can be found in the Apache Kafka developer guide.
The same is true for integrating Spring Boot with Kafka Streams. There are many ways to tweak and customize things, and as we have seen, there are multiple ways to implement the integration. The best place to look for answers when questions arise is Spring’s official documentation and tutorials. These vary between different integration methods and Spring versions, so make sure to use the correct one.
A topic that was out of scope for this course but might interest advanced users of Kafka Streams is the low-level Processor API (as opposed to the high-level DSL); however, it is not required for implementing most Kafka Streams applications.