Batch vs. Stream Processing
Learn about the two main data processing methods and the key metrics for measuring data pipeline performance.
Data pipelines that deliver data from a source to a destination use one of two processing techniques:
Batch processing
Stream processing
Before discussing each, it helps to look at the kinds of systems in use today, how users interact with them, and how they process their data. We’ll also discuss the two major metrics for measuring pipeline performance: latency and throughput.
Online systems
It’s very common today to find systems where we send a request or an instruction and receive the output a short time later. Most of the systems we interact with work this way, including caches, search engines, databases, and web servers.
Online systems use a client-server architecture. They consist of one or more servers that host services, and these services send information over the internet to clients that have the required authorization. In most cases, the client is a human who makes a service request and waits for the server’s response. They’re called online systems because they’re always waiting for a request from a client.
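The request-and-response cycle described above can be sketched with Python's standard library. This is a minimal illustration, not a production setup: the `EchoHandler` service, the `request_once` helper, and the `/search?q=pipelines` path are all hypothetical names chosen for the example, and the server runs on the local machine rather than over the internet. The sketch also times the round trip, which gives a rough sense of the latency a client experiences.

```python
import threading
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHandler(BaseHTTPRequestHandler):
    """Hypothetical service: replies with the path the client asked for."""

    def do_GET(self):
        body = f"you requested {self.path}".encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet: no per-request log lines


def request_once(path="/search?q=pipelines"):
    """Start a local server, send one request, return (status, text, seconds)."""
    # Port 0 asks the operating system for any free port.
    server = HTTPServer(("127.0.0.1", 0), EchoHandler)
    port = server.server_address[1]

    # The server waits for requests in a background thread.
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()

    # An opener with an empty ProxyHandler ignores any system proxy settings.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        start = time.perf_counter()
        with opener.open(f"http://127.0.0.1:{port}{path}", timeout=5) as resp:
            status = resp.status
            text = resp.read().decode("utf-8")
        elapsed = time.perf_counter() - start  # round-trip time seen by the client
    finally:
        server.shutdown()
        server.server_close()
    return status, text, elapsed


if __name__ == "__main__":
    status, text, elapsed = request_once()
    print(status, text, f"{elapsed * 1000:.2f} ms")
```

Here the client blocks until the response arrives, which mirrors a user waiting on a web page or a search result. The measured time will vary from run to run and from machine to machine, so no specific value should be expected.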