Batch vs. Stream Processing

Learn about the different types of data processing methods and the key performance metrics for measuring data pipelines.

We'll cover the following:

Online systems
Batch processing systems
Stream processing systems
Latency vs. throughput

Data pipelines that deliver data from a source to a destination use one of two processing techniques:

Batch processing
Stream processing

Before discussing each, we'll look at the kinds of systems in use today, how users interact with them, and how they process their data. We'll also discuss the two major metrics for measuring pipeline performance: latency and throughput.
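As a rough preview of the two techniques, the sketch below applies the same transformation in a batch style and in a stream style. The function names and the doubling transformation are illustrative assumptions, not part of any particular framework: the batch version waits for the whole collection before processing it, while the stream version handles each record as it arrives.

```python
from typing import Iterable, Iterator, List


def process_batch(records: List[int]) -> List[int]:
    """Batch style: take the full collection, then transform it in one pass."""
    return [r * 2 for r in records]


def process_stream(records: Iterable[int]) -> Iterator[int]:
    """Stream style: transform and emit each record as soon as it arrives."""
    for r in records:
        yield r * 2


# Hypothetical input data used only for illustration.
events = [1, 2, 3]

batch_result = process_batch(events)
stream_result = list(process_stream(iter(events)))
```

Both calls produce the same values here. The difference is when results become available: the batch function returns only after every record has been processed, whereas the generator yields each result one at a time.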

Online systems

Many systems follow a request-response pattern: we send a request or an instruction, and the output comes back a short time later. Most of the systems we interact with work this way, including caches, search engines, databases, and web servers.
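The request-response interaction described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: `STORE`, `handle_request`, and `timed_request` are hypothetical names, and an in-memory dictionary stands in for a real cache or database. The elapsed time between sending the request and receiving the response is the latency of that request.

```python
import time
from typing import Dict, Optional, Tuple

# Hypothetical in-memory store standing in for a cache or database.
STORE: Dict[str, str] = {"user:1": "Ada", "user:2": "Grace"}


def handle_request(key: str) -> Optional[str]:
    """Server side: look up the requested key and return the value, if any."""
    return STORE.get(key)


def timed_request(key: str) -> Tuple[Optional[str], float]:
    """Client side: send a request and measure how long the response takes."""
    start = time.perf_counter()
    response = handle_request(key)
    latency_seconds = time.perf_counter() - start
    return response, latency_seconds


response, latency = timed_request("user:1")
```

In a real online system the request would travel over a network, so the measured latency would also include transmission time, not just the lookup.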

Online systems use a client-server architecture. They consist of one or more servers that host services. These services can send information over the internet to clients with the ...
