Rows vs Columnar Databases

This lesson explains the differences between row and columnar databases.

Databases are frequently used to store data. The way they lay that data out in storage need not match the table of rows and columns we visualize. Data is stored in one of two formats:

  • Row Format
  • Columnar Format

We are already familiar with databases that use the row format. These are the traditional relational databases such as MySQL, PostgreSQL, Oracle, and others. Row-format databases store records as rows in a table. A typical representation of our example Car table, which consists of four columns, when stored in a relational database looks like this:

Storing data as rows is great for some use cases. For instance, if we want to retrieve the record for the car BMW from the above table, the row format is suitable because the columns of the BMW row are stored next to each other in memory. The CPU's read operation benefits from locality of reference: related data is stored nearby and can be retrieved in a minimal number of memory reads.

However, there are other use cases where row-format databases perform poorly. Imagine that we want to compute the average horsepower of all the cars in our database. To answer this query, we need to read in all the values of the Horsepower column. But these values are part of their corresponding rows, which forces the database engine to read in every row of the table, including columns that are not involved in the calculation. The motivation for columnar databases stems from such use cases, where query times can be significantly reduced if only the required column values are read in instead of entire rows.
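The contrast between the two layouts can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any real database engine is implemented: the lesson names only the car BMW and the Horsepower column, so every other column name and value below is hypothetical, and "fields read" is a rough stand-in for the amount of data an engine would have to touch.

```python
# Hypothetical Car table. Only "BMW" and the Horsepower column come from the
# lesson; the other columns and all values are made up for illustration.
rows = [
    ("BMW", "Germany", 300, 2021),
    ("Toyota", "Japan", 200, 2020),
    ("Ford", "USA", 250, 2019),
]

# Row format: all fields of one record are kept together.
row_store = list(rows)

# Columnar format: all values of one column are kept together.
column_store = {
    "name": [r[0] for r in rows],
    "country": [r[1] for r in rows],
    "horsepower": [r[2] for r in rows],
    "year": [r[3] for r in rows],
}


def lookup_row(name):
    """Fetch a whole record; the row layout returns it in one piece."""
    for record in row_store:
        if record[0] == name:
            return record
    return None


def avg_horsepower_row_store():
    """Average horsepower from the row layout; every field of every row is touched."""
    fields_read = 0
    total = 0
    for record in row_store:
        fields_read += len(record)  # the whole row is read, not just one field
        total += record[2]
    return total / len(row_store), fields_read


def avg_horsepower_column_store():
    """Average horsepower from the columnar layout; only one column is touched."""
    values = column_store["horsepower"]
    return sum(values) / len(values), len(values)


print(lookup_row("BMW"))
print(avg_horsepower_row_store())
print(avg_horsepower_column_store())
```

Both functions return the same average, but in this toy model the row layout touches four fields per car while the columnar layout touches one. The gap widens as the table gains columns that the query does not need.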

Example

As an example, ...
