Aggregates

Learn how to use aggregates with and without a group by clause in SQL.

Queries will return as many rows as we select thanks to the where filter. This filter applies against a dataset that is produced by the from clause and its joins between relations.

The outer joins might produce more rows than we have in our reference dataset. In particular, cross join is a cartesian product.

In this section, we’ll have a look at aggregates. They work by computing a digest value for several input rows at a time. We can return a summary containing many fewer rows than passed by the where filter with aggregates.

Aggregates (aka map/reduce): group by

The group by clause introduces aggregates in SQL. It allows implementing much the same thing as map/reduce in other systems: map our data into different groups, and reduce the dataset to a single value in each group.

As a first example, we can count how many races have been run in each decade:

Get hands-on with 1300+ tech skills courses.