Group-Based Aggregation

Why do we need to group?

This is going to be a little bit more difficult, but we’ll answer this question in small steps.

If we list the orders_items, we see that we have the price and the quantity columns.

mysql> select * from order_items;
+----+----------+------------+-------+----------+
| id | order_id | product_id | price | quantity |
+----+----------+------------+-------+----------+
|  4 |        2 |          1 | 50.00 |        2 |
|  5 |        3 |          1 | 50.00 |        1 |
|  6 |        3 |          2 | 19.80 |        1 |
|  7 |        4 |          2 | 19.80 |        5 |
|  8 |        5 |          3 |  5.00 |        1 |
|  9 |        6 |          3 |  5.00 |        2 |
| 10 |        7 |          2 | 19.80 |        1 |
+----+----------+------------+-------+----------+
7 rows in set (0.00 sec)

The total amount of the first order with an id value of 2 is 2 x 50.00 = 100.00. The total amount of the order with an id value of 3 is 1 x 50.00 + 1 x 19.80 = 69.80, and the total amount of the order with an id value of 4 is 5 x 19.80 = 99.00.

This means that the order_items table has all the necessary information to build a result set like the following.

mysql> 
+----------+------------------+
| order_id | amount_per_order |
+----------+------------------+
|        2 |           100.00 |
|        3 |            69.80 |
|        4 |            99.00 |
|        5 |             5.00 |
|        6 |            10.00 |
|        7 |            19.80 |
+----------+------------------+
6 rows in set (0.01 sec)

How can we get there? Let’s try the following first:

mysql> select order_items.order_id, order_items.price * order_items.quantity as amount_per_order_item from order_items;
+----------+-----------------------+
| order_id | amount_per_order_item |
+----------+-----------------------+
|        2 |                100.00 |
|        3 |                 50.00 |
|        3 |                 19.80 |
|        4 |                 99.00 |
|        5 |                  5.00 |
|        6 |                 10.00 |
|        7 |                 19.80 |
+----------+-----------------------+
7 rows in set (0.00 sec)

The result set above doesn’t differ much from the following:


mysql> select * from order_items;
+----+----------+------------+-------+----------+
| id | order_id | product_id | price | quantity |
+----+----------+------------+-------+----------+
|  4 |        2 |          1 | 50.00 |        2 |
|  5 |        3 |          1 | 50.00 |        1 |
|  6 |        3 |          2 | 19.80 |        1 |
|  7 |        4 |          2 | 19.80 |        5 |
|  8 |        5 |          3 |  5.00 |        1 |
|  9 |        6 |          3 |  5.00 |        2 |
| 10 |        7 |          2 | 19.80 |        1 |
+----+----------+------------+-------+----------+
7 rows in set (0.00 sec)

However, there are a few differences, as listed below:

  • It doesn’t display the id column.
  • It doesn’t display the product_id column.
  • The two price and quantity columns have been multiplied and now give the amount_per_order_item.

As we can see, the snippet order_items.price * order_items.quantity as amount_per_order_item multiplies the values of the two columns and sets a label to the result. This is why the new derived-value column is displayed with the label amount_per_order_item.

Cool! Now, let’s try another concept:

mysql> select sum(price), sum(quantity) from order_items;
+------------+---------------+
| sum(price) | sum(quantity) |
+------------+---------------+
|     169.40 |            13 |
+------------+---------------+
1 row in set (0.01 sec)

The sum() function can be used on a column. Then, SQL will sum the values of that column for all the rows that match the result set criteria. The sum() is an aggregate SQL function. We’ll learn more about that later on.

Show the total quantity of each order

mysql> select order_items.order_id, sum(price), sum(quantity) from order_items group by order_id;
+----------+------------+---------------+
| order_id | sum(price) | sum(quantity) |
+----------+------------+---------------+
|        2 |      50.00 |             2 |
|        3 |      69.80 |             2 |
|        4 |      19.80 |             5 |
|        5 |       5.00 |             1 |
|        6 |       5.00 |             2 |
|        7 |      19.80 |             1 |
+----------+------------+---------------+
6 rows in set (0.00 sec)

This is even better because it applies the aggregate functions (the two sum() function calls) to all rows that match the criteria, but separately for each order_id. Double-check the amounts. Do the prices of order_id 2 sum up to 50.00? Do the prices of order_id 3 sum up to 69.80? What about the sums on the quantity columns? Do they match?

It seems that we’re getting closer and closer to our target. We’re only left to combine the information from price and quantity before summing up. The combination needs to be a multiplication.

Show the total amount of each order

mysql> select order_items.order_id, sum(price * quantity) as amount_per_order from order_items group by order_id;
+----------+------------------+
| order_id | amount_per_order |
+----------+------------------+
|        2 |           100.00 |
|        3 |            69.80 |
|        4 |            99.00 |
|        5 |             5.00 |
|        6 |            10.00 |
|        7 |            19.80 |
+----------+------------------+
6 rows in set (0.00 sec)

Perfect! We now display the sums of prices multiplied by quantities, but we limit the summation/aggregation to the matching order_ids so that there’s one summation/aggregation for each order_id.

Practice in the given widget below for better understanding:

Get hands-on with 1300+ tech skills courses.