Group-Based Aggregation
Learn to group data.
We'll cover the following
Why do we need to group?
This is going to be a little bit more difficult, but we’ll answer this question in small steps.
If we list the orders_items
, we see that we have the price
and the quantity
columns.
mysql> select * from order_items;
+----+----------+------------+-------+----------+
| id | order_id | product_id | price | quantity |
+----+----------+------------+-------+----------+
| 4 | 2 | 1 | 50.00 | 2 |
| 5 | 3 | 1 | 50.00 | 1 |
| 6 | 3 | 2 | 19.80 | 1 |
| 7 | 4 | 2 | 19.80 | 5 |
| 8 | 5 | 3 | 5.00 | 1 |
| 9 | 6 | 3 | 5.00 | 2 |
| 10 | 7 | 2 | 19.80 | 1 |
+----+----------+------------+-------+----------+
7 rows in set (0.00 sec)
The total amount of the first order with an id
value of 2
is 2 x 50.00 = 100.00. The total amount of the order with an id
value of 3
is 1 x 50.00 + 1 x 19.80 = 69.80, and the total amount of the order with an id
value of 4
is 5 x 19.80 = 99.00.
This means that the order_items
table has all the necessary information to build a result set like the following.
mysql>
+----------+------------------+
| order_id | amount_per_order |
+----------+------------------+
| 2 | 100.00 |
| 3 | 69.80 |
| 4 | 99.00 |
| 5 | 5.00 |
| 6 | 10.00 |
| 7 | 19.80 |
+----------+------------------+
6 rows in set (0.01 sec)
How can we get there? Let’s try the following first:
mysql> select order_items.order_id, order_items.price * order_items.quantity as amount_per_order_item from order_items;
+----------+-----------------------+
| order_id | amount_per_order_item |
+----------+-----------------------+
| 2 | 100.00 |
| 3 | 50.00 |
| 3 | 19.80 |
| 4 | 99.00 |
| 5 | 5.00 |
| 6 | 10.00 |
| 7 | 19.80 |
+----------+-----------------------+
7 rows in set (0.00 sec)
The result set above doesn’t differ much from the following:
mysql> select * from order_items;
+----+----------+------------+-------+----------+
| id | order_id | product_id | price | quantity |
+----+----------+------------+-------+----------+
| 4 | 2 | 1 | 50.00 | 2 |
| 5 | 3 | 1 | 50.00 | 1 |
| 6 | 3 | 2 | 19.80 | 1 |
| 7 | 4 | 2 | 19.80 | 5 |
| 8 | 5 | 3 | 5.00 | 1 |
| 9 | 6 | 3 | 5.00 | 2 |
| 10 | 7 | 2 | 19.80 | 1 |
+----+----------+------------+-------+----------+
7 rows in set (0.00 sec)
However, there are a few differences, as listed below:
- It doesn’t display the
id
column. - It doesn’t display the
product_id
column. - The two
price
andquantity
columns have been multiplied and now give theamount_per_order_item
.
As we can see, the snippet order_items.price * order_items.quantity as amount_per_order_item
multiplies the values of the two columns and sets a label to the result. This is why the new derived-value column is displayed with the label amount_per_order_item
.
Cool! Now, let’s try another concept:
mysql> select sum(price), sum(quantity) from order_items;
+------------+---------------+
| sum(price) | sum(quantity) |
+------------+---------------+
| 169.40 | 13 |
+------------+---------------+
1 row in set (0.01 sec)
The sum()
function can be used on a column. Then, SQL will sum the values of that column for all the rows that match the result set criteria. The sum()
is an aggregate SQL function. We’ll learn more about that later on.
Show the total quantity of each order
mysql> select order_items.order_id, sum(price), sum(quantity) from order_items group by order_id;
+----------+------------+---------------+
| order_id | sum(price) | sum(quantity) |
+----------+------------+---------------+
| 2 | 50.00 | 2 |
| 3 | 69.80 | 2 |
| 4 | 19.80 | 5 |
| 5 | 5.00 | 1 |
| 6 | 5.00 | 2 |
| 7 | 19.80 | 1 |
+----------+------------+---------------+
6 rows in set (0.00 sec)
This is even better because it applies the aggregate functions (the two sum()
function calls) to all rows that match the criteria, but separately for each order_id
. Double-check the amounts. Do the prices
of order_id 2
sum up to 50.00
? Do the prices
of order_id 3
sum up to 69.80
? What about the sums on the quantity
columns? Do they match?
It seems that we’re getting closer and closer to our target. We’re only left to combine the information from price
and quantity
before summing up. The combination needs to be a multiplication.
Show the total amount of each order
mysql> select order_items.order_id, sum(price * quantity) as amount_per_order from order_items group by order_id;
+----------+------------------+
| order_id | amount_per_order |
+----------+------------------+
| 2 | 100.00 |
| 3 | 69.80 |
| 4 | 99.00 |
| 5 | 5.00 |
| 6 | 10.00 |
| 7 | 19.80 |
+----------+------------------+
6 rows in set (0.00 sec)
Perfect! We now display the sums of prices multiplied by quantities, but we limit the summation/aggregation to the matching order_ids
so that there’s one summation/aggregation for each order_id
.
Practice in the given widget below for better understanding:
Get hands-on with 1400+ tech skills courses.