Aggregate functions vs. window functions

In SQL, aggregate and window functions serve distinct purposes, catering to different analytical needs. Let’s look into the definitions and differences of these powerful tools, accompanied by concise yet comprehensive examples.

Aggregate functions

Aggregate functions operate on a set of values and return a single value summarizing that set. They often provide insights into the overall characteristics of a dataset. Common examples include:

  • COUNT()

  • AVG()

  • SUM()

  • MIN()

  • MAX().

Imagine we have the following dummy data for an orders table:

-- Creating the 'orders' table with dummy data
CREATE TABLE orders (
order_id INT,
order_category VARCHAR(50),
order_amount DECIMAL(10, 2),
order_date DATE
);
INSERT INTO orders (order_id, order_category, order_amount, order_date)
VALUES
(1, 'Electronics', 1200.00, '2023-01-01'),
(2, 'Clothing', 450.50, '2023-01-02'),
(3, 'Electronics', 800.00, '2023-01-03'),
(4, 'Clothing', 600.75, '2023-01-04'),
(5, 'Books', 300.25, '2023-01-05'),
(6, 'Books', 250.50, '2023-01-06');
Dummy data for an `orders` table

Example 1: Calculating the total number of orders

SELECT COUNT(order_id) AS total_orders
FROM orders;
  • Lines 1–2: We use the COUNT() aggregate function to count the number of order_id entries in the orders table, labeling the result as total_orders.

Example 2: Calculating the average order amount

SELECT AVG(order_amount) AS avg_order_amount
FROM orders;
  • Lines 1–2: We use the AVG() aggregate function to calculate the average order_amount in the orders table, labeling the result as avg_order_amount.

Example 3: Calculating the total order amount

SELECT SUM(order_amount) AS total_order_amount
FROM orders;
  • Lines 1–2: We use the SUM() aggregate function to calculate the total order_amount in the orders table, labeling the result as total_order_amount.

Example 4: Finding the minimum order amount

SELECT MIN(order_amount) AS min_order_amount
FROM orders;
  • Lines 1–2: Here, we use the MIN() aggregate function to find the smallest order_amount in the orders table, labeling the result as min_order_amount.

Example 5: Finding the maximum order amount

SELECT MAX(order_amount) AS max_order_amount
FROM orders;
  • Lines 1≠2: Similarly, we use the MAX() aggregate function to find the largest order_amount in the orders table, labeling the result as max_order_amount.

Aggregate functions are particularly useful to derive insights that provide a high-level dataset overview, such as counts, averages, or other summary statistics.

Window functions

In contrast, window functions work with a specific set of rows that are related to the current row in the query result set. They allow for more granular calculations, introducing the concept of a “window” or a subset of the data for computation. Common window functions include:

  • ROW_NUMBER()

  • RANK()

  • LEAD()

  • LAG()

  • SUM()

  • AVG()

Example 1: Assigning a unique row number to each order in a specific category

SELECT order_id, order_category,
ROW_NUMBER() OVER (PARTITION BY order_category ORDER BY order_date)
AS category_row_number
FROM orders;
  • Lines 1–4: We use the ROW_NUMBER() window function to assign a unique row number within each order_category partition, ordered by order_date.

Example 2: Ranking orders based on order amount

SELECT order_id, order_amount, RANK() OVER (ORDER BY order_amount) AS order_rank
FROM orders;
  • Lines 1–2: Here, we use the RANK() window function to assign a rank to each order based on the order_amount, ordered in ascending order.

Example 3: Showing the previous order amount for each order

SELECT order_id, order_amount, LAG(order_amount) OVER (ORDER BY order_date) AS previous_order_amount
FROM orders;
  • Lines 1–2: This query utilizes the LAG() window function to display the order_amount from the previous order, ordered by order_date.

Example 4: Showing the next order amount for each order

SELECT order_id, order_amount, LEAD(order_amount) OVER (ORDER BY order_date) AS next_order_amount
FROM orders;
  • Lines 1–2: Similarly, we use the LEAD() window function to display the order_amount from the next order, ordered by order_date.

Example 5: Calculating the running total of order amounts

SELECT order_id, order_amount, SUM(order_amount) OVER (ORDER BY order_date) AS running_total
FROM orders;
  • Lines 1–2: We use the SUM() window function to calculate the running total of order amounts, ordered by order_date.

Example 6: Calculating the average order amount considering previous orders

SELECT order_id, order_amount, AVG(order_amount) OVER (ORDER BY order_date) AS avg_order_amount
FROM orders;
  • Lines 1–2: In this example, we use the AVG() window function to calculate the average order_amount up to the current order, ordered by order_date.

Window functions shine when you need to perform calculations over a specific range or partition of the dataset, such as computing running totals, ranking, or identifying distinct partitions within the data.

Comparison between the two

We use aggregate functions when we need a quick overview or summary of our data, such as counting occurrences or computing averages. We utilize window functions when we require detailed computations within specific subsets or partitions of our data, like calculating running totals or assigning row numbers within groups. Below is a feature by feature comparison of both function types:

Feature

Aggregate Functions

Window Functions

Operating Principle

Summarize values across the entire dataset.

Operate on a specific range of rows related to the current row.

Scope

Provide summary statistics for the entire dataset.

Enable calculations within specific subsets or partitions of data.

Example

SELECT COUNT(column_name) FROM table;

SELECT ROW_NUMBER() OVER (PARTITION BY column_name ORDER BY order_date) FROM table;

Common Functions

COUNT(), AVG(), SUM() , MIN(), MAX()

ROW_NUMBER(), RANK(), LEAD(), LAG(), SUM(), AVG()

Usage

Best for providing high-level summaries.

Ideal for detailed calculations within specific partitions or windows of data.

Both aggregate and window functions have SUM() and AVG().

  • In aggregate functions, they compute totals and averages across the entire dataset.

  • In window functions, they enable running totals and moving averages within specific windows of data.

Conclusion

In summary, while aggregate functions provide overarching insights into the entire dataset, window functions offer a more nuanced and granular approach, allowing for detailed computations over specific windows or partitions. Knowing when to leverage each is crucial for constructing effective SQL queries tailored to your analytical requirements.

Free Resources

Copyright ©2024 Educative, Inc. All rights reserved