A database is a collection of structured data stored in an organized manner. PostgreSQL, a relational database management system, employs SQL for fundamental operations such as creating tables, inserting data into tables, modifying the records, and deleting the existing data from the tables. Understanding SQL keywords is crucial for being skilled in SQL. One such term is the ROW_NUMBER()
function, which is an analytical function useful for organizing and sorting data inside result sets.
ROW_NUMBER()
function in PostgreSQLThe ROW_NUMBER()
window function assigns a unique row number to each row within a result set. It is important for various tasks, including ranking, data partitioning, and paging. Its importance lies in its ability to bring order to unordered data, allowing us to address common analytical challenges, such as identifying the top performers in a sales report.
ROW_NUMBER() OVER (PARTITION BY column_nameORDER BY column_name)
Line 1: The ROW_NUMBER()
function assigns a unique row number to each row. The OVER
clause defines the window specification for the function.
Line 2: The PARTITION BY
enables us to partition the result set into subsets based on a specified column column_name
.
Line 3: The ORDER BY
clause determines the order in which rows for the column, column_name
, are numbered (for example, 1, 2, 3, 4, 5) within each partition.
Let’s use a simple database named Employees
having four columns, employee_id
, employee_name
, date_of_joining
, and sales_revenue
.
--Creating Employees tableCREATE TABLE IF NOT EXISTS Employees (emp_id SERIAL PRIMARY KEY,emp_name TEXT,date_of_joining DATE,sales_revenue DECIMAL(10, 2),country TEXT);--Inserting data into Employees tableINSERT INTO Employees (emp_name, date_of_joining, sales_revenue, country)VALUES('Usama Khan', '2022-01-15', 1500.00, 'Brazil'),('Fatima Ahmed', '2021-11-10', 2200.50, 'USA'),('Hassan Ali', '2021-12-05', 1800.25, 'Pakistan'),('Saira Khan', '2022-02-20', 1300.75, 'Australia'),('Ahmad Malik', '2022-03-15', 2500.80, 'Brazil'),('Bilal Ahmed', '2022-04-10', 1950.30, 'USA'),('Noor Fatima', '2022-01-01', 1620.00, 'Pakistan'),('Sarah Harris', '2021-11-30', 1850.50, 'Australia'),('Aamir Malik', '2022-05-20', 1380.25, 'Brazil'),('Ali Hassan', '2021-12-25', 1100.75, 'USA'),('Ayesha Siddiqui', '2022-03-10', 2650.80, 'Pakistan'),('Usman Ahmed', '2022-04-05', 1820.30, 'Australia');SELECT * FROM Employees;
Lines 1–7: We create a table named Employees
with four columns: the emp_id
(SERIAL
, PRIMARY KEY
), emp_name
, date_of_joining
, and sales_revenue
.
Lines 9–24: We insert a few records into the Employees
table using the INSERT INTO
command.
Line 26: We display the data using the SELECT
statement.
Let's see the different aspects of the ROW_NUMBER()
function through some examples.
ORDER BY
clause in the ROW_NUMBER()
functionIn this example, we’ll use the ROW_NUMBER()
function to rank employees based on their hiring dates, providing a clear understanding of who joined the company first.
SELECT emp_name,sales_revenue,ROW_NUMBER() OVER (ORDER BY sales_revenue DESC) AS rank_by_sales_revenueFROM Employees;
Lines 1–2: We use the SELECT
statement to read the emp_name
and sales_revenue
.
Lines 3–4: We use the ROW_NUMBER()
function to assign a unique integer value to each row, starting from 1 and incrementing by 1 for each subsequent row. The ORDER BY
clause orders the rows based on the sales_revenue
column in descending order.
Line 5: We name the result of this calculation rank_by_sales_revenue
.
Lines 6: We use the Employees
table for data.
PARTITION BY
clause in the ROW_NUMBER()
functionIn this example, we’ll use the ROW_NUMBER()
function to partition data by country and then rank employees within each country based on their sales revenue.
SELECT emp_name,country,sales_revenue,ROW_NUMBER() OVER (PARTITION BY countryORDER BY sales_revenue DESC) AS rank_within_countryFROM Employees;
Lines 1–3: We use the SELECT
statement to read emp_name
, country
, and sales_revenue
.
Lines 4–6: We use the ROW_NUMBER()
function to assign a unique integer value to each row within each partition, starting from 1 and incrementing by 1 for each subsequent row. The PARTITION BY
clause creates partitions or groups within the result set based on the country
column. The ORDER BY
clause orders the rows within each country based on the sales_revenue
column in descending order.
Line 7: We name the result of this calculation rank_within_country
.
Lines 8: We use the Employees
table for data.
ROW_NUMBER()
functionIn this example, we’ll use a ROW_NUMBER()
function for more complex ranking tasks, such as calculating cumulative ranks.
WITH RankedEmployees AS (SELECT emp_name,sales_revenue,ROW_NUMBER() OVER (ORDER BY sales_revenue DESC) AS rank_by_sales_revenueFROM Employees)SELECT emp_name,sales_revenue,rank_by_sales_revenue,SUM(rank_by_sales_revenue) OVER (ORDER BY sales_revenue DESC) AS cumulative_rankFROM RankedEmployees;
Lines 1–8: We create a RankedEmployees
CTE to rank employees based on the sales_revenue
column.
Lines 9-11: We use the SELECT
statement to read the emp_name
, sales_revenue
, and rank_by_sales_revenue
.
Lines 12-13: We use the SUM()
function to calculate the cumulative ranks. The ORDER BY
clause orders the rows within each country based on the sales_revenue
column in descending order.
Line 14: We name the result of this calculation cumulative_rank
.
Line 15: The data uses the result set of the RankedEmployees
CTE.
The ROW_NUMBER()
window function assigns a unique row number to each row (starting from 1 and incrementing by 1 for each subsequent row) within a result set. We can use it to rank a single attribute. We can also use it to rank rows in the partitions. Moreover, we can use this window function to do complex calculations using common table expressions. The ROW_NUMBER()
window function is essential because it unlocks advanced capabilities for querying and analyzing data.
Free Resources