Ready to take your SQL skills to the next level? Build a Python Airline Reservation System to manage flights, customers, and schedules.
Key takeaways:
SQL subqueries enable complex filtering, aggregation, and comparison operations within a single statement, making data extraction more precise and efficient
Subqueries come in various types, including single-row, multiple-row, multiple-column, correlated, and nested, each tailored for specific use cases and complexities in database queries.
Subqueries promote cleaner and more modular SQL code by breaking down complex logic into smaller, manageable pieces that are easier to debug and maintain.
While subqueries are powerful, they can be resource-intensive, especially correlated subqueries. Understanding when to use joins or common table expressions (CTEs) instead can help improve query performance.
As data volumes and complexity grow, writing efficient SQL queries is more important than ever. Subqueries, known as nested queries, allow you to break down complex problems into smaller, more manageable parts by embedding one query inside another. They’re essential for tasks like filtering data, performing calculations, and joining tables—making them a powerful tool for writing scalable SQL queries.
SQL subqueries come in different types, each serving a unique purpose:
Scalar subqueries return a single value.
Multiple-row subqueries return a set of rows.
Correlated subqueries compare values between two tables.
In this blog, we’ll focus on SQL subqueries in Microsoft SQL Server, covering their syntax, types, practical applications, and optimization strategies to help you write better-performing queries.
The following snippet depicts a general syntax for subqueries:
SELECT column1, column2FROM main_tableWHERE columnN operator ( SELECT expressionFROM subquery_tableWHERE condition);
In this syntax:
main_table
is the main table from which we are selecting data.
column1
and column2
are the columns we want to retrieve from the main table.
columnN
is the name in the main table that we want to compare with the subquery result.
operator
is a comparison operator such as =
, >
, <
, IN
, EXISTS
, etc.
subquery_table
is the table in the subquery that is used to retrieve data.
expression
is the value or set of values we compare against in the outer query.
condition
is the condition used to filter data in the subquery.
Here are a few rules/guidelines to remember when dealing with subqueries:
A subquery must be enclosed in parentheses.
A subquery must be placed on the right side of the comparison operator.
Subqueries cannot manipulate their results internally. Therefore, the ORDER BY
clause cannot be added to a subquery.
Subqueries can be used with comparison operators such as >
, <
, >=
, <=
, =
, or <>
.
Subqueries can be used with the IN
or NOT IN
operator.
Subqueries can be used with the EXISTS
or NOT EXISTS
operator.
Use table aliases or fully qualified column names to prevent ambiguity, especially when dealing with subqueries within larger queries.
There are five major types of subqueries mentioned below. Let’s go through them individually, along with some code examples.
A single-row subquery returns zero or one row in the result set. It is employed when we want to compare a single value from a subquery with the result of the outer query.
SELECT student_nameFROM studentsWHERE age > (SELECT Avg(age)FROM studentsWHERE department = 'Computer Science');
This subquery from lines 3–5 above calculates the average age of computer science students, and then the outer query retrieves names of only those students whose age is greater than that average.
A multiple-row subquery returns one or more rows in the result set. It is useful when comparing or evaluating multiple values from the subquery with the outer query.
SELECT product_nameFROM productsWHERE category_id IN (SELECT category_idFROM categoriesWHERE is_discounted = true);
The subquery from lines 3–5 above finds the category IDs for discounted products, and the outer query retrieves product names in those categories.
A multiple-column subquery returns one or more columns. It is applied when the result set requires different variables or multiple information from the subquery.
SELECT employee_name,department_nameFROM employeesWHERE ( department_id, salary ) IN (SELECT department_id,Max(salary)FROM employeesGROUP BY department_id);
The subquery from lines 4–7 above finds the department-wise maximum salary, and the outer query gets the employee names and their respective departments for those maximum salaries.
Correlated subqueries refer to columns from the outer query and return one or more columns according to the main or the outer query.
SELECT project_name,deadlineFROM projects outer_projectWHERE deadline <= ALL (SELECT deadlineFROM projectsWHERE client_id = outer_project.client_id);
The above subquery from lines 4–6 correlates to the outer query by referencing the client ID. It checks if the deadline of the current project is earlier than or equal to all other projects from the same client. The outer query then retrieves project names and deadlines that satisfy this condition.
Nested subqueries involve queries within a query (inner and outer query). The inner query is executed first, and its result is used by the outer query, allowing for more complex and layered operations.
SELECT employee_name,department_nameFROM employeesWHERE department_id IN (SELECT department_idFROM departmentsWHERE location_id IN (SELECT location_idFROM locationsWHERE country_id = 'US'));
The innermost subquery from lines 6–8 retrieves location IDs for the US country ID. The middle subquery from lines 4–6 then finds department IDs associated with those locations. The outer query from lines 1–4 retrieves employee names and their respective departments for those in the US locations.
An Introductory Guide to SQL
The ability to work SQL is becoming an increasingly in-demand skill, both for software developers and people in less technical roles. If you’re interested in learning SQL and have no prior experience with it, then this course will be your light in a dark tunnel. You’ll start by covering the basics of SQL such as how to create a database, how to insert, query, and update data. You’ll also learn fundamental concepts that developers and data scientists use everyday such as multi-table operations, nested queries, and how to set up views. Throughout, you’ll get to execute SQL queries in your browser and see results in real-time - you won’t need to worry about set-up. At the end of this course, you’ll also get some hands-on practice with common SQL interview questions, so when the time comes, you’ll be ready and confident to answer any question that comes your way. Let’s get started!
Here’s a comparative table highlighting the types of subqueries and their key characteristics.
Type of Subquery | Definition | Key Aspects | Best Use Case | Real-World Example |
Single-Row Subquery | Returns zero or one row in the result set. | - Uses comparison operators ( - Returns scalar value. | Comparing a single value, such as filtering data based on aggregate calculations. | Which employees earn more than the average salary in IT? |
Multiple-Row Subquery | Returns one or more rows in the result set. | - Uses - Supports multi-value comparisons. | Comparing a column against a set of values, e.g., filtering products or categories with specific criteria. | Which customers bought products from categories on sale? |
Multiple-Column Subquery | Returns one or more columns in the result set. | - Often paired with tuples for multiple-column filtering. - Returns rows of composite data. | Fetching combinations of related values, e.g., department-wise top salaries. | List all managers and their departments where they have the highest salary. |
Correlated Subquery | Refers to columns from the outer query and executes once per row of the outer query. | - Requires a link to the outer query. - Can be less efficient due to row-by-row execution. | Complex filtering that depends on values from the outer query, e.g., comparing deadlines within the same client. | List products with a lower discounted price than the average price in their category. |
Nested Subquery | Contains queries within a query, where the outer query uses the inner query result. | - Can contain multiple levels of subqueries. - Executes from the innermost query outward. | Multi-layered filtering or aggregation, e.g., retrieving employees from departments in specific countries. | Find employees in departments situated in cities within the US. |
DML stands for data manipulation language. It essentially refers to the subset of SQL that manipulates data stored in a database, such as inserting, updating, and deleting records. Let’s take a look at subqueries in the context of DML operations.
Subqueries can be used with INSERT
statements to add data based on the results of another query. For instance, to insert students into a new class, we do the following:
INSERT INTO students(name,age,class_id)SELECT name,age,(SELECT idFROM classesWHERE class_name = 'Mathematics')FROM studentsWHERE class_id = (SELECT idFROM classesWHERE class_name = 'Physics');
We can even modify records based on the results of a nested query using UPDATE
statements. To update grades for students who are in the Computer Science department, we do the following:
UPDATE studentsSET grade = grade + 5WHERE department_id = (SELECT idFROM departmentsWHERE dept_name = 'Computer Science');
Subqueries can also be applied with DELETE
statements to remove records based on the results of a nested query. To delete students who have not attended any class, we do the following:
DELETE FROM studentsWHERE id NOT IN (SELECT DISTINCT student_idFROM class_attendance);
Let’s take a look at a few use cases of subqueries below.
Filtering data: We can use a subquery to filter data based on a condition. For example, we can use a subquery to find all customers who have placed an order in the last 30 days:
SELECT *FROM customersWHERE customer_id IN (SELECT customer_idFROM ordersWHERE order_date >= Dateadd(day, -30, Getdate()));
Performing calculations: We can also use a subquery to perform calculations on a dataset. For example, using a subquery to find the average price of all products in a category:
SELECT category_name,Avg(price) AS avg_priceFROM productsWHERE category_id = (SELECT category_idFROM categoriesWHERE category_name = 'Electronics')GROUP BY category_name
Joining tables: Another use case is to utilize a subquery to join two tables together. For example, we can use a subquery to find all customers who have placed their most recent orders after a specific date:
SELECT c.customer_id, c.customer_name, o.order_id, o.order_dateFROM customers cJOIN orders oON c.customer_id = o.customer_idWHERE o.order_id IN (SELECT MAX(order_id)FROM ordersWHERE order_date > '2025-01-01'GROUP BY customer_id);
Subqueries are a powerful feature in SQL, but they can sometimes slow down your queries, especially when working with large tables or complex conditions. To write faster and more efficient queries, let's understand how subqueries work and when alternatives like JOINs or common table expressions (CTEs) might be better.
Subquery execution: Subqueries, especially correlated ones, might run multiple times—once for each row in the outer query. This repetition can make your queries slower if there’s a lot of data. To overcome this:
Optimize correlated subqueries by ensuring the filters and conditions are as specific as possible.
Replace correlated subqueries with JOINs or CTEs (common table expressions) whenever feasible, as these approaches often execute faster.
Use
Using indexes: Subqueries work best when the columns they use are indexed. Without indexes, the database has to search through all the rows, which takes more time.
It’s important to note that while MySQL can leverage indexes effectively in subqueries, correlated subqueries may still perform poorly because MySQL often lacks sophisticated optimization for them, unlike some other DBMSs.
Sometimes, JOINs or CTEs can replace subqueries and make your query faster or easier to read:
JOINs: JOINs combine data from multiple tables in a single step, often faster than running a subquery for every row. Use JOINs when you need to compare or combine data across tables.
CTEs (common table expressions): CTEs let you create a temporary table inside your query. This helps break down complex queries into smaller, easier-to-understand steps or for reusing a result in different parts of the same query.
Keep subqueries small: Filter the data in your subquery to reduce the number of rows it processes.
Avoid repeating work: Use JOINs or CTEs instead of correlated subqueries, which are slower because they run repeatedly.
Check your query plan: Tools like EXPLAIN
Use your database’s features: Some databases have special ways to handle subqueries more efficiently, so check their documentation. For example:
PostgreSQL:
PostgreSQL’s query planner effectively optimizes subqueries, especially correlated ones. When using subqueries in PostgreSQL, its advanced optimization techniques and detailed EXPLAIN ANALYZE
to identify how queries are being executed.
MySQL:
While MySQL can handle subqueries well, it benefits most from indexed subqueries. In particular, subqueries using IN
or EXISTS
are more efficient when indexes are applied to the columns. In cases of correlated subqueries, MySQL generally performs better with JOINs
.
SQL Server:
SQL Server provides a sophisticated query optimization engine that works efficiently with subqueries and JOINs
. When replacing subqueries with JOINs
or CTEs
, SQL Server often executes them faster due to its powerful optimization features.
That’s all there is to know about subqueries to effectively get the hang of them. By intelligently choosing between subqueries, JOINs, and CTEs, you can make your queries efficient and easy to understand. They are a crucial topic in database systems; you will see them in coding interviews. Therefore, mastering subqueries is a game-changer for writing efficient SQL. Keep experimenting and optimizing!
Want to build a strong foundation in SQL and relational databases? Start this hands-on course today to master table management, complex subqueries, and joins using MySQL.
Relational databases store data in a row-based table structure. Structured Query Language (SQL) is a core language used to retrieve data from the relational database. One of its most widely used extensions, MySQL, also gives you the power to edit, create, and manage the queried data. In this course, you’ll be introduced to the basics of relational database management with SQL. You’ll use MySQL to create a table, edit the data, and even change the structure. You’ll also create indexes to quickly access data for faster performance on common queries. Additionally, you’ll explore more complex queries to extract useful data with aggregate functions, grouping, ordering, and creating limits and filters in output tables. Finally, you’ll learn how to combine data from different tables with join functions. By the end of this course, you’ll be able to create, manage, edit, and merge relational databases with confidence. MySQL will prepare you for data projects large and small.
What are the types of subqueries in SQL?
What is the difference between a correlated subquery and a nested subquery?
Can a subquery return multiple values?
Can subqueries affect SQL query performance?
When should I use a subquery instead of a JOIN in SQL?
Free Resources