How to optimize SQL Queries using Indexes

An index in SQL is a lookup structure that lets the database retrieve data quickly. An index is defined on one or more columns of a table.

Indexes make search and data retrieval queries faster. The database can use the index to locate the matching row(s) and fetch the data we need without scanning the whole table.

However, on every INSERT, UPDATE, or DELETE on the table, the indexes associated with that table must also be maintained, which makes writes to the table slower. The more indexes a table has, the higher this cost becomes.

Ideally, we want to have a low number of indexes that optimize most of the data retrieval we need.

Composite index

As discussed earlier, it would not be efficient to create a separate index for each column. So, instead, we can set up a composite index.

Composite, or multi-column, indexes sound great, but they come with one catch: the order of the columns in the index is critical. For the index to be used on a given column, every column to its left must also appear in the query.

We start from the leftmost column, and although we cannot skip over columns, we can ignore columns on the right.

For example:

INDEX(col_1, col_2, col_3, ..., col_n)

The query Q(col_1, col_2, col_3) would use the index, even though the columns to the right of col_3 are not in the query. However, the query Q(col_2, col_3, ..., col_n) would not be able to use it, because it skips col_1.
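
The leftmost-prefix rule can be sketched as a small function. This is a simplified model rather than how any particular database implements it: the helper name usable_prefix and the col_4 column are made up for illustration, and the model only looks at which columns a query constrains, ignoring range conditions and optimizer choices.

```python
def usable_prefix(index_columns, query_columns):
    """Return the leading index columns that a query can use.

    Simplified model of the leftmost-prefix rule: walk the index from
    the left and stop at the first column the query does not constrain.
    """
    constrained = set(query_columns)
    prefix = []
    for column in index_columns:
        if column not in constrained:
            break  # a skipped column ends the usable prefix
        prefix.append(column)
    return prefix


# Hypothetical four-column index standing in for INDEX(col_1, ..., col_n).
index = ["col_1", "col_2", "col_3", "col_4"]

print(usable_prefix(index, ["col_1", "col_2", "col_3"]))  # ['col_1', 'col_2', 'col_3']
print(usable_prefix(index, ["col_2", "col_3", "col_4"]))  # []
print(usable_prefix(index, ["col_1", "col_3"]))           # ['col_1']
```

An empty result means the query cannot seek into the index at all, which is the Q(col_2, col_3, ..., col_n) case.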

Let’s see another example:

CREATE TABLE persons(
    id INT NOT NULL,
    first_name VARCHAR(15) NOT NULL,
    last_name VARCHAR(15) NOT NULL,
    age INT NOT NULL,
    address VARCHAR(64) NOT NULL,
    PRIMARY KEY(id)
);
/* Multi-column index */
CREATE INDEX person_info ON persons (first_name, last_name, age);
/* Uses the index: all three indexed columns are constrained */
EXPLAIN SELECT * FROM persons WHERE first_name = 'Donald' AND last_name = 'Qualls' AND age = 16 \G
/* Cannot use the index: first_name, the leftmost column, is skipped */
EXPLAIN SELECT * FROM persons WHERE last_name = 'Qualls' AND age > 20 \G
/* Uses the index: only the columns to the right of first_name are omitted */
EXPLAIN SELECT * FROM persons WHERE first_name = 'Donald' \G
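
The same three queries can be checked without a MySQL server. The sketch below is an assumption-laden stand-in: it uses Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN instead of MySQL's EXPLAIN, so the plan text differs from what MySQL prints, and the plan helper is a made-up name. The leftmost-prefix behaviour it shows is the same idea.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE persons (
        id INT NOT NULL,
        first_name VARCHAR(15) NOT NULL,
        last_name VARCHAR(15) NOT NULL,
        age INT NOT NULL,
        address VARCHAR(64) NOT NULL,
        PRIMARY KEY (id)
    );
    CREATE INDEX person_info ON persons (first_name, last_name, age);
""")


def plan(sql):
    """Return SQLite's query plan for `sql` as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return "; ".join(row[-1] for row in rows)  # last column is the plan detail


# All three indexed columns are constrained.
full_match = plan(
    "SELECT * FROM persons "
    "WHERE first_name = 'Donald' AND last_name = 'Qualls' AND age = 16"
)
# The leftmost column, first_name, is skipped.
skipped_leftmost = plan(
    "SELECT * FROM persons WHERE last_name = 'Qualls' AND age > 20"
)
# Only the leftmost column is constrained.
leftmost_only = plan("SELECT * FROM persons WHERE first_name = 'Donald'")

for label, text in [
    ("full match", full_match),
    ("skipped leftmost", skipped_leftmost),
    ("leftmost only", leftmost_only),
]:
    print(f"{label}: {text}")
```

The first and third plans mention person_info, while the second falls back to scanning the table. The exact wording of the plan lines depends on the SQLite version.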

Covering index

A covering index is an index that contains every column a query needs, so the query can be answered from the index alone without reading the table rows.

CREATE TABLE Vehicles(
    id INT NOT NULL,
    name VARCHAR(15) NOT NULL,
    model VARCHAR(15) NOT NULL,
    year INT NOT NULL,
    price INT NOT NULL,
    PRIMARY KEY(id)
);
/* Multi-column index */
CREATE INDEX vehicle_info ON Vehicles (name, year, model);
/* Covering: every selected column is present in the index */
EXPLAIN SELECT name, model, year FROM Vehicles
        WHERE name = 'Nissan' AND model = 'modelA' AND year = 2018 \G
/* Not covering: price is not in the index */
EXPLAIN SELECT name, model, year, price FROM Vehicles
        WHERE name = 'Nissan' AND model = 'modelA' AND year = 2018 \G
/* Shows "Using where" together with "Using index". Why? The range condition
   on year stops the index lookup there, so the condition on model is applied
   as a filter instead */
EXPLAIN SELECT name, model, year FROM Vehicles
        WHERE name = 'Nissan' AND model = 'modelA' AND year > 2018 \G
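
The covering and non-covering cases can also be reproduced locally. As an assumption, this sketch again swaps MySQL for Python's built-in sqlite3 module, so it reads SQLite's EXPLAIN QUERY PLAN output (which says COVERING INDEX) rather than MySQL's "Using index"; the plan helper is a hypothetical name.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Vehicles (
        id INT NOT NULL,
        name VARCHAR(15) NOT NULL,
        model VARCHAR(15) NOT NULL,
        year INT NOT NULL,
        price INT NOT NULL,
        PRIMARY KEY (id)
    );
    CREATE INDEX vehicle_info ON Vehicles (name, year, model);
""")


def plan(sql):
    """Return SQLite's query plan for `sql` as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return "; ".join(row[-1] for row in rows)  # last column is the plan detail


where_equal = "WHERE name = 'Nissan' AND model = 'modelA' AND year = 2018"
where_range = "WHERE name = 'Nissan' AND model = 'modelA' AND year > 2018"

# Every selected column lives in the index.
covered = plan("SELECT name, model, year FROM Vehicles " + where_equal)
# price is not in the index, so the table rows must be read as well.
not_covered = plan("SELECT name, model, year, price FROM Vehicles " + where_equal)
# A range on year ends the index lookup before model is reached.
range_on_year = plan("SELECT name, model, year FROM Vehicles " + where_range)

print("covered:      ", covered)
print("not covered:  ", not_covered)
print("range on year:", range_on_year)
```

In the third plan the lookup constraints stop at year, which mirrors the comment on the last query above: model is still checked, but as a filter rather than as part of the index lookup.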

Final thoughts

Create indexes on columns with good selectivity.

Selectivity is defined as (number of distinct values * 100) / total number of values. For example, an index on a Gender column has poor selectivity because the column typically holds only two distinct values. Suppose there are 10,000 records. Then the selectivity of this column would be (2 * 100) / 10,000 = 0.02%.
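
The arithmetic is easy to check with a few lines of Python. This is only an illustration of the formula above: the function name selectivity_percent and the sample data are invented for the example.

```python
def selectivity_percent(values):
    """Selectivity = (number of distinct values * 100) / total number of values."""
    values = list(values)
    if not values:
        raise ValueError("selectivity is undefined for an empty column")
    return len(set(values)) * 100 / len(values)


# Made-up sample data: 10,000 rows with two distinct gender values,
# and 10,000 rows where every id is unique.
gender_column = ["M", "F"] * 5000
id_column = range(10_000)

print(selectivity_percent(gender_column))  # 0.02
print(selectivity_percent(id_column))      # 100.0
```

A unique column such as id reaches the maximum of 100%, which is why primary keys make such effective indexes.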

The more distinct values a column has, the better an index on it will perform. For a composite index, try to keep the columns with high selectivity on the left (at the start).

Avoid redundant indexes.

For example, INDEX(A, B) and INDEX(A) are redundant because A is the leftmost column in both: any query that can use INDEX(A) can also use INDEX(A, B), so INDEX(A) can be removed.
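
A quick way to spot such pairs is to look for an index whose column list is a leading prefix of another index on the same table. The sketch below is a hypothetical helper, not a feature of any database: the function name, the index names, and the dictionary layout are all assumptions. It also ignores unique indexes, which may need to stay in place to enforce a constraint, and it does not flag exact duplicates.

```python
def redundant_indexes(indexes):
    """Return names of indexes whose columns are a leading prefix of a longer index.

    `indexes` maps an index name to its ordered tuple of columns.
    """
    redundant = []
    for name, columns in indexes.items():
        for other_name, other_columns in indexes.items():
            is_strict_prefix = (
                name != other_name
                and len(columns) < len(other_columns)
                and other_columns[: len(columns)] == columns
            )
            if is_strict_prefix:
                redundant.append(name)
                break  # one longer index is enough to make this one redundant
    return redundant


# Hypothetical indexes on a single table.
table_indexes = {
    "idx_a": ("A",),
    "idx_a_b": ("A", "B"),
    "idx_b": ("B",),
}

print(redundant_indexes(table_indexes))  # ['idx_a']
```

Note that idx_b is not reported: B is not the leftmost column of idx_a_b, so queries that filter only on B still need their own index.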
