Filtering with "loc" and "iloc" Methods

Learn about the loc and iloc methods of Pandas.

The loc and iloc methods

The loc and iloc methods are essential Pandas methods used for filtering, selecting, and manipulating data. They allow us to access the desired combination of rows and columns.

The main difference between them is the way they access rows and columns:

  • loc uses row and column labels.
  • iloc uses integer row and column positions, counted from 0.

Consider the DataFrame shown in the following illustration. To access the rows and columns using the iloc method, we use the integer positions (0, 1, and 2). If we want to use the loc method, we need to use the labels (A, Day 1, and so on).
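
As a minimal sketch of the same idea in code, the snippet below builds a small DataFrame with labels like those mentioned above. The values and the exact set of labels (columns A, B, C and rows Day 1 to Day 3) are assumptions made for illustration, not the data from the illustration.

```python
import pandas as pd

# Hypothetical data: the values and the full set of labels are made up.
df = pd.DataFrame(
    {"A": [10, 20, 30], "B": [40, 50, 60], "C": [70, 80, 90]},
    index=["Day 1", "Day 2", "Day 3"],
)

# loc: select by label (row "Day 2", column "B").
by_label = df.loc["Day 2", "B"]

# iloc: select by integer position (second row, second column).
by_position = df.iloc[1, 1]

print(by_label, by_position)  # both are 50
```

Both expressions point at the same cell. Only the way of addressing it differs.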

Let’s first use the loc method to select the first five rows and two columns of the sales DataFrame.

import pandas as pd
sales = pd.read_csv("sales.csv")
print(sales.loc[:4, ["product_code","product_group"]])

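With loc, :4 is equivalent to 0:4 and selects the rows labeled 0 through 4. Unlike ordinary Python slicing, a loc slice includes its end label, so we get five rows. The column names are passed as a list. Let’s do a similar operation using the iloc method, this time selecting rows 5 through 8 and the first two columns.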

import pandas as pd
sales = pd.read_csv("sales.csv")
print(sales.iloc[[5,6,7,8], [0,1]])
print(sales.iloc[5:9, :2])

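Both print statements select the same rows and columns. The slice 5:9 is more convenient than passing the positions in a list as [5,6,7,8], though. Note that an iloc slice excludes its end position, so 5:9 stops at row 8 and :2 selects the columns at positions 0 and 1.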
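
The difference in how the two methods treat the end of a slice is easy to miss, so here is a short sketch of it. The demo DataFrame is a hypothetical stand-in for the sales data (made-up values, default integer index), so the snippet runs without sales.csv.

```python
import pandas as pd

# Hypothetical stand-in for the sales data: ten rows, default integer index.
demo = pd.DataFrame(
    {
        "product_code": range(100, 110),
        "product_group": list("ABCDEABCDE"),
    }
)

with_loc = demo.loc[:4]    # label slice: includes the row labeled 4
with_iloc = demo.iloc[:4]  # position slice: stops before position 4

print(len(with_loc))   # 5
print(len(with_iloc))  # 4
```

The same slice text returns five rows with loc and four rows with iloc, which is worth keeping in mind when switching between them.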

Notice that we used integers to select rows with both loc and iloc. This is because Pandas assigns integer labels (0, 1, 2, and so on) to rows by default. Unless we specify otherwise, row positions and row labels are the same. Let’s create a DataFrame with a custom index to see the difference more clearly.

import numpy as np
import pandas as pd
df = pd.DataFrame(
    np.random.randint(10, size=(4,4)),
    index = ["a","b","c","d"],
    columns = ["col_a","col_b","col_c","col_d"]
)
print(df)
print("\nSelect two rows and two columns using loc:")
print(df.loc[["b","d"], ["col_a","col_c"]])
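
For comparison, the same selection can be sketched with iloc. Because the index now holds the strings a to d, the rows and columns have to be addressed by integer position instead. The snippet rebuilds the DataFrame so that it runs on its own. The names by_label and by_position are introduced here only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.random.randint(10, size=(4, 4)),
    index=["a", "b", "c", "d"],
    columns=["col_a", "col_b", "col_c", "col_d"],
)

# Rows "b" and "d" sit at positions 1 and 3.
# Columns "col_a" and "col_c" sit at positions 0 and 2.
by_label = df.loc[["b", "d"], ["col_a", "col_c"]]
by_position = df.iloc[[1, 3], [0, 2]]

print("Select the same two rows and two columns using iloc:")
print(by_position)
print(by_label.equals(by_position))  # True
```

With this custom index, labels and positions no longer coincide: passing the integers 1 and 3 to loc would raise a KeyError, because no row carries those labels.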

The loc and iloc methods are frequently used for selecting or extracting part of a DataFrame. The main difference is that loc works with labels, whereas iloc works with integer positions.