How to use the loc and iloc indexers on a DataFrame in pandas

What is a DataFrame?

A DataFrame is the two-dimensional, labeled data structure provided by pandas. It is a table of rows and columns in which each column can hold a different data type.

Figure: an example DataFrame

Requirements

To work with DataFrames, import the pandas library, conventionally under the alias pd.

import pandas as pd

Code

A DataFrame can be created from a dictionary that maps column names to lists of values, as shown below.

Example

In this example, we create a DataFrame of countries. Each country is assigned to a group and given two scores, a_score and b_score.

Both scores are imaginary values for this example.

import pandas as pd

# Imaginary scores for ten countries
a_score = [4, 5, 7, 8, 2, 3, 1, 6, 9, 10]
b_score = [1, 2, 3, 4, 5, 6, 7, 10, 8, 9]
country = ['Pakistan', 'USA', 'Canada', 'Brazil',
           'India', 'Belgium', 'Malaysia', 'Peru',
           'England', 'Scotland']
groups = ['A', 'A', 'B', 'A', 'B', 'B', 'C', 'A', 'C', 'C']

# Each dictionary key becomes a column name
df = pd.DataFrame({'group': groups,
                   'country': country,
                   'a_score': a_score,
                   'b_score': b_score})
print(df)

The loc and iloc indexers

loc and iloc select rows and columns of a DataFrame. They are indexers rather than ordinary functions, so they are used with square brackets, not parentheses.

  • loc[]: selection by labels

  • iloc[]: selection by positions


When you slice, loc includes the end label, while iloc excludes the end position, as ordinary Python slicing does.
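The difference between labels and positions is easiest to see when the index is not the default 0, 1, 2, and so on. The following is a minimal sketch; the small DataFrame named scores and its string index are made up for illustration and are not part of the data used elsewhere in this article.

```python
import pandas as pd

# Hypothetical data: the index labels are short country codes, not integers.
scores = pd.DataFrame(
    {"a_score": [4, 5, 7, 8], "b_score": [1, 2, 3, 4]},
    index=["PK", "US", "CA", "BR"],
)

# loc slices by label and includes the end label "CA".
by_label = scores.loc["PK":"CA"]

# iloc slices by position and excludes the end position 2.
by_position = scores.iloc[0:2]

print(by_label)
print(by_position)
```

Here by_label contains the rows PK, US, and CA, while by_position contains only PK and US.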

Syntax

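Both indexers take a row selector, optionally followed by a column selector, inside square brackets: df.loc[rows, columns] and df.iloc[rows, columns]. For example: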


df.loc[3:, ['country', 'a_score']]
df.iloc[2:, 3:]

Parameters

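  • loc: the row and column labels to select, given as a single label, a list of labels, a slice of labels, or a boolean array

  • iloc: the row and column positions to select, given as a single integer, a list of integers, a slice of integers, or a boolean array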
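The selectors can take several forms. The sketch below shows a few of them side by side, assuming a three-row DataFrame with the default integer index, where row labels and row positions happen to coincide. All variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "group": ["A", "A", "B"],
    "country": ["Pakistan", "USA", "Canada"],
    "a_score": [4, 5, 7],
    "b_score": [1, 2, 3],
})

# loc: select by label
single_label = df.loc[1, "country"]                  # one row label, one column label
label_list = df.loc[[0, 2], ["country", "a_score"]]  # lists of labels
label_slice = df.loc[0:1, "country":"b_score"]       # label slices, both ends included
masked = df.loc[df["a_score"] > 4, "country"]        # boolean mask on the rows

# iloc: select by integer position
single_pos = df.iloc[1, 1]                           # row 1, column 1
pos_list = df.iloc[[0, 2], [1, 2]]                   # lists of positions
pos_slice = df.iloc[0:2, 1:4]                        # position slices, ends excluded

print(label_slice)
print(pos_slice)
```

With this data, each iloc selection returns the same values as the loc selection above it in the same form.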

Return value

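Both indexers return the selected data: a DataFrame when several rows and columns are selected, a Series when a single row or a single column is selected, and a scalar value when a single cell is selected.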
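The type of the result depends on the shape of the selection. A minimal sketch, assuming a small two-column DataFrame created only for this illustration:

```python
import pandas as pd

df = pd.DataFrame({
    "country": ["Pakistan", "USA", "Canada"],
    "a_score": [4, 5, 7],
})

cell = df.loc[0, "country"]    # single row and single column: a scalar
row = df.iloc[0]               # single row: a Series
column = df.loc[:, "a_score"]  # single column: a Series
table = df.iloc[0:2, :]        # slices on both axes: a DataFrame
one_row_table = df.iloc[[0]]   # a list with one position: still a DataFrame

for name, value in [("cell", cell), ("row", row), ("column", column),
                    ("table", table), ("one_row_table", one_row_table)]:
    print(name, type(value).__name__)
```

Passing a list, even one with a single element, keeps that axis in the result, which is useful when later code expects a DataFrame.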

Example

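The example below selects the first 3 rows and the last 2 columns, once with loc and once with iloc. Note that loc uses :2 because the end label is included, while iloc uses :3 because the end position is excluded.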

import pandas as pd

a_score = [4, 5, 7, 8, 2, 3, 1, 6, 9, 10]
b_score = [1, 2, 3, 4, 5, 6, 7, 10, 8, 9]
country = ['Pakistan', 'USA', 'Canada', 'Brazil',
           'India', 'Belgium', 'Malaysia', 'Peru',
           'England', 'Scotland']
groups = ['A', 'A', 'B', 'A', 'B', 'B', 'C', 'A', 'C', 'C']
df = pd.DataFrame({'group': groups,
                   'country': country,
                   'a_score': a_score,
                   'b_score': b_score})

# loc: rows labeled 0 through 2 (inclusive), last two columns by name
print("loc")
print(df.loc[:2, ['a_score', 'b_score']])

# iloc: row positions 0 through 2 (3 is excluded), columns from position 2 onward
print("iloc")
print(df.iloc[:3, 2:])
