...

An Introduction to pandas

Explore the basics of the pandas library.

In data science tasks in Python, we often use the pandas library to manipulate, explore, and analyze data. Here, we give a brief introduction to the fundamentals of pandas before getting into the practice challenges.

The DataFrame data structure

A DataFrame is a 2D table of data with rows and columns, like a spreadsheet. Each column in a DataFrame is a Series, and each row has a label; together, the row labels form the index. Let's see how we can define a DataFrame using a dictionary.

import pandas as pd
import numpy as np

# First, we create a dictionary
data = {
    'ids': ["STD_2_145", "STD_2_236", "STD_2_390",
            "STD_2_487", "STD_2_569",
            "STD_2_672", "STD_2_789",
            "STD_2_812", "STD_2_951",
            "STD_2_603"],
    'science': [70, 78, 82, np.nan, 82, 76, 71, 67, 95, 79],
    'english': [64, 75, 43, 76, 42, 77, 88, 56, 87, 90],
    'math': [87, 56, 68, 94, 76, 71, 64, 60, 89, 93],
    'previous result': ['78%', '75%', '59%', '85%', '70%',
                        '75%', '60%', '', '76%', '70%']
}

result = pd.DataFrame(
    data,
    columns=['ids', 'science', 'english', 'math', 'previous result'])

print(type(data))
print(type(result))
print("----------------------")
print(result.dtypes)

This code uses the pandas library to create a DataFrame named result from the dictionary data. The dictionary has five keys, each of which becomes a column in the DataFrame: ids, science, english, math, and previous result. Together, these columns hold student information: IDs, grades in science, English, and math, and each student's previous overall result.
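To tie this back to the structure described earlier, the short sketch below builds a smaller DataFrame from a dictionary. It then checks that a single column is a Series and that the rows receive a default integer index. The three-row `scores` dictionary and the names `small` and `math_column` are illustrative assumptions, a shortened stand-in for the data above rather than part of the original example.

```python
import pandas as pd

# Illustrative, shortened version of the dictionary used above
scores = {
    'ids': ["STD_2_145", "STD_2_236", "STD_2_390"],
    'english': [64, 75, 43],
    'math': [87, 56, 68],
}
small = pd.DataFrame(scores)

# Selecting one column by its key returns a Series
math_column = small['math']
print(type(math_column))

# With no index specified, pandas labels the rows 0, 1, 2, ...
print(list(small.index))

# The dictionary keys become the column names, in insertion order
print(list(small.columns))
```

Because the keys already appear in the desired order, passing `columns=` to `pd.DataFrame` is optional here; it is mainly useful for selecting or reordering columns.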

The type() function confirms that data is a dictionary (dict) and that result is a pandas DataFrame. The dtypes attribute then shows the data type of each column in the DataFrame. ...
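As a small sketch of what dtypes reports, the example below uses a shortened, illustrative version of the dictionary (the `scores` and `small` names are assumptions, not part of the original code). It shows how a missing value and text entries affect the column types. The exact dtype name printed for the text column can differ between pandas versions, so the checks rely on the helper functions in `pd.api.types` instead of a specific name.

```python
import numpy as np
import pandas as pd

# Illustrative, shortened version of the dictionary used above
scores = {
    'science': [70, np.nan, 82],            # contains a missing value
    'english': [64, 75, 43],                # whole numbers only
    'previous result': ['78%', '', '59%'],  # percentages stored as text
}
small = pd.DataFrame(scores)

# One dtype per column
print(small.dtypes)

# np.nan is a float, so the whole 'science' column is stored as floats
print(pd.api.types.is_float_dtype(small['science']))

# 'english' has no missing values and stays an integer column
print(pd.api.types.is_integer_dtype(small['english']))

# 'previous result' holds text, so it is not a numeric column
print(pd.api.types.is_numeric_dtype(small['previous result']))
```

The text column would need to be converted (for example, by stripping the % sign and handling the empty string) before it could be used in numeric calculations.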