Useful Methods and Operations
Let’s learn about some useful pandas methods and how to select data from DataFrames.
Useful methods
There are many methods to explore in pandas, including describe(), which gives basic statistics on our data. We have already covered some of them, like head(), isnull(), dropna(), fillna(), and others.
In this lesson, we’ll explore some more general-purpose operations and revise what we’ve learned in the previous lessons.
Let’s create a DataFrame to get hands-on experience with these operations. We’ll repeat some values and also generate NaN in our DataFrame.
import numpy as np
import pandas as pd

# Create the data dictionary
data_dict = {
    'col_1': [1, 2, 3, 4, 5],
    'col_2': [111, 222, 333, 111, 555],
    'col_3': ['alpha', 'bravo', 'charlie', np.nan, np.nan],
}

# Create a DataFrame from data_dict
df = pd.DataFrame(data_dict, index=[1, 2, 3, 4, 5])
print(df)  # let's have a look at what the DataFrame looks like!
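As a quick illustration of the describe() method mentioned above, the sketch below rebuilds the same DataFrame and summarizes it. The variable name summary is only illustrative. With default arguments, describe() reports on the numeric columns only, so col_3 does not appear in the result.

```python
import numpy as np
import pandas as pd

# Same data as the lesson's DataFrame
df = pd.DataFrame(
    {
        "col_1": [1, 2, 3, 4, 5],
        "col_2": [111, 222, 333, 111, 555],
        "col_3": ["alpha", "bravo", "charlie", np.nan, np.nan],
    },
    index=[1, 2, 3, 4, 5],
)

# Basic statistics (count, mean, std, min, quartiles, max)
# for the numeric columns col_1 and col_2
summary = df.describe()
print(summary)
```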
The info() method
The info() method provides a concise summary of a DataFrame: its index range, column names, non-null counts, data types, and memory usage. We’ll use this method often in the course.
df.info()  # info() prints the summary itself and returns None, so no print() is needed
The head(n) method
The head(n) method returns the first n rows of a DataFrame. The default value of n is 5. This method is very useful for getting a quick overview of our data. We’ll use it often in the course.
# Default call: returns the first 5 rows
print(df.head())
print()

# Call with a parameter: returns the first 2 rows
print(df.head(2))
The isnull() method
The isnull() method returns a boolean DataFrame of the same shape as the original, with True wherever a value is missing (such as NaN) and False everywhere else.
print(df.isnull())
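Because the result of isnull() is boolean, it can be summed to count missing values. The following is a minimal sketch using the same data as above; the variable names null_counts and total_nulls are only illustrative.

```python
import numpy as np
import pandas as pd

# Same data as the lesson's DataFrame
df = pd.DataFrame(
    {
        "col_1": [1, 2, 3, 4, 5],
        "col_2": [111, 222, 333, 111, 555],
        "col_3": ["alpha", "bravo", "charlie", np.nan, np.nan],
    },
    index=[1, 2, 3, 4, 5],
)

# True counts as 1 and False as 0, so sum() gives
# the number of missing values in each column
null_counts = df.isnull().sum()
print(null_counts)

# Summing again gives the total for the whole DataFrame
total_nulls = int(null_counts.sum())
print(total_nulls)
```

Here only col_3 contains missing values, so it is the only column with a non-zero count.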
The dropna() method
The dropna() method removes the rows that contain missing (NaN) values.
By default, this method works with the following conditions in place:
axis = 0 (rows) or 1 (columns); 0 is the default
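To make the axis setting concrete, here is a minimal sketch that applies dropna() to the same data along each axis. The names rows_dropped and cols_dropped are only illustrative.

```python
import numpy as np
import pandas as pd

# Same data as the lesson's DataFrame
df = pd.DataFrame(
    {
        "col_1": [1, 2, 3, 4, 5],
        "col_2": [111, 222, 333, 111, 555],
        "col_3": ["alpha", "bravo", "charlie", np.nan, np.nan],
    },
    index=[1, 2, 3, 4, 5],
)

# axis=0 (the default): drop every row that contains a missing value
rows_dropped = df.dropna()
print(rows_dropped)

# axis=1: drop every column that contains a missing value
cols_dropped = df.dropna(axis=1)
print(cols_dropped)
```

With axis=0, the rows labeled 4 and 5 are removed because col_3 is NaN there. With axis=1, col_3 is removed instead and all five rows are kept. In both cases dropna() returns a new DataFrame and leaves df unchanged.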