The pandas library in Python is a powerful tool for data analysis. In this shot, we will go over three ways to use the library to create a new column in an existing data frame.
The simplest way to add a new column to an existing pandas data frame is to index the data frame with the new column's name and assign a list to it. The list must contain one value per row:
import pandas as pd

# Create a new DataFrame
df = pd.DataFrame({
    'Name': ['Ali', 'Aqsa', 'Armaan', 'Arij'],
    'Age': [34, 26, 56, 44],
    'Position': ['Senior Engineer', 'Junior Engineer', 'HR Officer', 'COO']
})

print("Dataframe before adding new column:")
print(df)

# Adding salary column by indexing and assigning a list
df['salary'] = [200000, 70000, 110000, 670000]

print("Dataframe after adding new column:")
print(df)
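Indexing and assigning also works with values other than a full list. The sketch below is illustrative only: the `country` and `age_in_months` columns and their values are hypothetical and not part of the example above. It shows a scalar being broadcast to every row, a column derived from an existing one, and the `ValueError` that pandas raises when the list length does not match the number of rows.

```python
import pandas as pd

# Small frame reusing two rows from the example above
df = pd.DataFrame({'Name': ['Ali', 'Aqsa'], 'Age': [34, 26]})

# A scalar is broadcast to every row (hypothetical column)
df['country'] = 'PK'

# A new column can be computed from an existing one (hypothetical column)
df['age_in_months'] = df['Age'] * 12

# A list whose length differs from the row count is rejected
length_error = None
try:
    df['salary'] = [200000]  # one value for two rows
except ValueError as err:
    length_error = str(err)

print(df)
print("Error raised:", length_error)
```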
Another way to add a column is the built-in assign method, which returns a new data frame containing the added column and leaves the original data frame unchanged. The Python code below shows how this can be done:
import pandas as pd

# Create a new DataFrame
df = pd.DataFrame({
    'Name': ['Ali', 'Aqsa', 'Armaan', 'Arij'],
    'Age': [34, 26, 56, 44],
    'Position': ['Senior Engineer', 'Junior Engineer', 'HR Officer', 'COO']
})

print("Dataframe before adding new column:")
print(df)

# Adding salary column using the assign method
df2 = df.assign(salary=[200000, 70000, 110000, 670000])

print("Dataframe after adding new column:")
print(df2)
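Because assign returns a new data frame, the original is not modified, and several columns can be added in one call. The following is a minimal sketch: the `is_senior` column and its age threshold of 40 are assumptions made up for illustration. It also passes a callable, which assign evaluates against the data frame it is called on.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Ali', 'Aqsa', 'Armaan', 'Arij'],
    'Age': [34, 26, 56, 44],
})

# Add two columns at once; the lambda receives the data frame itself
df2 = df.assign(
    salary=[200000, 70000, 110000, 670000],
    is_senior=lambda d: d['Age'] > 40,  # hypothetical rule, for illustration
)

print(list(df.columns))   # the original frame keeps its two columns
print(list(df2.columns))  # the new frame has both added columns
```

If the original variable should hold the result, reassign it: `df = df.assign(...)`.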
The insert method is another useful data frame method for creating a new column. Unlike the previous techniques, which simply append a column to the end of the data frame, the insert method lets you place the new column at any position by passing a zero-based column index. It modifies the data frame in place. Here's how the method is used:
import pandas as pd

# Create a new DataFrame
df = pd.DataFrame({
    'Name': ['Ali', 'Aqsa', 'Armaan', 'Arij'],
    'Age': [34, 26, 56, 44],
    'Position': ['Senior Engineer', 'Junior Engineer', 'HR Officer', 'COO']
})

print("Dataframe before adding new column:")
print(df)

# Adding salary column at index 1 (the second column) using the insert method
df.insert(1, "salary", [200000, 70000, 110000, 670000])

print("Dataframe after adding new column:")
print(df)
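Two details of insert are easy to miss: it changes the data frame in place and returns None, and by default it raises a `ValueError` if a column with the same name already exists. The sketch below illustrates both, and uses `df.columns.get_loc` to compute the position from a column name instead of hard-coding it; placing salary directly after Age is an arbitrary choice for this example.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Ali', 'Aqsa', 'Armaan', 'Arij'],
    'Age': [34, 26, 56, 44],
    'Position': ['Senior Engineer', 'Junior Engineer', 'HR Officer', 'COO'],
})

# Compute the position right after the 'Age' column
position = df.columns.get_loc('Age') + 1

# insert works in place, so there is nothing useful to capture
result = df.insert(position, 'salary', [200000, 70000, 110000, 670000])
print(result)
print(list(df.columns))

# Inserting a column name that already exists is rejected by default
duplicate_error = None
try:
    df.insert(0, 'salary', [0, 0, 0, 0])
except ValueError as err:
    duplicate_error = str(err)

print("Error raised:", duplicate_error)
```

Passing `allow_duplicates=True` to insert permits a repeated column name, although duplicate names usually make later selection by name ambiguous.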