Indexing and Selection

This lesson will focus on how to view, add, and rename columns of a Pandas dataframe.

We'll cover the following...

- Columns
  - Selecting columns
  - Changing column names
- Rows
- Indexing both rows and columns
- Setting values
- Series vs Dataframe

Indexing is the technique of efficiently retrieving records from data based on the criteria by which the data is arranged. As we saw in the previous lesson, a dataframe organizes data in rows and columns, so we can index the data using the positions and names of these rows and columns. Now let’s see how to select rows and columns from the data.
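As a rough sketch of the two styles mentioned above, the following example selects the same column once by name and once by position. The small dataframe and its values are made up for illustration (it is not housing.csv), and .iloc is shown as one standard way to select by position.

```python
import pandas as pd

# Hypothetical sample data, made up for this sketch (not read from housing.csv)
df = pd.DataFrame({
    "longitude": [-122.23, -122.22, -122.24],
    "latitude": [37.88, 37.86, 37.85],
    "population": [322, 2401, 496],
})

# Select a column by its name
by_name = df["population"]

# Select the same column by its position (third column, so position 2)
by_position = df.iloc[:, 2]

# Both selections contain the same values
print(by_name.equals(by_position))
```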

Columns

To view the names of the columns, we use df.columns.values, which returns them as a NumPy array. We will be using the file housing.csv.

In Machine Learning terminology, a column in a spreadsheet is referred to as a feature, while in Statistics it is referred to as a variable. It is also referred to as an attribute. We will be using all of these terms interchangeably in this course.

import pandas as pd
df = pd.read_csv('housing.csv')
# Print the column names
print(df.columns.values)
# Count the columns
num = len(df.columns)
print("number of columns:", num)
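For comparison, here is a minimal sketch of a few equivalent ways to look at column names. It builds a tiny dataframe by hand instead of reading housing.csv, so the values are assumptions made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for housing.csv (made-up values)
df = pd.DataFrame({
    "longitude": [-122.23, -122.22],
    "latitude": [37.88, 37.86],
    "population": [322, 2401],
})

cols_index = df.columns           # pandas Index object
cols_array = df.columns.values    # NumPy array of the names
cols_list = df.columns.tolist()   # plain Python list of the names

print(type(cols_index).__name__)
print(type(cols_array).__name__)
print(cols_list)

# df.shape is (rows, columns), so shape[1] is another way to count columns
print("number of columns:", df.shape[1])
```

Which form to use is mostly a matter of convenience: the list is handy for ordinary Python code, while the Index is what Pandas itself works with.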

Selecting columns

Let’s see how we can select the data of a few columns of housing.csv.

import pandas as pd
df = pd.read_csv('housing.csv')
# Select a single column by its name
new_df = df['population']
print(new_df.head())
# Make a list of the columns to select
col_to_select = ['longitude', 'latitude', 'population', 'ocean_proximity']
# Collect the specified columns and print them
new_df = df[col_to_select]
print('\n')
print(new_df.head())

To view the values in a single column, we index the dataframe with the column's name, as in df['population']. To select several columns, we first put their names in a list, col_to_select, and then pass that list to df. This retrieves only those columns from the dataframe df, and we save the result as a new dataframe called new_df. The final print statement shows the head of new_df, that is, its first five rows.
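One detail worth checking for yourself is the type of each result: indexing with a single name gives a Series, while indexing with a list of names gives a dataframe, even when the list has only one name. The sketch below illustrates this on a small hand-built dataframe; its values are made up and only the column names mirror housing.csv.

```python
import pandas as pd

# Hypothetical sample rows (made-up values, not read from housing.csv)
df = pd.DataFrame({
    "longitude": [-122.23, -122.22, -122.24],
    "latitude": [37.88, 37.86, 37.85],
    "population": [322, 2401, 496],
    "ocean_proximity": ["NEAR BAY", "NEAR BAY", "NEAR BAY"],
})

one_col = df["population"]                 # single name -> Series
one_col_df = df[["population"]]            # list with one name -> dataframe
subset = df[["longitude", "population"]]   # list of names -> dataframe

print(type(one_col).__name__, one_col.shape)
print(type(one_col_df).__name__, one_col_df.shape)
print(type(subset).__name__, subset.shape)
```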

Changing column names

...
