Explore the Dataset

Learn about customer churn by exploring a sample dataset.

In this lesson, we’ll explore the dataset used to predict customer churn. We will use a semi-processed Telco dataset (telco_customer_churn.csv) that contains the subscription details of 7,043 customers. The dataset has 20 features (both numerical and categorical), and the target label is the Churn column, which indicates whether a customer terminates the contract with the Telco company in the following month. Let’s get familiar with the dataset.

Feature Details

| Feature | Data Type | Details |
| --- | --- | --- |
| CustomerID | string | Identifier of a customer |
| Gender | string | Indicates the gender of the customer (Male, Female) |
| SeniorCitizen | string | Indicates if the customer is above 65 (Yes, No) |
| Partner | string | Indicates if the customer has a partner (Yes, No) |
| Dependents | string | Indicates if the customer has dependents (Yes, No) |
| Tenure | int64 | Number of months the customer has been with the company |
| PhoneService | string | Whether the customer subscribes to phone service (Yes, No) |
| MultipleLines | string | Whether the customer has multiple phone lines (Yes, No) |
| InternetService | string | Whether the customer subscribes to internet service (No, DSL, Fiber optic) |
| OnlineSecurity | string | Whether the customer subscribes to online security (No, No internet service, Yes) |
| OnlineBackup | string | Whether the customer subscribes to online backups (No, No internet service, Yes) |
| DeviceProtection | string | Indicates if the customer subscribes to device protection (No, No internet service, Yes) |
| TechSupport | string | Indicates if the customer subscribes to tech support (No, No internet service, Yes) |
| StreamingTV | string | Indicates if the customer has a TV streaming service (No, No internet service, Yes) |
| StreamingMovies | string | Indicates if the customer has a movie streaming service (No, No internet service, Yes) |
| Contract | string | Indicates the customer’s contract type (Month-to-Month, One Year, Two Year) |
| PaperlessBilling | string | Indicates if the customer opted for paperless billing (Yes, No) |
| PaymentMethod | string | Customer’s payment method (4 types) |
| MonthlyCharges | float64 | Customer’s total monthly service charges |
| TotalCharges | float64 | Customer’s total quarterly charges |
| Churn | int64 | Whether the customer terminated the subscription this quarter (0, 1) |
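Once a dataset like this is loaded, it can be useful to confirm that the columns actually have the types listed in the feature table. The following is a minimal sketch of such a check, not part of the lesson's own code: the helper name find_schema_mismatches, the EXPECTED_KINDS mapping (which covers only a few of the columns), and the two toy rows are all made up for illustration and are not records from telco_customer_churn.csv.

```python
import pandas as pd
from pandas.api import types as ptypes

# Expected kinds for a handful of columns, copied from the feature table.
# (Hypothetical mapping for illustration; extend it to all columns as needed.)
EXPECTED_KINDS = {
    "CustomerID": "string",
    "Tenure": "int",
    "MonthlyCharges": "float",
    "TotalCharges": "float",
    "Churn": "int",
}

# Map each kind to the pandas dtype predicate that tests for it.
CHECKS = {
    "string": ptypes.is_string_dtype,
    "int": ptypes.is_integer_dtype,
    "float": ptypes.is_float_dtype,
}


def find_schema_mismatches(df, expected=EXPECTED_KINDS):
    """Return a list of problems; the list is empty if every column matches."""
    problems = []
    for column, kind in expected.items():
        if column not in df.columns:
            problems.append(f"{column}: missing")
        elif not CHECKS[kind](df[column]):
            problems.append(f"{column}: expected {kind}, got {df[column].dtype}")
    return problems


# Toy rows invented for this example (NOT real records from the dataset).
toy = pd.DataFrame({
    "CustomerID": ["A-001", "A-002"],
    "Tenure": [1, 34],
    "MonthlyCharges": [29.85, 56.95],
    "TotalCharges": ["29.85", "1889.5"],  # deliberately stored as text
    "Churn": [0, 1],
})

problems = find_schema_mismatches(toy)
print(problems)  # reports only the TotalCharges column
```

A numeric column that arrives as text, as TotalCharges does in the toy frame, is a common reason for a check like this to fail; pd.to_numeric is one way to convert such a column before modeling.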

Loading the Dataset

Let’s load the telco_customer_churn.csv dataset and take a look at the top five records to get a better understanding of the data.

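import pandas as pd

df_telco = pd.read_csv('telco_customer_churn.csv', header=0)  # first row holds the column names
print('Dimension of the dataset:')
print(df_telco.shape)  # (rows, columns)
print('\nTop five records:')
print(df_telco.head())  # head() returns the first five rows by default
print('\nColumn types:')
df_telco.info()  # info() prints its summary itself and returns None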

Explanation

  • Line 1 imports the pandas library.

  • Line 3 reads the Telco dataset (telco_customer_churn.csv) and creates a pandas ...