Explore the Dataset
Learn about customer churn by exploring a sample dataset.
In this lesson, we’ll explore the dataset used to predict customer churn. We will use a semiprocessed Telco dataset (telco_customer_churn.csv) that comprises the subscription details of 7043 customers. The dataset has 20 features (both numerical and categorical), and our target label is the Churn column, which indicates whether a customer terminates the contract with the Telco company in the following month. Let’s get familiar with the dataset.
Feature Details
Features | Data Type | Details |
CustomerID | string | Identifier of a customer |
Gender | string | Indicates the gender of the customer (Male, Female) |
SeniorCitizen | string | Indicates if the customer is above 65 (Yes, No) |
Partner | string | Indicates if the customer has a partner (Yes, No) |
Dependents | string | Indicates if the customer has dependents (Yes, No) |
Tenure | int64 | Number of months the customer has been with the company |
PhoneService | string | Whether the customer subscribes to phone service (Yes, No) |
MultipleLines | string | Whether the customer has multiple phone lines (Yes, No) |
InternetService | string | Whether the customer subscribes to internet service (No, DSL, Fiber optic) |
OnlineSecurity | string | Whether the customer subscribes to online security (No, No internet service, Yes) |
OnlineBackup | string | Whether the customer subscribes to online backups (No, No internet service, Yes) |
DeviceProtection | string | Indicates if the customer subscribes to device protection (No, No internet service, Yes) |
TechSupport | string | Indicates if the customer subscribes to tech support (No, No internet service, Yes) |
StreamingTV | string | Indicates if the customer has a TV streaming service (No, No internet service, Yes) |
StreamingMovies | string | Indicates if the customer has a movie streaming service (No, No internet service, Yes) |
Contract | string | Indicates the customer’s contract type (Month-to-Month, One Year, Two Year) |
PaperlessBilling | string | Indicates if the customer opted for paperless billing (Yes, No) |
PaymentMethod | string | Customer's payment method (4 types) |
MonthlyCharges | float64 | Customer's total monthly service charges |
TotalCharges | float64 | Customer's total quarterly charges |
Churn | int64 | Whether the customer terminated the subscription this quarter (0, 1) |
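The table above can double as a lightweight schema check once the data is loaded. The following is a minimal sketch, not part of the lesson's own code: the helper name find_schema_problems and the tiny sample frame are hypothetical, and the column names are assumed to match the table exactly (the header in the actual CSV file may use different casing).

```python
import pandas as pd
from pandas.api.types import is_float_dtype, is_integer_dtype

# Expected kinds for the numeric columns, taken from the feature table.
NUMERIC_KINDS = {
    "Tenure": is_integer_dtype,
    "MonthlyCharges": is_float_dtype,
    "TotalCharges": is_float_dtype,
    "Churn": is_integer_dtype,
}

# Allowed values for a few categorical columns, taken from the feature table.
ALLOWED_LEVELS = {
    "Partner": {"Yes", "No"},
    "InternetService": {"No", "DSL", "Fiber optic"},
    "OnlineSecurity": {"No", "No internet service", "Yes"},
}


def find_schema_problems(df):
    """Return a list of human-readable mismatches between df and the table."""
    problems = []

    # Numeric columns: present and of the documented kind (int or float).
    for column, is_expected_kind in NUMERIC_KINDS.items():
        if column not in df.columns:
            problems.append(f"{column}: missing")
        elif not is_expected_kind(df[column]):
            problems.append(f"{column}: unexpected dtype {df[column].dtype}")

    # Categorical columns: present and containing only documented values.
    for column, allowed in ALLOWED_LEVELS.items():
        if column not in df.columns:
            problems.append(f"{column}: missing")
            continue
        unexpected = set(df[column].dropna().unique()) - allowed
        if unexpected:
            problems.append(f"{column}: unexpected values {sorted(unexpected)}")

    return problems


# Made-up rows for illustration only; TotalCharges is deliberately stored
# as text here so that the check has something to report.
sample = pd.DataFrame({
    "Tenure": [1, 34],
    "MonthlyCharges": [29.85, 56.95],
    "TotalCharges": ["29.85", "1889.5"],
    "Churn": [0, 1],
    "Partner": ["Yes", "No"],
    "InternetService": ["DSL", "Fiber optic"],
    "OnlineSecurity": ["No", "Yes"],
})
print(find_schema_problems(sample))
```

An empty list means the frame agrees with the documented types and values for the columns that were checked. The remaining columns from the table could be added to the two dictionaries in the same way.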
Loading the Dataset
Let’s load the telco_customer_churn.csv dataset and take a look at the top five records to get a better understanding of the data.
import pandas as pd

df_telco = pd.read_csv('telco_customer_churn.csv', header=0)

print('Dimension of the dataset:')
print(df_telco.shape)

print('\nTop five records:')
print(df_telco.head())

print('\nColumn types:')
df_telco.info()  # info() prints its summary itself and returns None
Explanation
Line 1 imports the pandas library.
Line 3 reads the Telco dataset (telco_customer_churn.csv) and creates a pandas ...