Exploring Numerical Quantities

This lesson explores numerical quantities and shows how to identify general trends in them.

An important part of exploratory data analysis is finding general trends or patterns in the data. Relationships between pairs of quantities can be very helpful when making decisions later on. We will use the cleaned version of the dataset from the lesson Inconsistent Data. The individual columns are described below.

# Default of Credit Card Clients Dataset
# There are 25 variables:
# ID: ID of each client
# LIMIT_BAL: Amount of given credit in NT dollars (includes individual and family/supplementary credit)
# GENDER: Gender (male,female)
# EDUCATION: (1=graduate school, 2=university, 3=high school, 4=others)
# MARRIAGE: Marital status (married, single, others)
# AGE: Age in years
# PAY_1: Repayment status in September, 2005 (0=pay duly, 1=payment delay for one month, 2=payment delay for two months, ... 8=payment delay for eight months, 9=payment delay for nine months and above)
# PAY_2: Repayment status in August, 2005 (scale same as above)
# PAY_3: Repayment status in July, 2005 (scale same as above)
# PAY_4: Repayment status in June, 2005 (scale same as above)
# PAY_5: Repayment status in May, 2005 (scale same as above)
# PAY_6: Repayment status in April, 2005 (scale same as above)
# BILL_AMT1: Amount of bill statement in September, 2005 (NT dollar)
# BILL_AMT2: Amount of bill statement in August, 2005 (NT dollar)
# BILL_AMT3: Amount of bill statement in July, 2005 (NT dollar)
# BILL_AMT4: Amount of bill statement in June, 2005 (NT dollar)
# BILL_AMT5: Amount of bill statement in May, 2005 (NT dollar)
# BILL_AMT6: Amount of bill statement in April, 2005 (NT dollar)
# PAY_AMT1: Amount of previous payment in September, 2005 (NT dollar)
# PAY_AMT2: Amount of previous payment in August, 2005 (NT dollar)
# PAY_AMT3: Amount of previous payment in July, 2005 (NT dollar)
# PAY_AMT4: Amount of previous payment in June, 2005 (NT dollar)
# PAY_AMT5: Amount of previous payment in May, 2005 (NT dollar)
# PAY_AMT6: Amount of previous payment in April, 2005 (NT dollar)
# default.payment.next.month: Default payment (yes,no)

Scatter plots

Scatter plots are a very useful way of visualizing direct and inverse relationships between two variables. In a direct relationship, an increase in one quantity is accompanied by an increase in the other, and a decrease by a decrease. In an inverse relationship, an increase in one quantity is accompanied by a decrease in the other, and vice versa.
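To make the two kinds of relationship concrete, here is a minimal sketch that uses synthetic data rather than the credit card dataset. The helper names (`make_samples`, `pearson`, `plot_relationships`) and the generated values are illustrative assumptions and are not part of the lesson. The sign of the Pearson correlation coefficient is used as a quick numeric check of what the scatter plots show, and plotting assumes that matplotlib is installed.

```python
import random


def make_samples(n=200, seed=0):
    """Generate synthetic data: one direct and one inverse relationship."""
    rng = random.Random(seed)
    x = [rng.uniform(0, 100) for _ in range(n)]
    # Direct: y tends to rise as x rises (plus some random noise).
    y_direct = [2 * xi + rng.gauss(0, 10) for xi in x]
    # Inverse: y tends to fall as x rises (plus some random noise).
    y_inverse = [200 - 2 * xi + rng.gauss(0, 10) for xi in x]
    return x, y_direct, y_inverse


def pearson(a, b):
    """Pearson correlation: positive for direct, negative for inverse trends."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / (var_a * var_b) ** 0.5


def plot_relationships(x, y_direct, y_inverse):
    """Draw the two scatter plots side by side (requires matplotlib)."""
    import matplotlib.pyplot as plt  # imported here so the rest runs without it

    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
    ax1.scatter(x, y_direct, s=10)
    ax1.set_title("Direct relationship")
    ax2.scatter(x, y_inverse, s=10)
    ax2.set_title("Inverse relationship")
    for ax in (ax1, ax2):
        ax.set_xlabel("x")
        ax.set_ylabel("y")
    fig.tight_layout()
    return fig


x, y_direct, y_inverse = make_samples()
print("correlation (direct): ", round(pearson(x, y_direct), 3))
print("correlation (inverse):", round(pearson(x, y_inverse), 3))
```

Calling `plot_relationships(x, y_direct, y_inverse)` followed by `matplotlib.pyplot.show()` displays the figure: the points in the left panel rise from left to right, while those in the right panel fall.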

However, in real ...