Quantitative vs. Qualitative Data and Creating Dummies
Learn about numerical vs. categorical data, and create dummies for categorical features in the tips dataset.
Variables in a dataset can be quantitative or qualitative. So far, we have worked with several datasets whose feature variables (X) were numerical. Let's now compare numerical (discrete and continuous) features with categorical (nominal and ordinal) features.
Quantitative data
Quantitative data, also called numerical data, contains numerical variables that can be discrete or continuous.
Discrete
Discrete data can take only certain distinct values, typically whole-number counts, with nothing in between:
Students: {10, 20, 30}
Deaths: {1, 5, 6}
Patients: {100, 400, 1000}
We can't have 10.5 students or 1.5 deaths.
Continuous
Continuous data can take any value within a range, whole or fractional, so the number of possible values is potentially infinite, such as:
Weight: {1, 1.1, 3.5, 3.5555555}
Price: {10, 10.50, 50.25}
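The difference between discrete and continuous values can be illustrated with a minimal pandas sketch. The DataFrame below is hypothetical toy data that reuses the Students and Weight examples; the column names and the whole-number check are illustrative assumptions, not part of the lesson's datasets.

```python
import pandas as pd

# Hypothetical toy data reusing the examples above (not a real dataset).
df = pd.DataFrame({
    "students": [10, 20, 30],   # discrete: whole-number counts
    "weight": [1.0, 1.1, 3.5],  # continuous: any value in a range
})

# pandas infers an integer dtype for the counts and a float dtype
# for the measurements.
print(df.dtypes)

# Value-based check: a column is "whole" if every value has no fractional part.
is_whole = {col: bool((df[col] % 1 == 0).all()) for col in df.columns}
print(is_whole)
```

Note that a dtype is only a hint: a float column can still hold whole-number counts, so checking the values themselves is often more reliable than checking the dtype alone.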
Qualitative data
Qualitative data, also called categorical data, contains categorical variables that describe a characteristic or label rather than a measured amount. Categorical variables can be nominal or ordinal.
Nominal
Nominal variables have categories with no inherent order, such as:
Animal: {cat, dog}
Time: {dinner, lunch}
Blood_group: {A, B, AB, O} ...
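As a rough sketch of how a nominal variable might be represented and turned into dummies, the snippet below uses pandas on a hypothetical toy column that reuses the Time example. The DataFrame, the column name `time`, and the `time_` prefix are assumptions made for illustration; this is one possible approach, not necessarily the one this lesson uses on the tips dataset.

```python
import pandas as pd

# Hypothetical toy column reusing the Time example (not the tips dataset).
meals = pd.DataFrame({"time": ["dinner", "lunch", "dinner"]})

# Mark the column as an unordered (nominal) categorical.
meals["time"] = pd.Categorical(
    meals["time"], categories=["dinner", "lunch"], ordered=False
)

# One indicator column per category; dtype=int gives 0/1 instead of booleans.
dummies = pd.get_dummies(meals["time"], prefix="time", dtype=int)
print(dummies)
```

Because the categories are unordered, each one gets its own 0/1 column rather than a single numeric code, which avoids implying that one category is "greater" than another.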