Data Preparation
Learn how to load and prepare the dataset for training an ML model and applying hyperparameter tuning techniques.
How to prepare the dataset
In this lesson, we’ll explore how to load the dataset and prepare it for training an ML model using its default hyperparameters. We’ll then apply hyperparameter tuning techniques to improve the model’s performance.
What will we learn?
We’ll learn to:
Load the dataset.
Clean the dataset.
Perform feature engineering techniques to preprocess the dataset.
Import important packages
First, we’ll import the Python packages we need for the following tasks:
Load the dataset.
Clean the dataset.
Process the dataset using feature engineering techniques.
# import important modules
import numpy as np
import pandas as pd

# sklearn modules
from sklearn.preprocessing import MinMaxScaler

import warnings
warnings.filterwarnings("ignore")

# seeding
np.random.seed(123)
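The imports above include MinMaxScaler, which rescales each feature column to a fixed range. As a rough preview of what that does, here is a minimal sketch on a toy array. The values and the meaning of the columns are assumptions made for illustration and are not taken from loan_data.csv.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Toy feature matrix (hypothetical values, not from loan_data.csv).
# Each column is one feature on a different scale.
X = np.array([
    [2000.0, 100.0],
    [4000.0, 150.0],
    [6000.0, 300.0],
])

# The default feature_range is (0, 1): each column is rescaled as
# (x - column_min) / (column_max - column_min).
scaler = MinMaxScaler()
X_scaled = scaler.fit_transform(X)

print(X_scaled)
```

After scaling, the smallest value in each column maps to 0 and the largest maps to 1, so features with very different units become comparable.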
Load the dataset
We’ll use pandas to load the dataset from the data folder. The name of the dataset is loan_data.csv.
# load data
data_path = "loan_data.csv"
data = pd.read_csv(data_path)
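Right after loading, it can help to check the size of the table and count the missing values, since both inform the cleaning step. The sketch below assumes a small in-memory CSV in place of loan_data.csv so that it runs on its own. The column names and values are hypothetical and are not the real columns of the dataset.

```python
import io
import pandas as pd

# Hypothetical CSV text standing in for loan_data.csv
# (column names and values are made up for illustration).
csv_text = """id,income,loan_amount,approved
A001,5000,120,Y
A002,3000,,N
A003,4200,95,Y
"""

# pd.read_csv accepts a file path or any file-like object.
sample = pd.read_csv(io.StringIO(csv_text))

# Number of rows and columns in the loaded table.
n_rows, n_cols = sample.shape

# Count of missing values in each column.
missing_per_column = sample.isnull().sum()

print(n_rows, n_cols)
print(missing_per_column)
```

With the real file, the same two checks could be run on the data variable loaded above.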
Let’s see the first five rows of the dataset using the head() method ...