Data Preparation

Learn how to load and prepare the dataset for training the ML model and apply hyperparameter tuning techniques.

How to prepare the dataset

In this lesson, we’ll load the dataset and prepare it for training an ML model with its default hyperparameters. We’ll then apply hyperparameter tuning techniques to improve the model’s performance.

What will we learn?

We’ll learn to:

  • Load the dataset.

  • Clean the dataset.

  • Apply feature engineering techniques to preprocess the dataset.

Import important packages

First, we’ll import the Python packages we need for the following tasks:

  • Load the dataset.

  • Clean the dataset.

  • Process the dataset using feature engineering techniques.

# import important modules
import numpy as np
import pandas as pd
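# scikit-learn scaler that rescales features to a given range
from sklearn.preprocessing import MinMaxScaler
# suppress all warnings to keep the output readable
import warnings
warnings.filterwarnings("ignore")
# set the random seed for reproducibility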
np.random.seed(123)
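
As a rough illustration of what the imported MinMaxScaler does, the sketch below rescales a single numeric column to the range 0 to 1. The loan_amount column and its values are made up for this example and are not taken from loan_data.csv.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# hypothetical numeric feature (values are illustrative only)
toy = pd.DataFrame({"loan_amount": [1000.0, 2000.0, 5000.0]})

# by default, MinMaxScaler maps the column minimum to 0 and the maximum to 1
scaler = MinMaxScaler()
scaled = scaler.fit_transform(toy[["loan_amount"]])

print(scaled.ravel())
```

Each value is transformed as (x - min) / (max - min), so the relative spacing between values is preserved.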

Load the dataset

We’ll use the pandas read_csv() function to load the dataset from the data folder. The dataset file is named loan_data.csv.

# load data
data_path = "loan_data.csv"
data = pd.read_csv(data_path)
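
After loading, it usually helps to confirm that the file was read as expected. The sketch below runs a few common checks on a small in-memory CSV, so it works without loan_data.csv. The column names and values are hypothetical placeholders, not the real columns of the dataset.

```python
import io
import pandas as pd

# hypothetical CSV content standing in for loan_data.csv
csv_text = """loan_id,loan_amount,status
1,1000,approved
2,,rejected
3,5000,approved
"""

# read_csv accepts a file path or a file-like object such as StringIO
sample = pd.read_csv(io.StringIO(csv_text))

print(sample.shape)         # number of rows and columns
print(sample.dtypes)        # data type inferred for each column
print(sample.isna().sum())  # count of missing values per column
```

With the real dataset, the same checks can be run on the data variable created above.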

Let’s see the first five rows of the dataset using the head() method ...