Machine Learning and Imbalanced Data

Learn to deal with the class imbalance problem manually and with SMOTE.

Since we have the features and the targets from our previous lesson, let's split them into train and test datasets.

Imbalanced data

Let's also check the class imbalance for our training data.

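from sklearn.model_selection import train_test_split

# X (features) and y (targets) come from the previous lesson.
# Hold out 30% of the data for testing; the fixed seed makes the split reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=101
)

# Print the class counts, then the share of the class labeled 1 in the training set
print(y_train.value_counts(), '\n',
      "Minority class (Active) is only {} % in the training set".format(
          round(y_train.value_counts()[1] / len(y_train) * 100, 2)
      )
)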
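
A related option when splitting imbalanced data is the `stratify` argument of `train_test_split`, which asks for the same class proportions in both splits. The split above does not use it. The following is only a minimal sketch on synthetic data: the `make_classification` dataset, the roughly 95/5 class ratio, and the `*_demo` names are assumptions for illustration, not the lesson's dataset.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: roughly 95% class 0 and 5% class 1
X_demo, y_demo = make_classification(
    n_samples=2000, n_features=10, weights=[0.95, 0.05], random_state=101
)
y_demo = pd.Series(y_demo)

# stratify=y_demo asks for the same class ratio in the train and test splits
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=101, stratify=y_demo
)

# With 0/1 labels, the mean of the labels is the share of class 1
train_share = y_tr.mean() * 100
test_share = y_te.mean() * 100
print("Minority share in train: {:.2f} %".format(train_share))
print("Minority share in test:  {:.2f} %".format(test_share))
```

Without stratification, a small minority class can end up noticeably over- or under-represented in one of the splits purely by chance.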

With that, let's train a logistic regression model.

from sklearn.linear_model import LogisticRegression

# Create the model instance
logR = LogisticRegression(max_iter=10000)
# Fit the model on the training data
logR.fit(X_train, y_train)
# Accuracy score on the training data
print("Accuracy Score for (X_train, y_train):", logR.score(X_train, y_train))

The numbers look impressive with an accuracy of ~98%. The minority class is only ...
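
To see why a high accuracy can mislead on imbalanced data, it helps to compare the model against a baseline that always predicts the majority class. The sketch below is an illustration under stated assumptions: it uses synthetic data from `make_classification` (about 98% of samples in class 0) rather than the lesson's dataset, scikit-learn's `DummyClassifier` as the baseline, and hypothetical variable names.

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: about 98% class 0 and 2% class 1
X_demo, y_demo = make_classification(
    n_samples=5000, n_features=10, weights=[0.98, 0.02],
    flip_y=0, random_state=101
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=101
)

# Baseline that ignores the features and always predicts the majority class
baseline = DummyClassifier(strategy="most_frequent").fit(X_tr, y_tr)
baseline_acc = baseline.score(X_tr, y_tr)
baseline_recall = recall_score(y_tr, baseline.predict(X_tr))

# Same kind of model as in the lesson, fitted on the stand-in data
model = LogisticRegression(max_iter=10000).fit(X_tr, y_tr)
model_acc = model.score(X_tr, y_tr)
model_recall = recall_score(y_tr, model.predict(X_tr))

print("Baseline accuracy:", round(baseline_acc, 4),
      "| minority recall:", baseline_recall)
print("LogReg accuracy:  ", round(model_acc, 4),
      "| minority recall:", round(model_recall, 4))
```

The baseline never identifies a single minority sample, yet its accuracy stays close to the majority-class share. Accuracy alone therefore says little about how well the minority class is detected; minority-class recall is one metric that exposes the difference.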
