Solution: Model Evaluation

This solution evaluates three classifiers on the preprocessed real-world data, using 3-fold cross-validation and the F1 score as the evaluation metric.

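import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Load the preprocessed dataset
preprocessed = pd.read_csv("preprocessed.csv")
# Define X (model features) and y (target variable).
# X_var (list of feature column names) and y_var (target column name)
# are not defined in this snippet and are assumed to be set beforehand.
X = preprocessed[X_var]
y = preprocessed[y_var]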
# Three algorithms
classifiers = [
    LogisticRegression(penalty='l2', C=10),
    KNeighborsClassifier(
        n_neighbors=4, metric='euclidean', weights='distance'
    ),
    DecisionTreeClassifier(
        max_depth=5, min_samples_split=10
    )
]
# Import evaluation metric
from sklearn.metrics import f1_score
# Initialize k-fold cross-validation
from sklearn.model_selection import KFold
k = 3
kf = KFold(n_splits=k)
# Perform k-fold cross-validation for each model
for model in classifiers:
    # Initialize a list to store the F1 scores for each fold
    f1_scores = []
    for train_index, test_index in kf.split(X):
        X_train, X_test = X.iloc[train_index], X.iloc[test_index]
        y_train, y_test = y.iloc[train_index], y.iloc[test_index]
        
        # Train the model
        model.fit(X_train, y_train)
        
        # Calculate F1 score for the current fold
        y_test_pred = model.predict(X_test)
        f1_scores.append(f1_score(y_test, y_test_pred))
    print(f"Average F1 Score for {type(model).__name__}:", np.mean(f1_scores))
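
To make the metric in the loop above concrete, here is a minimal sketch of what a binary F1 score measures: the harmonic mean of precision and recall for the positive class. The helper name `f1_binary` and the toy labels are hypothetical and exist only for illustration; the solution itself relies on `sklearn.metrics.f1_score`, which by default also treats label 1 as the positive class.

```python
# Hypothetical helper (not part of the course solution): binary F1 from scratch.
def f1_binary(y_true, y_pred, positive=1):
    pairs = list(zip(y_true, y_pred))
    # True positives: predicted positive and actually positive
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: predicted positive but actually negative
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: predicted negative but actually positive
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # With no true positives, the score is taken to be 0
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Harmonic mean of precision and recall
    return 2 * precision * recall / (precision + recall)


# Toy labels, made up for this example
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 1]
print(f1_binary(y_true, y_pred))  # 0.75 (precision = recall = 3/4)
```

In the solution, one such score is computed per fold, and `np.mean(f1_scores)` then averages them into a single number per model.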
