Advanced Cross-Validation
Learn more advanced methods of cross-validation.
Advanced cross-validation techniques, such as k-fold and leave-one-out, provide more robust and accurate assessments of model performance in ML. These methods go beyond the basic train-test split and allow for a more comprehensive evaluation of model generalization.
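For instance, leave-one-out is the extreme case of k-fold cross-validation in which each fold contains exactly one sample. As a minimal sketch (the small synthetic dataset and Ridge model here are illustrative, not part of the example later in this lesson), scikit-learn's LeaveOneOut splitter can be passed to cross_val_score; mean squared error is used because the R2 score is undefined on a single-sample test set:

import numpy as np
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.linear_model import Ridge

# Small illustrative dataset: 50 samples, 3 features
np.random.seed(0)
X = np.random.rand(50, 3)
y = X.sum(axis=1) + 0.1 * np.random.randn(50)

# Each sample serves as the test set exactly once (50 fits in total)
loo = LeaveOneOut()
scores = cross_val_score(Ridge(alpha=1.0), X, y, cv=loo,
                         scoring="neg_mean_squared_error")
print("Average MSE across folds:", -scores.mean())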
The k-fold cross-validation technique
The k-fold cross-validation technique involves dividing the original dataset into k equally sized subsets or folds. The model is trained and evaluated k times, each time using a different fold as the test set and the remaining folds as the training set. The performance metrics obtained from each fold are then averaged to obtain an overall assessment of the model’s performance.
For example, let’s consider 5-fold cross-validation with scikit-learn:
import numpy as np
from sklearn.model_selection import KFold
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score

# Generate synthetic data
np.random.seed(42)
X = np.random.rand(1000, 10)  # Independent variables
important_features = [0, 1, 2, 3]  # Indices of important features
y = np.sum(X[:, important_features], axis=1) + 0.5 * np.random.randn(1000)  # Dependent variable

# Initialize k-fold cross-validation
k = 5
kf = KFold(n_splits=k)

# Initialize a list to store the R2 scores for each fold
r2_scores = []

# Perform k-fold cross-validation
for train_index, test_index in kf.split(X):
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]

    # Train the Ridge regression model
    model = Ridge(alpha=0)  # Alpha controls regularization strength
    model.fit(X_train, y_train)

    # Calculate R2 score for the current fold
    y_test_pred = model.predict(X_test)
    r2_scores.append(r2_score(y_test, y_test_pred))

# Print the R2 scores for each fold and their average
for i, score in enumerate(r2_scores):
    print(f"R2 Score - Fold {i+1}: {score}")
print("Average R2 Score:", np.mean(r2_scores))
Lines 13–14: We initialize the 5-fold cross-validation splitter.
Lines 20–30: We iterate over the splits so that each time we fit our model to a different training set and evaluate it on a different test set. We then store the evaluation metrics in r2_scores.
In this example, the dataset is split into five folds. The model is trained and evaluated five times, with each fold serving as the test set once. The R2 scores from the five folds are then averaged to produce a single, more reliable estimate of how well the model generalizes.
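As a side note, the manual loop above can be condensed with scikit-learn's cross_val_score helper, which handles the splitting, fitting, and scoring internally. A minimal equivalent sketch, assuming the same X, y, and Ridge model from the example above:

from sklearn.model_selection import cross_val_score
from sklearn.linear_model import Ridge

# cv=5 uses an unshuffled KFold(n_splits=5) for regressors,
# matching the manual loop above
scores = cross_val_score(Ridge(alpha=0), X, y, cv=5, scoring="r2")
for i, score in enumerate(scores):
    print(f"R2 Score - Fold {i+1}: {score}")
print("Average R2 Score:", scores.mean())

Note that KFold does not shuffle by default; when the rows of a dataset follow some order (for example, sorted by date or by label), passing KFold(n_splits=k, shuffle=True, random_state=42) usually yields more representative folds.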