Building the Model
Explore how to build a classification model using PyCaret, including training with stratified k-fold cross-validation. Learn to tune hyperparameters to improve model accuracy, make predictions on test data, visualize results with plots, and finalize and save the model for deployment. This lesson provides hands-on guidance for effective classification model development.
Creating the model
We’ll use the create_model() function to train the Linear Discriminant Analysis model because it performed best in the model comparison.
Model
Fold | Accuracy | AUC | Recall | Prec. | F1 | Kappa | MCC |
0 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
1 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
2 | 0.9167 | 1.0000 | 0.9167 | 0.9333 | 0.9153 | 0.8750 | 0.8843 |
3 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
4 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
5 | 0.9167 | 1.0000 | 0.9167 | 0.9333 | 0.9153 | 0.8750 | 0.8843 |
6 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
7 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
8 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
9 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
Mean | 0.9833 | 1.0000 | 0.9833 | 0.9867 | 0.9831 | 0.9750 | 0.9769 |
SD | 0.0333 | 0.0000 | 0.0333 | 0.0267 | 0.0339 | 0.0500 | 0.0463 |
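The Mean and SD rows summarize the ten per-fold scores. As a quick check, the sketch below recomputes both for the Accuracy column using only Python's standard library. One detail is worth noting: the table's SD of 0.0333 is reproduced by the population standard deviation (pstdev), whereas the sample standard deviation (stdev) gives a slightly larger value.

```python
from statistics import mean, pstdev, stdev

# Per-fold accuracy values copied from the table above (folds 0 to 9).
accuracy = [1.0000, 1.0000, 0.9167, 1.0000, 1.0000,
            0.9167, 1.0000, 1.0000, 1.0000, 1.0000]

acc_mean = mean(accuracy)       # average over the 10 folds
acc_sd = pstdev(accuracy)       # population SD: divides by n
acc_sd_sample = stdev(accuracy) # sample SD: divides by n - 1

print(f"Mean: {acc_mean:.4f}")
print(f"SD (population): {acc_sd:.4f}")
print(f"SD (sample): {acc_sd_sample:.4f}")
```

A small SD relative to the mean suggests that the model's performance is stable across folds rather than dependent on one favorable split.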
This function uses stratified k-fold cross-validation to evaluate model accuracy, a variation of the standard k-fold technique used in the Regression chapter. The dataset is partitioned into k subsamples (here k = 10, one per row of the table above). In each round, one subsample is retained for validation while the remaining k - 1 are used to train the model, and the process repeats until every subsample has served as the validation set once. The difference between stratified k-fold and standard k-fold is that subsamples are stratified to preserve the ...
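To make the partitioning concrete, here is a minimal pure-Python sketch of stratified k-fold splitting. It is an illustration only, not PyCaret's implementation: the function name stratified_kfold and the 120-sample label list are hypothetical, and the sketch skips the shuffling that library implementations typically offer. It deals the samples of each class round-robin across the folds, so every fold ends up with roughly the same class mix as the full dataset.

```python
from collections import Counter, defaultdict

def stratified_kfold(labels, k):
    """Yield (train_indices, validation_indices) for each of k stratified folds."""
    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Deal each class's samples round-robin so folds share the class mix.
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)

    # Each fold serves as the validation set exactly once.
    for i in range(k):
        validation = sorted(folds[i])
        train = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        yield train, validation

# Hypothetical balanced dataset: 3 classes with 40 samples each.
labels = ["a"] * 40 + ["b"] * 40 + ["c"] * 40
splits = list(stratified_kfold(labels, k=10))

# Class counts inside each validation fold.
fold_class_counts = [Counter(labels[i] for i in val) for _, val in splits]
print(fold_class_counts[0])
```

With this balanced input, every validation fold holds 12 samples, 4 from each class, and the other 108 samples form the matching training set.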