Building the Model
Learn to create a machine learning model for clustering tasks. This lesson also covers plotting elbow and PCA plots for the model and saving the model.
Creating a model
The create_model() function lets us easily create and evaluate the clustering model of our choice, such as the k-means algorithm. By default, this function creates a fixed number of clusters. We could set the num_clusters parameter to the correct number of clusters, which is known for this dataset. Instead, we'll follow an approach that generalizes to real-world datasets, where the number of clusters is typically unknown. After the function executes, several performance metrics are printed, such as the silhouette, Calinski-Harabasz, and Davies-Bouldin scores. We'll focus on the silhouette coefficient, defined in the following equation:

$$s(i) = \frac{b(i) - a(i)}{\max\{a(i),\, b(i)\}}$$

- $s(i)$ is the silhouette coefficient of the dataset instance $i$.
- $a(i)$ is the mean intra-cluster distance of $i$.
- $b(i)$ is the mean nearest-cluster distance of $i$.
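To make the definition concrete, here is a minimal sketch that computes the silhouette coefficient by hand, using the standard formula (nearest-cluster distance minus intra-cluster distance, divided by the larger of the two). It is not the lesson's own code and does not use create_model(). The function name silhouette_coefficients, the toy points, the Euclidean distance, and the convention of scoring a single-member cluster as 0 are all assumptions made for illustration.

```python
import math


def silhouette_coefficients(points, labels):
    """Return one silhouette coefficient per point (hypothetical helper)."""
    # Group point indices by their cluster label.
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette coefficient needs at least two clusters")

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            # Assumed convention: a point alone in its cluster scores 0.
            scores.append(0.0)
            continue
        # a: mean distance to the other points in the same cluster.
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the points of any other cluster.
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        # Guard against identical points, where both distances are 0.
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores


# Toy data (an assumption): two tight, well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
labels = [0, 0, 1, 1]

scores = silhouette_coefficients(points, labels)
mean_score = sum(scores) / len(scores)
print(f"mean silhouette: {mean_score:.3f}")
```

The dataset-level silhouette score reported for a model is usually the mean of these per-instance values. Scores close to 1 suggest compact, well-separated clusters, while negative scores suggest instances that sit closer to a neighboring cluster than to their own.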