Building the Model
Learn how to create, evaluate, plot, and save a machine learning model for an anomaly detection task, and how to get information about inliers and outliers.
Creating and assigning the model
We use the create_model() function to train the local outlier factor model on the Wholesale Customers dataset. After that, we assign anomaly labels and scores to the dataset using the assign_model() function.
# Creating and assigning the model
model = create_model('lof', fraction=0.05)
data_assigned = assign_model(model)
data_assigned.head(10)
  | Channel | Region | Fresh | Milk | Grocery | Frozen | Detergents_Paper | Delicassen | Anomaly | Anomaly_Score |
0 | Retail | Other | 12669 | 9656 | 7561 | 214 | 2674 | 1338 | 0 | 1.107687 |
1 | Retail | Other | 7057 | 9810 | 9568 | 1762 | 3293 | 1776 | 0 | 1.027102 |
2 | Retail | Other | 6353 | 8808 | 7684 | 2405 | 3516 | 7844 | 0 | 1.398439 |
3 | Horeca | Other | 13265 | 1196 | 4221 | 6404 | 507 | 1788 | 0 | 1.200384 |
4 | Retail | Other | 22615 | 5410 | 7198 | 3915 | 1777 | 5185 | 0 | 1.164052 |
5 | Retail | Other | 9413 | 8259 | 5126 | 666 | 1795 | 1451 | 0 | 1.184313 |
6 | Retail | Other | 12126 | 3199 | 6975 | 480 | 3140 | 545 | 0 | 1.130491 |
7 | Retail | Other | 7579 | 4956 | 9426 | 1669 | 3321 | 2566 | 0 | 1.013751 |
8 | Horeca | Other | 5963 | 3648 | 6192 | 425 | 1716 | 750 | 0 | 1.201904 |
9 | Retail | Other | 6006 | 11093 | 1881 | 1159 | 7425 | 2098 | 0 | 1.053333 |
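All ten rows shown above carry the label 0, which is expected when only a small share of the data is flagged. The sketch below illustrates one way to read fraction = 0.05, assuming it acts as a contamination rate, so that roughly the top 5% of rows by score are labeled as outliers. The label_by_fraction helper and the scores are hypothetical, and the exact thresholding rule inside the library may differ (for example, it may interpolate a percentile).

```python
def label_by_fraction(scores, fraction=0.05):
    """Label the highest-scoring `fraction` of rows 1 (outlier), the rest 0 (inlier).

    Hypothetical helper for illustration; not part of PyCaret.
    """
    # Number of rows to flag, with at least one row flagged.
    n_outliers = max(1, round(fraction * len(scores)))
    # Score of the weakest row that still counts as an outlier.
    threshold = sorted(scores, reverse=True)[n_outliers - 1]
    return [1 if s >= threshold else 0 for s in scores]


# Hypothetical scores: 19 ordinary rows near 1.0 and one clearly isolated row.
scores = [1.0 + 0.01 * i for i in range(19)] + [3.2]
labels = label_by_fraction(scores, fraction=0.05)
print(sum(labels), "of", len(labels), "rows flagged")
```

With 20 rows and a fraction of 0.05, a single row is flagged. Raising the fraction lowers the threshold and flags more rows, so the value is best chosen from what is known about the data rather than tuned until the output looks right.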
Two columns containing the anomaly label and score for each instance are added to the dataset. Instances that are flagged as inliers (Anomaly = 0) have an anomaly score close to ...
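Once the Anomaly and Anomaly_Score columns are in place, ordinary pandas operations are enough to summarize and isolate the flagged rows. The following is a minimal sketch that uses a small hypothetical stand-in for data_assigned (the values are illustrative, not taken from the Wholesale Customers dataset) and assumes that a label of 1 marks an outlier and that a higher score means a more anomalous row.

```python
import pandas as pd

# Hypothetical stand-in for data_assigned; values are illustrative only.
data_assigned = pd.DataFrame({
    "Fresh": [12000, 7000, 95000, 6500, 400],
    "Anomaly": [0, 0, 1, 0, 1],
    "Anomaly_Score": [1.05, 1.02, 2.40, 1.10, 1.95],
})

# Count rows per label: 0 = inlier, 1 = outlier.
label_counts = data_assigned["Anomaly"].value_counts()

# Keep only the flagged rows, most anomalous first.
outliers = (
    data_assigned[data_assigned["Anomaly"] == 1]
    .sort_values("Anomaly_Score", ascending=False)
)

print(label_counts)
print(outliers)
```

The same two lines of filtering and sorting apply unchanged to the real data_assigned DataFrame returned by assign_model().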