Case Study
Learn how to optimize k-nearest neighbors classification using Strategy and Command patterns for varied distance computations and performance tuning.
Designing distance computations
We talked about the various ways to compute distances, but left part of the design to be filled in later. Now that we’ve seen some of the basic design patterns, we can apply some of them to our evolving case study.
Specifically, we need to put the various kinds of distance computations into the `Hyperparameter` class definition. In previous lessons, we introduced the idea that the distance computation is not a single definition. There are many commonly used distance computation alternatives, some simple, some rather complex. Earlier, we showed a few common ones, including Euclidean distance, Manhattan distance, Chebyshev distance, and even the complex-looking Sorensen distance. Each weights the nearness of the neighbors slightly differently.
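As a quick sketch, assuming feature vectors are represented as lists of floats, the four distances mentioned above might look like this (the function names and signatures are illustrative, not the case study's actual definitions):

```python
import math


def euclidean(a: list[float], b: list[float]) -> float:
    # Straight-line distance: sqrt of the sum of squared differences.
    return math.hypot(*(x - y for x, y in zip(a, b)))


def manhattan(a: list[float], b: list[float]) -> float:
    # Sum of absolute differences along each dimension.
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a: list[float], b: list[float]) -> float:
    # The single largest difference along any dimension.
    return max(abs(x - y) for x, y in zip(a, b))


def sorensen(a: list[float], b: list[float]) -> float:
    # Manhattan distance normalized by the sum of all coordinate values.
    return sum(abs(x - y) for x, y in zip(a, b)) / sum(x + y for x, y in zip(a, b))
```

All four share one signature, which is what lets us treat them as interchangeable plug-ins later on.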
This leads us to look at the `Hyperparameter` class as containing three important components:
- A reference to the base `TrainingData`. This is used to find all of the neighbors, from which the nearest are selected.
- The value of *k*, used to determine how many neighbors will be checked.
- The distance algorithm. We’d like to be able to plug in any algorithm here. Our research revealed a large number of competing choices. This suggests that implementing one or two won’t be very adaptable to real-world demands.
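Putting the three components together, a minimal sketch of `Hyperparameter` might look like the following. The `Sample` and `TrainingData` stand-ins are simplified placeholders assumed for this example, not the case study's actual class definitions:

```python
from dataclasses import dataclass
from typing import Callable


# Simplified stand-ins for the case study's data classes (assumed shapes).
@dataclass
class Sample:
    features: list[float]
    label: str


@dataclass
class TrainingData:
    samples: list[Sample]


# Any callable taking two feature vectors and returning a distance.
DistanceFunc = Callable[[list[float], list[float]], float]


def manhattan(a: list[float], b: list[float]) -> float:
    return sum(abs(x - y) for x, y in zip(a, b))


@dataclass
class Hyperparameter:
    training: TrainingData   # reference to the base TrainingData
    k: int                   # how many neighbors will be checked
    distance: DistanceFunc   # pluggable distance Strategy

    def classify(self, unknown: list[float]) -> str:
        # Rank every training sample by distance, then vote among the k nearest.
        ranked = sorted(
            self.training.samples,
            key=lambda s: self.distance(s.features, unknown),
        )
        votes = [s.label for s in ranked[: self.k]]
        return max(set(votes), key=votes.count)
```

Because the distance is just an attribute, swapping `manhattan` for any other function with the same signature changes the behavior without touching `Hyperparameter` itself; that substitutability is exactly what the Strategy pattern provides.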
Plugging in the distance algorithm is a good application of the Strategy design pattern. For a given ...