Samples and Their States

Learn about the states of the Sample class.

Why do we need to track the state?

The diagram in the previous lesson shows the Sample class and an extension, the KnownSample class. This doesn’t seem to be a complete decomposition of the various kinds of samples. When we review the user stories and the process views, there seems to be a gap: specifically, the “make classification request” by a User requires an unknown sample. This has the same flower measurement attributes as a Sample, but doesn’t have the assigned species attribute of a KnownSample. Further, there’s no state change that adds an attribute value. A Botanist will never formally classify an unknown sample; our algorithm will classify it, but it’s only an AI, not a Botanist.

Subclasses

We can make a case for two distinct subclasses of Sample:

  • UnknownSample: This class contains the initial four Sample attributes. A User provides these objects to get them classified.

  • KnownSample: This class has the Sample attributes plus the classification result, a species name. We use these for training and testing the model.
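The two-subclass design above can be sketched roughly as follows. This is only an illustration: the four measurement attribute names (sepal and petal length and width) are assumptions, since the lesson only says there are four flower measurements.

```python
class Sample:
    """Base class: the four flower measurements shared by every sample."""

    def __init__(
        self,
        sepal_length: float,
        sepal_width: float,
        petal_length: float,
        petal_width: float,
    ) -> None:
        # Attribute names are assumed for illustration.
        self.sepal_length = sepal_length
        self.sepal_width = sepal_width
        self.petal_length = petal_length
        self.petal_width = petal_width


class UnknownSample(Sample):
    """A sample a User submits for classification; it has no species."""


class KnownSample(Sample):
    """A sample with an assigned species, used for training and testing."""

    def __init__(
        self,
        sepal_length: float,
        sepal_width: float,
        petal_length: float,
        petal_width: float,
        species: str,
    ) -> None:
        super().__init__(sepal_length, sepal_width, petal_length, petal_width)
        self.species = species
```

Here UnknownSample adds nothing to Sample; it exists only to give the "no species yet" case its own name.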

Class state concerns

Generally, we consider class definitions as a way to encapsulate state and behavior. An UnknownSample instance provided by a User starts out with no species. Then, after the classifier algorithm computes a species, the Sample changes state to have a species assigned by the algorithm. A question we must always ask about class definitions is this:

  1. Is there any change in behavior that goes with the change in state?

In this case, it doesn’t seem like there’s anything new or different that can happen. Perhaps this can be implemented as a single class with some optional attributes.
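A minimal sketch of the single-class alternative might look like this. The measurement names, the `classification` attribute, and the `classify()` method are all hypothetical names chosen for illustration; the lesson does not fix them.

```python
from typing import Optional


class Sample:
    """One class for every kind of sample; optional attributes record its state."""

    def __init__(
        self,
        sepal_length: float,
        sepal_width: float,
        petal_length: float,
        petal_width: float,
        species: Optional[str] = None,
    ) -> None:
        # Measurement names are assumed for illustration.
        self.sepal_length = sepal_length
        self.sepal_width = sepal_width
        self.petal_length = petal_length
        self.petal_width = petal_width
        # None for a sample provided by a User; a name for a known sample.
        self.species = species
        # Hypothetical attribute: filled in only by the classifier algorithm.
        self.classification: Optional[str] = None

    def classify(self, species_name: str) -> None:
        """Record the algorithm's result without touching the assigned species."""
        self.classification = species_name
```

Keeping `species` and `classification` separate preserves the distinction between a Botanist's assignment and the algorithm's result, even though both live in one class.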

We have another possible state change concern. Currently, there’s no class that owns the responsibility of partitioning Sample objects into the training or testing subsets. This, too, is a kind of state change. This leads to a second important question:

  2. What class has the responsibility for making this state change?

In this case, it seems like the TrainingData class should own the discrimination between testing and training data.
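One way this responsibility could look in code is sketched below. The `load()` method, the `training` and `testing` attribute names, and the rule of sending every fifth sample to the testing subset are assumptions made for the example, not part of the design described so far.

```python
from typing import Iterable, List


class TrainingData:
    """Owns the split of samples into training and testing subsets."""

    def __init__(self) -> None:
        self.training: List[object] = []
        self.testing: List[object] = []

    def load(self, samples: Iterable[object], test_every: int = 5) -> None:
        """Partition the samples; every test_every-th one goes to testing."""
        for n, sample in enumerate(samples):
            if n % test_every == 0:
                self.testing.append(sample)
            else:
                self.training.append(sample)
```

Because the split happens inside TrainingData, a sample never needs to know which subset it belongs to; the collection that holds it carries that state.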

One way to help look closely at our class design is to enumerate all of the various states of individual samples. This technique helps uncover a need for attributes in the classes. It also helps to identify the methods to make state changes to objects of a class.
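As a rough illustration of this technique, the states mentioned in this lesson can be written down as an enumeration. The `SampleState` name, its members, and the transition table are hypothetical; only the unknown-to-classified change is described in the text, so it is the only transition listed.

```python
from enum import Enum, auto


class SampleState(Enum):
    """States of an individual sample, as named in this lesson."""

    TRAINING = auto()    # known sample placed in the training subset
    TESTING = auto()     # known sample placed in the testing subset
    UNKNOWN = auto()     # provided by a User, no species yet
    CLASSIFIED = auto()  # species computed by the classifier algorithm


# Only the transition described in the text; others would be added
# as the design develops.
ALLOWED_TRANSITIONS = {
    SampleState.UNKNOWN: {SampleState.CLASSIFIED},
}


def can_change(current: SampleState, new: SampleState) -> bool:
    """Return True if the sketch allows moving from current to new."""
    return new in ALLOWED_TRANSITIONS.get(current, set())
```

Writing the states out this way makes the gaps visible: each member suggests an attribute the class needs, and each allowed transition suggests a method that performs it.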
