Case Study

Learn about extending Python's list class with the SamplePartition class, using overloads and TypedDict for data partitioning.

We’ll refine our case study in this chapter. Previously, in the “Objects in Python” chapter, we talked only loosely about loading the training data and splitting it into two clumps: the training set and the testing set. In the “When to Use Object-Oriented Programming” chapter, we looked at ways to deserialize the source file into Sample instances.

In this chapter, we want to look more closely at the operation that uses the raw data to create a number of TrainingKnownSample instances, separate from a number of TestingKnownSample instances. In the previous chapter, we identified four cases for sample objects, shown in the following table:

Sample Cases

              Known          Unknown
Unclassified  Training data  Sample waiting to be classified
Classified    Testing data   Classified sample

Data classification

When looking at the known samples, classified by a botanist, we need to split the data into two separate classes. ...
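
To make the idea of splitting known samples into two classes concrete, here is a minimal sketch. It is not the design the chapter develops. The KnownSample base class, its field names, the partition_samples function, and the "every fifth row goes to testing" rule are all assumptions made for illustration; only the TrainingKnownSample and TestingKnownSample names come from the text.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class KnownSample:
    """A sample whose species was assigned by a botanist.

    The field names are assumptions made for this sketch.
    """

    sepal_length: float
    sepal_width: float
    petal_length: float
    petal_width: float
    species: str


class TrainingKnownSample(KnownSample):
    """A known sample used to train the classifier."""


class TestingKnownSample(KnownSample):
    """A known sample held back to test the classifier."""


def partition_samples(
    rows: Iterable[dict[str, str]], testing_every: int = 5
) -> tuple[list[TrainingKnownSample], list[TestingKnownSample]]:
    """Split raw rows into training and testing samples.

    Hypothetical rule: every ``testing_every``-th row, starting with the
    first, becomes a testing sample; all other rows become training samples.
    """
    training: list[TrainingKnownSample] = []
    testing: list[TestingKnownSample] = []
    for index, row in enumerate(rows):
        # Convert the raw string values into typed fields.
        fields = dict(
            sepal_length=float(row["sepal_length"]),
            sepal_width=float(row["sepal_width"]),
            petal_length=float(row["petal_length"]),
            petal_width=float(row["petal_width"]),
            species=row["species"],
        )
        if index % testing_every == 0:
            testing.append(TestingKnownSample(**fields))
        else:
            training.append(TrainingKnownSample(**fields))
    return training, testing


# Usage with ten made-up rows: rows 0 and 5 land in the testing list.
raw_rows = [
    {
        "sepal_length": "5.1",
        "sepal_width": "3.5",
        "petal_length": "1.4",
        "petal_width": "0.2",
        "species": "setosa",
    }
    for _ in range(10)
]
training, testing = partition_samples(raw_rows)
print(len(training), len(testing))  # 8 2
```

Using two distinct subclasses, rather than a flag on a single class, lets the rest of the application tell training data from testing data by type alone. A fixed "every n-th row" rule is only one possible way to split the data; a randomized split would be an equally reasonable choice.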