Data Splits Using the Slicing API
Use the slicing API from the TF framework to split a given dataset into training, test, and validation sets.
Deep learning (DL) algorithms require large datasets to train models. Once a model is trained, we measure its performance on unseen examples to assess how well it generalizes. To do this, we split the dataset into several partitions. This lesson presents the common dataset partitions and uses TensorFlow Datasets (TFDS) to demonstrate dataset splits with the slicing API.
Common dataset splits
It’s common practice to split a dataset into three partitions for training, validating, and testing a DL model. The following figure presents these three partitions of a full dataset. The greater length of the training set indicates that it contains more examples than either of the other two partitions.
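As a minimal sketch of how these three partitions might be requested through the TFDS slicing API, the helper below builds percentage-based split strings. The 80/10/10 proportions, the helper names make_split_specs and load_partitions, and the "mnist" dataset name are assumptions chosen for illustration; they do not come from this lesson. The sketch also assumes the dataset exposes a single "train" split that is carved into all three partitions.

```python
def make_split_specs(train_pct=80, val_pct=10, source="train"):
    """Build slicing-API strings for the train, validation, and test partitions.

    The test partition receives whatever percentage remains after the
    training and validation partitions are taken.
    """
    if train_pct <= 0 or val_pct < 0 or train_pct + val_pct >= 100:
        raise ValueError("percentages must leave a non-empty test partition")
    val_end = train_pct + val_pct
    return [
        f"{source}[:{train_pct}%]",           # first train_pct percent
        f"{source}[{train_pct}%:{val_end}%]",  # next val_pct percent
        f"{source}[{val_end}%:]",             # remainder
    ]


def load_partitions(name="mnist", **kwargs):
    """Load the three partitions of a TFDS dataset (hypothetical helper)."""
    # Imported lazily so make_split_specs works without TFDS installed.
    import tensorflow_datasets as tfds

    specs = make_split_specs(**kwargs)
    # Passing a list of split strings returns one tf.data.Dataset per string.
    train_ds, val_ds, test_ds = tfds.load(name, split=specs, as_supervised=True)
    return train_ds, val_ds, test_ds


print(make_split_specs())
# ['train[:80%]', 'train[80%:90%]', 'train[90%:]']
```

Calling load_partitions() downloads the dataset on first use, so it is left uncalled here. The slices do not overlap, which keeps validation and test examples out of the training set.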