Understanding the Data
Learn about the data and datasets.
About the dataset
First, we need to understand what the dataset looks like so that, when we see the generated text, we can assess whether it makes sense given the training data. We’ll download the first 100 books from “Grimms’ Fairy Tales,” a collection of stories by the Brothers Grimm translated from German to English.
Initially, we’ll download all 209 books from the website with an automated script as follows:
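A minimal sketch of such a download script might look like the following. The base URL, the zero-padded filename pattern (`001.txt`, `002.txt`, and so on), and the function names are all assumptions made for illustration, not the actual location or layout of the dataset. The sketch skips files that already exist locally so that rerunning it doesn't fetch everything again.

```python
import os
import urllib.request


def book_filenames(n_books):
    """Return hypothetical zero-padded filenames such as '001.txt'."""
    return [f"{i:03d}.txt" for i in range(1, n_books + 1)]


def fetch_bytes(url):
    """Download a URL and return its raw bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def download_books(base_url, n_books, out_dir, fetch=fetch_bytes):
    """Download n_books files into out_dir, skipping ones already present.

    base_url and the filename pattern are assumptions; adjust them to
    match wherever the books are actually hosted.
    """
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for name in book_filenames(n_books):
        path = os.path.join(out_dir, name)
        if not os.path.exists(path):
            data = fetch(base_url + name)
            with open(path, "wb") as f:
                f.write(data)
        paths.append(path)
    return paths
```

Under these assumptions, calling `download_books("https://example.com/grimm/", 209, "data")` would save the files as `data/001.txt` through `data/209.txt`. The `fetch` parameter is there so the network call can be swapped out, for example with a stub during testing.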