Project Creation: Part Two
In this lesson, we will discuss sampling and build our Markov model.
Load the dataset
Now’s the time to work with our real corpus. Click the download button below to get the dataset. This dataset contains the speech of the Honorable Prime Minister of India in English.
text_path = "train_corpus.txt"

def load_text(filename):
    with open(filename, encoding='utf8') as f:
        return f.read().lower()

text = load_text(text_path)
print('Loaded the dataset.')
Understand sampling
Before moving forward, we need to address one more important concept: sampling. In simple terms, sampling means selecting items from a larger collection, usually at random, for analysis. Let’s understand sampling with the help of an example. Run the ...
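As a rough illustration of the idea, here is a minimal sketch of weighted random sampling using only Python's standard library. It is not the lesson's own example: the helper name `sample_next`, the three options, and their weights are made up for illustration, and `random.choices` is just one of several ways to draw an item according to given probabilities.

```python
import random
from collections import Counter


def sample_next(options, weights, rng):
    """Draw one item from options, with probability proportional to weights."""
    # random.choices returns a list of k items; we ask for one and unwrap it.
    return rng.choices(options, weights=weights, k=1)[0]


# Hypothetical toy distribution (an assumption, not taken from the corpus).
options = ["a", "b", "c"]
weights = [0.7, 0.2, 0.1]

# A seeded generator keeps the draws reproducible between runs.
rng = random.Random(0)

# Draw many samples and count how often each option appears.
draws = [sample_next(options, weights, rng) for _ in range(1000)]
counts = Counter(draws)
print(counts)
```

Over many draws, the counts should roughly follow the weights: the option with the largest weight is picked most often, while the less likely options still appear occasionally.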