Reading a Text File Word by Word

Let’s learn how to read a text file one word at a time.

Tokenization using regular expressions

Reading a plain text file word by word is one of the most useful operations to perform on a file, because we usually want to process text on a per-word basis. This lesson illustrates it using the code found in byWord.go. The desired functionality is implemented in the wordByWord() function, which uses a regular expression to separate the words found in each line of the input file. The regular expression defined in the regexp.MustCompile("[^\\s]+") statement matches runs of one or more non-whitespace characters, which means that whitespace characters separate one word from another.

Coding example

The implementation of the wordByWord() function is as follows:
