Exercise: Counting Unicode Characters
Test your coding skills on counting Unicode characters.
Challenge
In this exercise, your challenge is to expand the count function to count Unicode characters.
Problem statement
In addition to counting lines and words, your tool should also be able to count the number of Unicode characters in the input.
Computers encode text using different standards. Text encoded in ASCII uses one byte per character, so counting the bytes is usually enough to count the characters. This works for languages that ASCII can represent, such as English. It's not enough for text that needs a Unicode encoding such as UTF-8, for example Japanese, because a single character might take more than one byte.
Go provides the rune data type to represent Unicode characters (or code points). Expand the program to count runes in addition to words and lines.
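To make the difference between bytes and runes concrete, here is a minimal sketch that compares the two counts for an English and a Japanese string. The helper name byteAndRuneCounts is hypothetical and is not part of the chapter's program; len and utf8.RuneCountInString come from Go's standard library.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteAndRuneCounts is a hypothetical helper for illustration only.
// It returns the number of bytes and the number of runes (code points)
// in s, assuming s holds UTF-8 text, as Go strings normally do.
func byteAndRuneCounts(s string) (bytes, runes int) {
	// len on a string counts bytes, not characters.
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	for _, s := range []string{"hello", "こんにちは"} {
		b, r := byteAndRuneCounts(s)
		fmt.Printf("%q: %d bytes, %d runes\n", s, b, r)
	}
}
```

Both strings contain five runes, but the Japanese string occupies 15 bytes in UTF-8, so counting bytes would overstate its length.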
Coding challenge
Take some time to figure out the smartest way to solve this problem. Start from the implementation of the count function at the end of this chapter. Expand the count function to receive a new boolean parameter named countRunes. If this parameter is set to true, the function should return the number of runes in the provided input text.
If you feel stuck, refer to Go’s documentation for runes or for the bufio package. If you still need help, check the solution review in the next lesson. Good luck!
Note: If you’re looking for an extra challenge, write a function for counting runes in addition to Unicode characters.