Characters: Introduction
In this lesson, we will look at a brief history of characters and the different Unicode encodings used to represent them.
Characters #
Characters are the building blocks of strings. Any symbol of a writing system is called a character: letters of alphabets, numerals, punctuation marks, the space character, etc. Confusingly, the building blocks of characters are themselves called characters as well.
Arrays of characters make up strings. We have seen arrays earlier in this chapter; strings will be covered later in this chapter.
Like any other data, characters are also represented as integer values that are made up of bits. For example, the integer value of the lowercase ‘a’ is 97 and the integer value of the numeral ‘1’ is 49. These values are merely conventions assigned when the ASCII standard was designed.
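The mapping between characters and integers can be checked directly. The following is a minimal sketch in Python, chosen here only for illustration and not necessarily the language this lesson goes on to use. It relies on Python's built-in ord() and chr() functions; the variable names are hypothetical.

```python
# Look up the integer value that represents each character.
code_a = ord('a')   # lowercase letter a
code_1 = ord('1')   # the numeral 1, not the number one

print(code_a, code_1)   # prints "97 49"

# The conversion also works in the other direction.
print(chr(97), chr(49))   # prints "a 1"

# The integer itself is stored as bits; here it is shown as 8 binary digits.
bits_a = format(code_a, '08b')
print(bits_a)   # prints "01100001"
```

Note that the numeral ‘1’ has the value 49 rather than 1: the character and the number it depicts are different things.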
Character representation #
In many programming languages, characters are represented by the char type, which can hold only 256 distinct values. If you are familiar with the char type from other languages, you may already know that it is not large enough to support the symbols of many writing systems. Before getting into the three distinct character types of ...