Characters: Introduction

In this lesson, we cover a brief history of characters and the different Unicode encodings used to represent them.

Characters #

Characters are the building blocks of strings. Any symbol of a writing system is called a character: letters of alphabets, numerals, punctuation marks, the space character, etc. Confusingly, the building blocks of characters themselves are called characters as well.

Arrays of characters make up strings. We have seen arrays earlier in this chapter; strings will be covered later.

Like any other data, characters are represented as integer values that are made up of bits. For example, the integer value of the lowercase letter ‘a’ is 97, and the integer value of the numeral ‘1’ is 49. These values are merely conventions that were assigned when the ASCII standard was designed.
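We can observe these integer values directly by casting a character to `int`. The following is a minimal sketch:

```d
import std.stdio;

void main() {
    // A char is stored as an integer value; casting reveals it.
    char letterA = 'a';
    char numeralOne = '1';

    writeln(cast(int)letterA);    // prints 97
    writeln(cast(int)numeralOne); // prints 49
}
```

Casting in the other direction, e.g. `cast(char)97`, produces the character ‘a’ again, which shows that the mapping is just a convention between integers and symbols.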

Character representation #

In many programming languages, characters are represented by the char type, which can hold only 256 distinct values. If you are familiar with the char type from other languages, you may already know that it is not large enough to support the symbols of many writing systems. Before getting into the three distinct character types of D, let’s first take a look at the history of characters in computer systems.

History #

ASCII table #

The ASCII table was designed at a time when computer hardware was very limited compared to modern systems. Because it is based on 7 bits, the ASCII table can have 128 distinct code values. That many distinct values are sufficient to represent the lowercase and uppercase versions of the 26 letters of the basic Latin alphabet, the numerals, commonly used punctuation marks, and some terminal control characters.

As an example, the ASCII codes of the characters of the string “hello” are the following:
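The codes above can be printed by iterating over the string and casting each character to `int`, as in this short sketch:

```d
import std.stdio;

void main() {
    // Print each character of "hello" alongside its ASCII code.
    foreach (c; "hello") {
        writefln("%s: %s", c, cast(int)c);
    }
}
```

The output lists 104, 101, 108, 108, and 111 for ‘h’, ‘e’, ‘l’, ‘l’, and ‘o’, respectively.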