Strings and Character Lists

Learn how single-quoted strings work differently in Elixir.


Before we get further into this, we need to explain something. In most other languages, we’d call both 'cat' and "cat" strings. And that’s what we’ve been doing so far. But Elixir has a different convention.

In Elixir, the convention is that we call a value a string only if it’s double-quoted, like "cat". The single-quoted form, like 'cat', is a character list.

This is important. The single- and double-quoted forms are very different, and libraries that work on strings work only on the double-quoted form.
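As a rough illustration of that difference, the sketch below checks the type of each form and converts between them. It assumes only the standard `String` and `List` modules; the variable names are made up for the example.

```elixir
string   = "cat"   # double-quoted: a binary, which Elixir calls a string
charlist = 'cat'   # single-quoted: a list of integers

IO.inspect(is_binary(string))     # true
IO.inspect(is_list(charlist))     # true
IO.inspect(is_binary(charlist))   # false

# The two forms never compare equal, even with the same characters.
IO.inspect(string == charlist)    # false

# String functions expect the double-quoted form, so convert first.
IO.inspect(String.upcase(List.to_string(charlist)))   # "CAT"

# Converting in the other direction yields the character list again.
IO.inspect(String.to_charlist(string) == charlist)    # true
```

Passing a character list directly to a function such as `String.upcase` raises an error rather than converting it for us.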

Let’s explore the differences in more detail.

Single-quoted strings

Single-quoted strings are represented as a list of integer values, with each value corresponding to a codepoint in the string. For this reason, we refer to them as character lists.

iex> str = 'wombat' 
'wombat'

iex> is_list str 
true

iex> length str
6

iex> Enum.reverse str 
'tabmow'

This is confusing: IEx says it’s a list, but it shows the value as a string. That’s because IEx prints a list of integers as a string if it believes each number in the list is a printable character. We can try this for ourselves:

iex> [ 67, 65, 84 ] 
'CAT'
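One way to convince ourselves that the quoted form and the list of integers are the same value is to compare them directly. This is a small sketch, not part of the original session. It compares values instead of relying on printed output, because recent Elixir releases display character lists with the `~c` sigil (for example `~c"CAT"`) rather than single quotes.

```elixir
codes = [67, 65, 84]

# The single-quoted literal and the list of integers are the same value.
IO.inspect(codes == 'CAT')           # true

# ?C is Elixir syntax for the integer code of the character C.
IO.inspect(?C)                       # 67
IO.inspect([?C, ?A, ?T] == codes)    # true
```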

We can look at the internal representation in a number of ways:

iex> str = 'wombat'
'wombat'

iex> :io.format "~w~n", [ str ]
[119,111,109,98,97,116]
:ok

iex> List.to_tuple str
{119, 111, 109, 98, 97, 116} 

iex> str ++ [0]
[119, 111, 109, 98, 97, 116, 0]

The ~w in the format string forces str to be written as an Erlang term—the underlying list of integers. The ~n is a newline.

The last example creates a new character list with a zero (the null character) at the end. IEx no longer thinks every element is a printable character, so it returns the underlying character codes.

If a character list contains characters Erlang considers nonprintable, we’ll see the list representation.

iex> '∂x/∂y'
[8706, 120, 47, 8706, 121]
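We can approximate this decision ourselves. The sketch below assumes that `List.ascii_printable?/1` is a fair stand-in for the check IEx performs; the exact rule used for display may differ. It also uses the `charlists: :as_lists` option of `inspect/2` to request the integer form regardless of content.

```elixir
plain    = 'wombat'
with_nul = plain ++ [0]
math     = '∂x/∂y'

IO.inspect(List.ascii_printable?(plain))      # true
IO.inspect(List.ascii_printable?(with_nul))   # false: 0 is not printable
IO.inspect(List.ascii_printable?(math))       # false: 8706 is outside ASCII

# Ask inspect/2 to skip the guess and always show the integers.
as_list = inspect(plain, charlists: :as_lists)
IO.inspect(as_list == "[119, 111, 109, 98, 97, 116]")   # true
```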

Because a character list is a list, we can use the usual pattern matching and list functions.

iex> 'pole' ++ 'vault' 
'polevault'

iex> 'pole' -- 'vault' 
'poe'

iex> List.zip [ 'abc', '123' ] 
[{97, 49}, {98, 50}, {99, 51}] 

iex> [ head | tail ] = 'cat' 
'cat'

iex> head 
99

iex> tail 
'at'

iex> [ head | tail ] 
'cat'

Why is the head of 'cat' 99 and not c? Remember that a character list is just a list of integer character codes, so each individual entry is a number. It happens that 99 is the code for a lowercase c.
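If we want the character back rather than its code, we can convert explicitly. The following is a minimal sketch using the `?c` syntax, a `utf8` binary segment, and `List.to_string/1`; it is one of several possible approaches.

```elixir
[head | tail] = 'cat'

# ?c is the integer code for lowercase c, so it matches the head.
IO.inspect(head)                              # 99
IO.inspect(head == ?c)                        # true

# Encode the single codepoint as a UTF-8 string.
IO.inspect(<<head::utf8>> == "c")             # true

# Or wrap it in a list and convert the list to a string.
IO.inspect(List.to_string([head]) == "c")     # true

# The tail is still a character list, so it converts the same way.
IO.inspect(List.to_string(tail) == "at")      # true
```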
