...

/

Character and Text

Character and Text

Learn about text processing, pattern matching, and regular expressions in PostgreSQL.

We'll cover the following...

PostgreSQL knows how to deal with characters and text, and it implements several data types for that, all documented in the character types chapter of the documentation.

About the data type itself, it must be noted that text and varchar are the same things as far as PostgreSQL is concerned, and character varying is an alias for varchar. When using varchar(15), we’re basically telling PostgreSQL to manage a text column with a check constraint of 15 characters.

Yes, PostgreSQL knows how to count characters even with Unicode encoding.

Text processing in PostgreSQL

There’s a very rich set of PostgreSQL functions to process the text—we can find them all in the String Functions and Operators documentation chapter—with functions such as overlay(), substring(), position(), or trim(). Or aggregates such as string_agg(). There are also regular expression functions, including the very powerful regexp_split_to_table().

For more about PostgreSQL regular expressions, read the main documentation about them in the chapter on pattern matching.

Pattern matching

Additionally, to the classic like and ilike patterns and to the SQL standard similar to operators, PostgreSQL embeds support for a full-blown regular expression matching engine. The main operator implementing REGEXP is ~, and then we find the derivatives for not matching and match either case. In total, we have four operators: ~, !~, ~*, and !~*.

Regular expressions

The regular expression split functions are powerful in many use cases. In particular, they’re very helpful when we have to work with a messy schema ...