Character and Text
Learn about text processing, pattern matching, and regular expressions in PostgreSQL.
PostgreSQL knows how to deal with characters and text, and it implements several data types for that, all documented in the character types chapter of the documentation.
About the data type itself, it must be noted that text
and varchar
are the same things as far as PostgreSQL is concerned, and character varying is an alias for varchar
. When using varchar(15)
, we’re basically telling PostgreSQL to manage a text
column with a check
constraint of 15 characters.
Yes, PostgreSQL knows how to count characters even with Unicode encoding.
Text processing in PostgreSQL
There’s a very rich set of PostgreSQL functions to process the text—we can find them all in the String Functions and
Operators documentation chapter—with functions such as overlay()
, substring()
, position()
, or trim()
. Or aggregates such as string_agg()
. There are also regular expression functions, including the very powerful regexp_split_to_table()
.
For more about PostgreSQL regular expressions, read the main documentation about them in the chapter on pattern matching.
Pattern matching
Additionally, to the classic like
and ilike
patterns and to the SQL standard similar to
operators, PostgreSQL embeds support for a full-blown regular expression matching engine. The main operator implementing REGEXP is ~
, and then we find the derivatives for not matching and match either case. In total, we have four operators: ~
, !~
, ~*
, and !~*
.
Regular expressions
The regular expression split functions are powerful in many use cases. In particular, they’re very helpful when we have to work with a messy schema in which a ...