Menu
Amazon Redshift
Database Developer Guide (API Version 2012-12-01)

POSIX Operators

POSIX regular expressions provide a more powerful means for pattern matching than the LIKE and SIMILAR TO operators. POSIX regular expression patterns can match any portion of a string, unlike the SIMILAR TO operator, which returns true only if its pattern matches the entire string.

Note

Regular expression matching using POSIX operators is computationally expensive. We recommend using LIKE whenever possible, especially when processing a very large number of rows. For example, the following queries are functionally identical, but the query that uses LIKE executes several times faster than the query that uses a regular expression:

Copy
select count(*) from event where eventname ~ '.*(Ring|Die).* '; select count(*) from event where eventname LIKE '%Ring%' OR eventname LIKE '%Die%';

Syntax

Copy
expression [ ! ] ~ pattern

Arguments

expression

A valid UTF-8 character expression, such as a column name.

!

Negation operator.

~

Perform a case-sensitive match for any substring of expression.

pattern

A string literal that represents a SQL standard regular expression pattern.

If pattern does not contain wildcard characters, then the pattern only represents the string itself.

To search for strings that include metacharacters, such as ‘. * | ? ‘, and so on, escape the character using two backslashes (' \\ '). Unlike SIMILAR TO and LIKE, POSIX regular expression syntax does not support a user-defined escape character.

Either of the character expressions can be CHAR or VARCHAR data types. If they differ, Amazon Redshift converts pattern to the data type of expression.

All of the character expressions can be CHAR or VARCHAR data types. If the expressions differ in data type, Amazon Redshift converts them to the data type of expression.

POSIX pattern matching supports the following metacharacters:

POSIX Description
. Matches any single character.
* Matches zero or more occurrences.
+ Matches one or more occurrences.
? Matches zero or one occurrence.
| Specifies alternative matches; for example, E | H means E or H.
^ Matches the beginning-of-line character.
$ Matches the end-of-line character.
$ Matches the end of the string.
[ ] Brackets specify a matching list, that should match one expression in the list. A caret (^) precedes a nonmatching list, which matches any character except for the expressions represented in the list.
( ) Parentheses group items into a single logical item.
{m} Repeat the previous item exactly m times.
{m,} Repeat the previous item m or more times.
{m,n} Repeat the previous item at least m and not more than n times.
[: :] Matches any character within a POSIX character class. In the following character classes, Amazon Redshift supports only ASCII characters: [:alnum:], [:alpha:], [:lower:], [:upper:]

Amazon Redshift supports the following POSIX character classes.

Character Class Description
[[:alnum:]] All ASCII alphanumeric characters
[[:alpha:]] All ASCII alphabetic characters
[[:blank:]] All blank space characters
[[:cntrl:]] All control characters (nonprinting)
[[:digit:]] All numeric digits
[[:lower:]] All lowercase ASCII alphabetic characters
[[:punct:]] All punctuation characters
[[:space:]] All space characters (nonprinting)
[[:upper:]] All uppercase ASCII alphabetic characters
[[:xdigit:]] All valid hexadecimal characters

Amazon Redshift supports the following Perl-influenced operators in regular expressions. Escape the operator using two backslashes (‘\\’).  

Operator Description Equivalent character class expression
\\d A digit character [[:digit:]]
\\D A nondigit character [^[:digit:]]
\\w A word character [[:word:]]
\\W A nonword character [^[:word:]]
\\s A white space character [[:space:]]
\\S A non–white space character [^[:space:]]

Examples

The following table shows examples of pattern matching using POSIX operators:

Expression Returns
'abc' ~ 'abc' True
'abc' ~ 'a' True
'abc' ~ 'A' False
'abc' ~ '.*(b|d).*' True
'abc' ~ '(b|c).*' True
'AbcAbcdefgefg12efgefg12' ~ '((Ab)?c)+d((efg)+(12))+' True
'aaaaaab11111xy' ~ 'a{6}.[1]{5}(x|y){2}' True
'$0.87' ~ '\\$[0-9]+(\\.[0-9][0-9])?' True
'ab c' ~ '[[:space:]]' True
'ab c' ~ '\\s' True
' ' ~ '\\S' False

The following example finds all cities whose names contain E or H:

Copy
select distinct city from users where city ~ '.*E.*|.*H.*' order by city; city ----------------------- Agoura Hills Auburn Hills Benton Harbor Beverly Hills Chicago Heights Chino Hills Citrus Heights East Hartford

The following example uses the escape string ('\\') to search for strings that include a period.

Copy
select venuename from venue where venuename ~ '.*\\..*'; venuename ----------------------------- Bernard B. Jacobs Theatre E.J. Nutter Center Hubert H. Humphrey Metrodome Jobing.com Arena St. James Theatre St. Pete Times Forum Superpages.com Center U.S. Cellular Field

On this page: