REGEXP_INSTR function - Amazon Redshift

REGEXP_INSTR function

Searches a string for a regular expression pattern and returns an integer that indicates the beginning position or ending position of the matched substring. If no match is found, then the function returns 0. REGEXP_INSTR is similar to the POSITION function, but lets you search a string for a regular expression pattern. For more information about regular expressions, see POSIX operators and Regular expression in Wikipedia.

Syntax


REGEXP_INSTR( source_string, pattern [, position [, occurrence] [, option [, parameters ] ] ] ] )

Arguments

source_string

A string expression, such as a column name, to be searched.

pattern

A UTF-8 string literal that represents a regular expression pattern. For more information, see POSIX operators.

position

(Optional) A positive INTEGER that indicates the position within source_string to begin searching. The position is based on the number of characters, not bytes, so that multibyte characters are counted as single characters. The default is 1. If position is less than 1, the search begins at the first character of source_string. If position is greater than the number of characters in source_string, the result is 0.

occurrence

(Optional) A positive INTEGER that indicates which occurrence of the pattern to use. REGEXP_INSTR skips the first occurrence-1 matches. The default is 1. If occurrence is less than 1 or greater than the number of characters in source_string, the search is ignored and the result is 0.

option

(Optional) A value that indicates whether to return the position of the first character of the match (0) or the position of the first character following the end of the match (1). A nonzero value is the same as 1. The default value is 0.

parameters

(Optional) One or more string literals that indicate how the function matches the pattern. The possible values are the following:

c – Perform case-sensitive matching. The default is to use case-sensitive matching.
i – Perform case-insensitive matching.
e – Extract a substring using a subexpression.

If pattern includes a subexpression, REGEXP_INSTR matches a substring using the first subexpression in pattern. REGEXP_INSTR considers only the first subexpression; additional subexpressions are ignored. If the pattern doesn't have a subexpression, REGEXP_INSTR ignores the 'e' parameter.
p – Interpret the pattern with Perl Compatible Regular Expression (PCRE) dialect. For more information about PCRE, see Perl Compatible Regular Expressions in Wikipedia.

Return type

Integer

Examples

The following examples use data from the USERS table in the TICKIT sample database. For more information, see Sample database.

To search for the @ character that begins a domain name and returns the starting position of the first match, use the following example.


SELECT email, REGEXP_INSTR(email, '@[^.]*')
FROM users
ORDER BY userid LIMIT 4;

+-----------------------------------------------+--------------+
|                     email                     | regexp_instr |
+-----------------------------------------------+--------------+
| Etiam.laoreet.libero@sodalesMaurisblandit.edu |           21 |
| Suspendisse.tristique@nonnisiAenean.edu       |           22 |
| amet.faucibus.ut@condimentumegetvolutpat.ca   |           17 |
| sed@lacusUtnec.ca                             |            4 |
+-----------------------------------------------+--------------+

To search for variants of the word Center and returns the starting position of the first match, use the following example.


SELECT venuename, REGEXP_INSTR(venuename,'[cC]ent(er|re)$')
FROM venue
WHERE REGEXP_INSTR(venuename,'[cC]ent(er|re)$') > 0
ORDER BY venueid LIMIT 4;

+-----------------------+--------------+
|       venuename       | regexp_instr |
+-----------------------+--------------+
| The Home Depot Center |           16 |
| Izod Center           |            6 |
| Wachovia Center       |           10 |
| Air Canada Centre     |           12 |
+-----------------------+--------------+

To find the starting position of the first occurrence of the string FOX, using case-insensitive matching logic, use the following example.


SELECT REGEXP_INSTR('the fox', 'FOX', 1, 1, 0, 'i');

+--------------+
| regexp_instr |
+--------------+
|            5 |
+--------------+

To use a pattern written in PCRE dialect to locate words containing at least one number and one lowercase letter, use the following example. It uses the ?= operator, which has a specific look-ahead connotation in PCRE. This example finds the starting position of the second such word.


SELECT REGEXP_INSTR('passwd7 plain A1234 a1234', '(?=[^ ]*[a-z])(?=[^ ]*[0-9])[^ ]+', 1, 2, 0, 'p');

+--------------+
| regexp_instr |
+--------------+
|           21 |
+--------------+


SELECT REGEXP_INSTR('passwd7 plain A1234 a1234', '(?=[^ ]*[a-z])(?=[^ ]*[0-9])[^ ]+', 1, 2, 0, 'ip');

+--------------+
| regexp_instr |
+--------------+
|           15 |
+--------------+

Warning Javascript is disabled or is unavailable in your browser.

To use the Amazon Web Services Documentation, Javascript must be enabled. Please refer to your browser's Help pages for instructions.

Document Conventions

REGEXP_COUNT

REGEXP_REPLACE

Select your cookie preferences

Customize cookie preferences

Essential

Performance

Functional

Advertising

Unable to save cookie preferences

REGEXP_INSTR function

Syntax

Arguments

Return type

Examples

Did this page help you?

Next topic:

Previous topic:

Need help?