REGEXP_INSTR function
Searches a string for a regular expression pattern and returns an integer that
indicates the beginning position or ending position of the matched substring. If no
match is found, then the function returns 0
. REGEXP_INSTR is similar to the POSITION function, but lets you search a string for a
regular expression pattern. For more information about regular expressions, see POSIX operators and
Regular expression
Syntax
REGEXP_INSTR( source_string, pattern [, position [, occurrence] [, option [, parameters ] ] ] ] )
Arguments
- source_string
-
A string expression, such as a column name, to be searched.
- pattern
-
A UTF-8 string literal that represents a regular expression pattern. For more information, see POSIX operators.
- position
-
(Optional) A positive
INTEGER
that indicates the position within source_string to begin searching. The position is based on the number of characters, not bytes, so that multibyte characters are counted as single characters. The default is1
. If position is less than1
, the search begins at the first character of source_string. If position is greater than the number of characters in source_string, the result is0
. - occurrence
-
(Optional) A positive
INTEGER
that indicates which occurrence of the pattern to use. REGEXP_INSTR skips the firstoccurrence-1
matches. The default is1
. If occurrence is less than1
or greater than the number of characters in source_string, the search is ignored and the result is0
. - option
-
(Optional) A value that indicates whether to return the position of the first character of the match (
0
) or the position of the first character following the end of the match (1
). A nonzero value is the same as1
. The default value is0
. - parameters
-
(Optional) One or more string literals that indicate how the function matches the pattern. The possible values are the following:
-
c – Perform case-sensitive matching. The default is to use case-sensitive matching.
-
i – Perform case-insensitive matching.
-
e – Extract a substring using a subexpression.
If pattern includes a subexpression, REGEXP_INSTR matches a substring using the first subexpression in pattern. REGEXP_INSTR considers only the first subexpression; additional subexpressions are ignored. If the pattern doesn't have a subexpression, REGEXP_INSTR ignores the 'e' parameter.
-
p – Interpret the pattern with Perl Compatible Regular Expression (PCRE) dialect. For more information about PCRE, see Perl Compatible Regular Expressions
in Wikipedia.
-
Return type
Integer
Examples
The following examples use data from the USERS table in the TICKIT sample database. For more information, see Sample database.
To search for the @
character that begins a
domain name and returns the starting position of the first match, use the following example.
SELECT email, REGEXP_INSTR(email, '@[^.]*') FROM users ORDER BY userid LIMIT 4;
+-----------------------------------------------+--------------+ | email | regexp_instr | +-----------------------------------------------+--------------+ | Etiam.laoreet.libero@sodalesMaurisblandit.edu | 21 | | Suspendisse.tristique@nonnisiAenean.edu | 22 | | amet.faucibus.ut@condimentumegetvolutpat.ca | 17 | | sed@lacusUtnec.ca | 4 | +-----------------------------------------------+--------------+
To search for variants of the word Center
and
returns the starting position of the first match, use the following example.
SELECT venuename, REGEXP_INSTR(venuename,'[cC]ent(er|re)$') FROM venue WHERE REGEXP_INSTR(venuename,'[cC]ent(er|re)$') > 0 ORDER BY venueid LIMIT 4;
+-----------------------+--------------+ | venuename | regexp_instr | +-----------------------+--------------+ | The Home Depot Center | 16 | | Izod Center | 6 | | Wachovia Center | 10 | | Air Canada Centre | 12 | +-----------------------+--------------+
To find the starting position of the first occurrence of the
string FOX
, using case-insensitive matching logic, use the following example.
SELECT REGEXP_INSTR('the fox', 'FOX', 1, 1, 0, 'i');
+--------------+ | regexp_instr | +--------------+ | 5 | +--------------+
To use a pattern written in PCRE dialect to locate words
containing at least one number and one lowercase letter, use the following example. It uses the ?=
operator,
which has a specific look-ahead connotation in PCRE. This example finds the
starting position of the second such word.
SELECT REGEXP_INSTR('passwd7 plain A1234 a1234', '(?=[^ ]*[a-z])(?=[^ ]*[0-9])[^ ]+', 1, 2, 0, 'p');
+--------------+ | regexp_instr | +--------------+ | 21 | +--------------+
To use a pattern written in PCRE dialect to locate words
containing at least one number and one lowercase letter, use the following example. It uses
the ?=
operator, which has a specific look-ahead connotation in
PCRE. This example finds the starting position of the second such word, but differs from the previous example in that it uses case-insensitive matching.
SELECT REGEXP_INSTR('passwd7 plain A1234 a1234', '(?=[^ ]*[a-z])(?=[^ ]*[0-9])[^ ]+', 1, 2, 0, 'ip');
+--------------+ | regexp_instr | +--------------+ | 15 | +--------------+