You are viewing an older version of this section. View current production version.
REGEXP_SUBSTR
Searches a string for a regular expression pattern and returns the matching substring.
REGEXP_SUBSTR
supports the extended and advanced regular expression formats. You can set the global variable regexp_format
to specify which format you want to use.
Syntax
REGEXP_SUBSTR ( source_string, pattern [, position [, occurrence [, parameters ] ] ] )
Arguments
source_string
A string expression, such as a column name, to be searched.
pattern
A string literal representing a regular expression.
position
Optional if occurrence
and parameters
are not specified.
A positive integer indicating the position within source_string
to begin searching. The position is based on the number of characters, not bytes, so that multibyte characters are counted as single characters. The default is 1
. If position
is less than 1
, the search begins at the first character of source_string
. If position
is greater than the number of characters in source_string
, the result is NULL
.
occurrence
Optional if parameters
is not specified.
A positive integer indicating which occurrence of pattern
in source_string
to return. The search for the occurrence begins at position
. If occurrence
is not present, defaults to 1
. If occurrence
is less than 1
or greater than the number of characters in source_string
, the search is ignored and REGEXP_SUBSTR
returns NULL
.
parameters
Optional. If present, has one of the following values:
c
: Perform case-sensitive matching.
i
: Perform case-insensitive matching.
If c
or i
is not specified, performs case-sensitive matching.
Remarks
REGEXP_SUBSTR
returns NULL
if no match is found, source_string
is NULL
, or pattern
is NULL
.
Examples
The following examples demonstrate how REGEXP_SUBSTR
works when the global variable regexp_format
is set to 'extended'
.
Call to REGEXP_SUBSTR |
Result | Comments |
---|---|---|
SELECT REGEXP_SUBSTR('adog','o'); |
o |
Returns the substring o . |
SELECT REGEXP_SUBSTR('adog','.o'); |
do |
Returns the character preceding the first occurrence of ‘o’, followed by the substring ‘o’. |
SELECT REGEXP_SUBSTR('dog58','[[:digit:]]'); |
5 |
Returns the first digit. |
SELECT REGEXP_SUBSTR('1dogbat','[[:alpha:]]'); |
d |
Returns the first alphabet. |
SELECT REGEXP_SUBSTR('rat','b|c'); |
NULL |
Looks for b or c . Since neither character is found, returns NULL . |
SELECT REGEXP_SUBSTR('mousecat','cat$'); |
cat |
Returns the substring cat that appears at the end of the string. |
SELECT REGEXP_SUBSTR('2dogcat','^dog'); |
NULL |
Searches for dog at the beginning of the string. Since dog is not found at the beginning of the string, returns NULL . |
SELECT REGEXP_SUBSTR('catcat','ca.',1); |
cat |
Returns the first occurrence of ca followed by any character. |
SELECT REGEXP_SUBSTR('catcat','cat',1,2); |
cat |
Returns the second occurrence of cat . |
SELECT REGEXP_SUBSTR('bat3BaT','BaT',1,1,'c'); |
BaT |
Returns the first occurrence of BaT (case sensitive search). |
SELECT REGEXP_SUBSTR('bat3BaT','BaT',1,1,'i'); |
bat |
Returns the first occurrence of BaT (case-insensitive search). |
The following examples demonstrate how REGEXP_SUBSTR
works when the global variable regexp_format
is set to 'advanced'
.
Call to REGEXP_SUBSTR |
Result | Comments |
---|---|---|
SELECT REGEXP_SUBSTR('dog58','\\d'); |
5 |
Returns the first digit. |
SELECT REGEXP_SUBSTR('dog58','\\D'); |
d |
Returns the first non-digit. |
SELECT REGEXP_SUBSTR('%26dog','\\w'); |
2 |
Returns the first alpha-numeric character. |
SELECT REGEXP_SUBSTR('%26dog','\\W'); |
% |
Returns the first non-alpha-numeric character. |
SELECT REGEXP_SUBSTR('cat dog','\\mdog'); |
dog |
Returns the first occurrence of the word dog . |
Using REGEXP_SUBSTR
With a Table
The following example shows how to call REGEXP_SUBSTR
once per table row. This example uses extended regular expression matching. Advanced regular expression matching could be used as well.
SET GLOBAL regexp_format = 'extended';
DROP DATABASE IF EXISTS memsql_docs_example;
CREATE DATABASE memsql_docs_example;
use memsql_docs_example;
CREATE TABLE example_table(id INT, text_field TEXT);
INSERT INTO example_table VALUES (1, 'the'), (2, 'athe'), (3, 'abthe'),
(4, 'abcthe'), (5, 'thethe');
SELECT id, text_field, REGEXP_SUBSTR(text_field,'..the') FROM example_table
ORDER BY id;
****
+------+------------+-----------------------------------+
| id | text_field | REGEXP_SUBSTR(text_field,'..the') |
+------+------------+-----------------------------------+
| 1 | the | NULL |
| 2 | athe | NULL |
| 3 | abthe | abthe |
| 4 | abcthe | bcthe |
| 5 | thethe | hethe |
+------+------------+-----------------------------------+