SQL Server has historically had limited pattern matching capabilities, relying on LIKE and PATINDEX() for string matching.
SQL Server 2025 introduces native regular expression support, bringing SQL Server in line with other major database platforms like Oracle, PostgreSQL, and MySQL.
Regular expressions unlock powerful text processing capabilities that were previously difficult or impossible in T-SQL alone.
A regular expression (regex) is a sequence of characters that defines a search pattern.
Think of them as a super-powered version of the LIKE operator.
"Some people, when confronted with a problem, think 'I know, I'll use regular expressions.' Now they have two problems."
-- Jamie Zawinski
Let's see if that's still true.
LIKE gives us two wildcards:
% -- Zero or more characters_ -- Exactly one characterRegular expressions give us much more control over pattern matching, including character classes, quantifiers, alternation, grouping, anchors, and more.
Use LIKE when simple prefix/suffix/contains matching is enough. It can use indexes and is faster for simple patterns. Use regex when you need character classes, quantifiers, alternation, or complex pattern logic.
The simplest regex pattern is a literal string.
The pattern cat matches the text "cat" inside a string. This is equivalent to LIKE '%cat%'.
Regular expressions are case-sensitive by default, though SQL Server's regex functions include a flag to change this.
Square brackets define a set of characters to match:
[abc] -- Matches a, b, or c[a-z] -- Matches any lowercase letter[0-9] -- Matches any digit[^abc] -- Matches any character except a, b, or c\d -- Any digit (equivalent to [0-9])\D -- Any non-digit\w -- Any word character (letters, digits, underscore)\W -- Any non-word character\s -- Any whitespace character\S -- Any non-whitespace character. -- Any character (except newline by default)Quantifiers specify how many times a character or group should repeat:
* -- Zero or more times+ -- One or more times? -- Zero or one time{n} -- Exactly n times{n,} -- n or more times{n,m} -- Between n and m timesBy default, quantifiers are greedy: they match as much as possible.
Adding ? after a quantifier makes it lazy: it matches as little as possible.
<.+> on <b>bold</b> matches <b>bold</b> (greedy: everything between the first < and the last >)<.+?> on the same string matches <b> (lazy: stops at the first >)This is the #1 source of "my regex matches too much!" bugs.
^ -- Start of string$ -- End of string\b -- Word boundaryFor example, ^[A-Z] matches a string that starts with a capital letter, while [0-9]$ matches a string that ends with a digit.
(abc) -- Captures the group "abc"(?:abc) -- Groups without capturing (non-capturing group)a|b -- Matches a or b (alternation)(cat|dog) -- Matches "cat" or "dog"Captured groups can be referenced in replacement patterns using \1, \2, etc. Use (?:...) when you need grouping but don't need to capture.
If you want to match a character that has special meaning in regex, escape it with a backslash:
\. matches a literal period, \* matches a literal asterisk, \\ matches a literal backslash.
Special characters that need escaping: . * + ? ^ $ { } [ ] ( ) | \
REGEXP_LIKE tests whether a string matches a regular expression pattern. It returns 1 (match) or 0 (no match).
REGEXP_LIKE(
source_string,
pattern
[, flags]
)
Think of this as a more powerful replacement for LIKE or PATINDEX() > 0.
The optional flags parameter modifies matching behavior:
'i' -- Case-insensitive matching'm' -- Multi-line mode (^ and $ match line boundaries)These flags are common to all SQL Server 2025 regex functions.
REGEXP_COUNT returns the number of times a pattern matches within a string.
REGEXP_COUNT(
source_string,
pattern
[, start_position
[, flags]]
)
This is useful for counting occurrences of a pattern, such as the number of words or digits in a string.
REGEXP_REPLACE replaces occurrences of a pattern within a string.
REGEXP_REPLACE(
source_string,
pattern,
replacement
[, start_position
[, occurrence
[, flags]]]
)
This is a major upgrade from REPLACE(), which only handles literal string replacement.
Use captured groups in your replacement string:
\1, \2, etc. refer to captured groupsFor example, (\d{3})-(\d{4}) with replacement \2-\1 swaps the two groups.
These check for patterns without consuming them:
(?=...) -- Positive lookahead: what follows must match(?!...) -- Negative lookahead: what follows must NOT match(?<=...) -- Positive lookbehind: what precedes must match(?<!...) -- Negative lookbehind: what precedes must NOT matchWe will demo the positive versions today. Useful for redacting data, conditional replacements, and context-sensitive matching.
REGEXP_SUBSTR extracts a substring that matches a regular expression pattern.
REGEXP_SUBSTR(
source_string,
pattern
[, start_position
[, occurrence
[, flags
[, group_number]]]]
)
This is especially useful for extracting structured data from unstructured text.
REGEXP_INSTR returns the starting position of a pattern match within a string. Think of it as a regex-powered PATINDEX().
REGEXP_INSTR(
source_string,
pattern
[, start_position
[, occurrence
[, return_option
[, flags
[, group_number]]]]]
)
The return_option parameter controls whether to return the start (0) or end (1) of the match.
SQL Server 2025 uses the ICU regex engine, which supports Unicode property escapes. This is something LIKE and PATINDEX cannot do.
\p{L} -- Any Unicode letter (Latin, Cyrillic, Chinese, Arabic, etc.)\p{N} -- Any Unicode number\p{Sc} -- Any currency symbol ($, €, £, ¥)\p{P} -- Any punctuation characterEssential for working with multilingual or international data.
Now let's put it all together with some real-world scenarios:
Regular expressions are powerful but come with caveats:
WHERE clause might not use indexes. Some require a full scan, just like LIKE '%pattern%'LIKE 'prefix%' is still faster and can use indexesIf you want to use regular expressions in SQL Server before 2025, you would typically rely on CLR functions or third-party libraries.
One outstanding CLR-based library is SQL#, which provides a comprehensive set of regex functions for SQL Server.
The free version includes most Regex functionality and the paid edition adds capture groups.
SQL Server 2025 brings long-awaited regular expression support to the T-SQL language. The five new functions give database developers powerful text processing capabilities without leaving T-SQL.
Key takeaways:
\p{...} for Unicode-aware international data
To learn more, go here:
https://csmore.info/on/regex
And for help, contact me:
feasel@catallaxyservices.com | @feaselkl
Catallaxy Services consulting:
https://CSmore.info/on/contact