REGEX atomic groups `(?>)`
We'll cover the following
By definition, a REGEX atomic group is a non-capturing group that exits the group and throws away all alternatives after the first match of the pattern inside the group, so backtracking is disallowed".
While a non-atomic group will allow backtracking, it will still find the first match, then if the matching ahead fails it will backtrack and find the next match, until a match for the entire expression is found or all possibilities are exhausted.
Example 1 : #
Let’s consider the input string: xoots
.
-
A non-atomic group in the expression
(xoo|xoot)s
applied toxoots
will:- match its 1st alternative
xoo
, then fail ass
does not immediately follow inxoot
, and backtrack to its 2nd alternative; - match its 2nd alternative
xoot
, then succeed as s immediately follows inxoots
, and stop.
- match its 1st alternative
-
An atomic group in the expression
(?>xoo|xoot)s
applied toxoots
will match its 1st alternativexoo
, then fail ass
does not immediately follow, and stop as backtracking is disallowed.
Example 2: #
Let’s consider the input string: bbabbbabbbbc
.
-
Without an atomic grouping for the regex pattern
/(.*|b*)[ac]/
, the string will have a single match which is the whole string, due to backtracking at the end to match[ac]
. -
In contrast, for the atomic pattern:
/((?>.*)|b*)[ac]/
, there will be only three matches to this regex, which arebba
,bbba
,bbbbc
.
Get hands-on with 1400+ tech skills courses.