Feature | Syntax | Description | Example | JGsoft | .NET | Java | Perl | PCRE | PCRE2 | PHP | Delphi | R | JavaScript | VBScript | XRegExp | Python | Ruby | std::regex | Boost | Tcl ARE | POSIX BRE | POSIX ERE | GNU BRE | GNU ERE | Oracle | XML | XPath |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Comment | (?#comment) | Everything between (?# and ) is ignored by the regex engine. | a(?#foobar)b matches ab | YES | YES | no | YES | YES | YES | YES | YES | YES | no | no | YES | YES | YES | no | ECMA | YES | no | no | no | no | no | no | no |
Branch reset group | (?|regex) | If the regex inside the branch reset group has multiple alternatives with capturing groups, then the capturing group numbers are the same in all the alternatives. | (x)(?|(a)|(bc)|(def))\2 matches xaa, xbcbc, or xdefdef with the first group capturing x and the second group capturing a, bc, or def | V2 | no | no | 5.10 | 7.2 | YES | 5.2.4 | YES | YES | no | no | no | no | no | no | ECMA 1.42–1.85 | no | no | no | no | no | no | no | no |
Atomic group | (?>regex) | Atomic groups prevent the regex engine from backtracking back into the group after a match has been found for the group. If the remainder of the regex fails, the engine may backtrack over the group if a quantifier or alternation makes it optional. But it will not backtrack into the group to try other permutations of the group. | a(?>bc|b)c matches abcc but not abc | YES | YES | YES | YES | YES | YES | YES | YES | YES | no | no | no | no | YES | no | ECMA | no | no | no | no | no | no | no | no |
Positive lookahead | (?=regex) | Matches at a position where the pattern inside the lookahead can be matched. Matches only the position. It does not consume any characters or expand the match. In a pattern like one(?=two)three, both two and three have to match at the position where the match of one ends. | t(?=s) matches the second t in streets. | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | ECMA | ECMA | YES | no | no | no | no | no | no | no |
Negative lookahead | (?!regex) | Similar to positive lookahead, except that negative lookahead only succeeds if the regex inside the lookahead fails to match. | t(?!s) matches the first t in streets. | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | ECMA | ECMA | YES | no | no | no | no | no | no | no |
Positive lookbehind | (?<=regex) | Matches at a position if the pattern inside the lookbehind can be matched ending at that position. | (?<=s)t matches the first t in streets. | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | no | YES | YES | 1.9 | no | ECMA | no | no | no | no | no | no | no | no |
Negative lookbehind | (?<!regex) | Matches at a position if the pattern inside the lookbehind cannot be matched ending at that position. | (?<!s)t matches the second t in streets. | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | no | YES | YES | 1.9 | no | ECMA | no | no | no | no | no | no | no | no |
Lookbehind | (?<=regex|longer regex) | Alternatives inside lookbehind can differ in length. | (?<=is|e)t matches the second and fourth t in twisty streets. | YES | YES | YES | 5.30 | YES | YES | YES | YES | YES | YES | n/a | YES | no | 1.9 | n/a | ECMA 1.38–1.43 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a |
Lookbehind | (?<=x{n,m}) | Quantifiers with a finite maximum number of repetitions can be used inside lookbehind. | (?<=s\w{1,7})t matches only the fourth t in twisty streets. | YES | YES | 6 4 fail | 5.30 fail | no | no | no | no | no | YES | n/a | YES | no | no | n/a | no | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a |
Lookbehind | (?<=regex) | The full regular expression syntax can be used inside lookbehind. | (?<=s\w+)t matches only the fourth t in twisty streets. | YES | YES | 13 | no | no | no | no | no | no | YES | n/a | YES | no | no | n/a | no | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a |
Lookbehind | (group)(?<=\1) | Backreferences can be used inside lookbehind. Syntax prohibited in lookbehind is also prohibited in the referenced capturing group. | (\w).+(?<=\1) matches twisty street in twisty streets. | YES | YES | no | no | no | 10.23 | 7.3.0 | no | 4.0.0 | YES | n/a | YES | 3.5 | no | n/a | no | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a |
Keep text out of the regex match | \K | The text matched by the part of the regex to the left of the \K is omitted from the overall regex match. Other than that the regex is matched normally from left to right. Capturing groups to the left of the \K capture as usual. | s\Kt matches only the first t in streets. | V2 | no | no | 5.10 | 7.2 | YES | 5.2.4 | YES | YES | no | no | no | no | 2.0 | no | ECMA 1.42–1.85 | no | no | no | no | no | no | no | no |
Lookaround conditional | (?(?=regex)then|else) where (?=regex) is any valid lookaround and then and else are any valid regexes | If the lookaround succeeds, the “then” part must match for the overall regex to match. If the lookaround fails, the “else” part must match for the overall regex to match. The lookaround is zero-length. The “then” and “else” parts consume their matches like normal regexes. | (?(?<=a)b|c) matches the second b and the first c in babxcac | YES | YES | no | YES | YES | YES | YES | YES | YES | no | no | no | no | no | no | ECMA | no | no | no | no | no | no | no | no |
Implicit lookahead conditional | (?(regex)then|else) where regex, then, and else are any valid regexes and regex is not the name of a capturing group | If “regex” is not the name of a capturing group, then it is interpreted as a lookahead as if you had written (?(?=regex)then|else). If the lookahead succeeds, the “then” part must match for the overall regex to match. If the lookahead fails, the “else” part must match for the overall regex to match. The lookaround is zero-length. The “then” and “else” parts consume their matches like normal regexes. | (?(\d{2})7|c) matches the first 7 and the c in 747c | no | YES | no | no | no | no | no | no | no | no | no | no | no | no | no | no | no | no | no | no | no | no | no | no |
Named conditional | (?(name)then|else) where name is the name of a capturing group and then and else are any valid regexes | If the capturing group with the given name took part in the match attempt thus far, the “then” part must match for the overall regex to match. If the capturing group did not take part in the match thus far, the “else” part must match for the overall regex to match. | (?<one>a)?(?(one)b|c) matches ab, the first c, and the second c in babxcac | YES | YES | no | no | 6.7 | YES | 5.2.0 | YES | YES | no | no | no | YES | no | no | no | no | no | no | no | no | no | no | no |
Named conditional | (?(<name>)then|else) where name is the name of a capturing group and then and else are any valid regexes | If the capturing group with the given name took part in the match attempt thus far, the “then” part must match for the overall regex to match. If the capturing group did not take part in the match thus far, the “else” part must match for the overall regex to match. | (?<one>a)?(?(<one>)b|c) matches ab, the first c, and the second c in babxcac | V2 | no | no | 5.10 | 7.0 | YES | 5.2.2 | YES | YES | no | no | no | no | 2.0 | no | ECMA 1.42–1.85 | no | no | no | no | no | no | no | no |
Named conditional | (?('name')then|else) where name is the name of a capturing group and then and else are any valid regexes | If the capturing group with the given name took part in the match attempt thus far, the “then” part must match for the overall regex to match. If the capturing group did not take part in the match thus far, the “else” part must match for the overall regex to match. | (?'one'a)?(?('one')b|c) matches ab, the first c, and the second c in babxcac | V2 | no | no | 5.10 | 7.0 | YES | 5.2.2 | YES | YES | no | no | no | no | 2.0 | no | ECMA 1.42–1.85 | no | no | no | no | no | no | no | no |
Conditional | (?(1)then|else) where 1 is the number of a capturing group and then and else are any valid regexes | If the referenced capturing group took part in the match attempt thus far, the “then” part must match for the overall regex to match. If the capturing group did not take part in the match thus far, the “else” part must match for the overall regex to match. | (a)?(?(1)b|c) matches ab, the first c, and the second c in babxcac | YES | YES | no | YES | YES | YES | YES | YES | YES | no | no | no | YES | 2.0 | no | ECMA | no | no | no | no | no | no | no | no |
Relative conditional | (?(-1)then|else) where -1 is a negative integer and then and else are any valid regexes | Conditional that tests the capturing group that can be found by counting as many opening parentheses of named or numbered capturing groups as specified by the number from right to left starting immediately before the conditional. If the referenced capturing group took part in the match attempt thus far, the “then” part must match for the overall regex to match. If the capturing group did not take part in the match thus far, the “else” part must match for the overall regex to match. | (a)?(?(-1)b|c) matches ab, the first c, and the second c in babxcac | V2 | no | no | no | 7.2 | YES | 5.2.4 | YES | YES | no | no | no | no | no | no | no | no | no | no | no | no | no | no | no |
Forward conditional | (?(+1)then|else) where +1 is a positive integer and then and else are any valid regexes | Conditional that tests the capturing group that can be found by counting as many opening parentheses of named or numbered capturing groups as specified by the number from left to right starting at the “then” part of the conditional. If the referenced capturing group took part in the match attempt thus far, the “then” part must match for the overall regex to match. If the capturing group did not take part in the match thus far, the “else” part must match for the overall regex to match. | ((?(+1)b|c)(d)?){2} matches cc and cdb in bdbdccxcdcxcdb | V2 | no | no | no | 7.2 | YES | 5.2.4 | YES | YES | no | no | no | no | no | no | no | no | no | no | no | no | no | no | no |
Conditional | (?(+1)then|else) where 1 is the number of a capturing group and then and else are any valid regexes | The + is ignored and the number is taken as an absolute reference to a capturing group. If the referenced capturing group took part in the match attempt thus far, the “then” part must match for the overall regex to match. If the capturing group did not take part in the match thus far, the “else” part must match for the overall regex to match. | (a)?(?(+1)b|c) matches ab, the first c, and the second c in babxcac | no | no | no | no | no | no | no | no | no | no | no | no | YES | no | no | no | no | no | no | no | no | no | no | no |
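Several rows of the table can be checked directly against one flavor. The following is a minimal sketch using Python's standard `re` module; it is an illustration, not part of the original reference. The helper names `matches` and `compiles` are hypothetical, introduced only for this example. Note that Python's `re` spells named groups as `(?P<one>...)`, so the named conditional example is adapted accordingly.

```python
import re

def matches(pattern, text):
    """Return (start_index, matched_text) for each match, left to right."""
    return [(m.start(), m.group()) for m in re.finditer(pattern, text)]

def compiles(pattern):
    """Report whether this Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# Lookaround rows: "streets" has a t at index 1 and another at index 5.
# Lookarounds match a position only, so each match is just the "t".
lookahead = matches(r"t(?=s)", "streets")        # table: the second t
neg_lookahead = matches(r"t(?!s)", "streets")    # table: the first t
lookbehind = matches(r"(?<=s)t", "streets")      # table: the first t
neg_lookbehind = matches(r"(?<!s)t", "streets")  # table: the second t

# Comment row: the (?#...) group is ignored by the engine.
comment_ok = re.fullmatch(r"a(?#foobar)b", "ab") is not None

# Lookbehind restrictions: the table lists Python as "no" for alternatives
# of different length and for unbounded quantifiers inside lookbehind.
variable_alternatives = compiles(r"(?<=is|e)t")
unbounded_quantifier = compiles(r"(?<=s\w+)t")

# Backreference inside lookbehind (the table lists Python 3.5 and later).
backref = matches(r"(\w).+(?<=\1)", "twisty streets")

# Conditional rows: numbered form, and named form in Python's own
# (?P<name>...) spelling for the capturing group.
numbered = matches(r"(a)?(?(1)b|c)", "babxcac")
named = matches(r"(?P<one>a)?(?(one)b|c)", "babxcac")
```

Printing these values side by side with the Example column is a quick way to confirm how a given row behaves in the Python version at hand. The two `compiles` checks return a boolean rather than raising, so the same script can be rerun on other Python versions to see whether support has changed.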
Page URL: https://www.regular-expressions.info/refadv.html
Page last updated: 16 August 2024
Site last updated: 06 November 2024
Copyright © 2003-2024 Jan Goyvaerts. All rights reserved.