RE2 regular expression syntax reference
-------------------------------------

Single characters:
.             any character, possibly including newline (s=true)
[xyz]         character class
[^xyz]        negated character class
\d            Perl character class
\D            negated Perl character class
[[:alpha:]]   ASCII character class
[[:^alpha:]]  negated ASCII character class
\pN           Unicode character class (one-letter name)
\p{Greek}     Unicode character class
\PN           negated Unicode character class (one-letter name)
\P{Greek}     negated Unicode character class

Composites:
xy    «x» followed by «y»
x|y   «x» or «y» (prefer «x»)

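The "prefer «x»" rule for «x|y» can be seen in a small sketch. It assumes Go's standard regexp package, which is documented as accepting the RE2 syntax (except «\C»); the helper names firstMatch and longestMatch are made up for this illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// firstMatch returns the leftmost match using the default
// leftmost-first semantics, where earlier alternatives are preferred.
func firstMatch(pattern, text string) string {
	return regexp.MustCompile(pattern).FindString(text)
}

// longestMatch runs the same search after switching the compiled
// expression to leftmost-longest semantics.
func longestMatch(pattern, text string) string {
	re := regexp.MustCompile(pattern)
	re.Longest()
	return re.FindString(text)
}

func main() {
	fmt.Printf("%q\n", firstMatch(`a|ab`, "ab"))   // "a": the first alternative wins
	fmt.Printf("%q\n", firstMatch(`ab|a`, "ab"))   // "ab"
	fmt.Printf("%q\n", longestMatch(`a|ab`, "ab")) // "ab"
}
```

Ordering alternatives from most to least specific is one way to get the longer match without changing the matching mode.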
Repetitions:
x*        zero or more «x», prefer more
x+        one or more «x», prefer more
x?        zero or one «x», prefer one
x{n,m}    «n» or «n»+1 or ... or «m» «x», prefer more
x{n,}     «n» or more «x», prefer more
x{n}      exactly «n» «x»
x*?       zero or more «x», prefer fewer
x+?       one or more «x», prefer fewer
x??       zero or one «x», prefer zero
x{n,m}?   «n» or «n»+1 or ... or «m» «x», prefer fewer
x{n,}?    «n» or more «x», prefer fewer
x{n}?     exactly «n» «x»
x{}       (== x*)  NOT SUPPORTED vim
x{-}      (== x*?)  NOT SUPPORTED vim
x{-n}     (== x{n}?)  NOT SUPPORTED vim
x=        (== x?)  NOT SUPPORTED vim

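The difference between "prefer more" and "prefer fewer" is easiest to see side by side. The sketch below assumes Go's standard regexp package, which is documented as accepting the RE2 syntax (except «\C»); leftmost is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"regexp"
)

// leftmost returns the text of the first match of pattern in text.
// An empty result means either no match or an empty match.
func leftmost(pattern, text string) string {
	return regexp.MustCompile(pattern).FindString(text)
}

func main() {
	const text = "aaaa"
	for _, p := range []string{`a*`, `a*?`, `a+`, `a+?`, `a?`, `a??`, `a{2,3}`, `a{2,3}?`} {
		fmt.Printf("%-8s -> %q\n", p, leftmost(p, text))
	}

	// A common practical case: stopping at the first closing bracket.
	fmt.Printf("%q\n", leftmost(`<.+>`, "<b>x</b>"))  // "<b>x</b>"
	fmt.Printf("%q\n", leftmost(`<.+?>`, "<b>x</b>")) // "<b>"
}
```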
Implementation restriction: The counting forms «x{n,m}», «x{n,}», and «x{n}»
are rejected when the minimum or maximum repetition count is above 1000.
Unlimited repetitions are not subject to this restriction.

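A quick way to observe the repetition limit is to try compiling patterns on either side of it. This sketch assumes Go's standard regexp package, whose documentation states the same 1000-count restriction; compiles is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"regexp"
)

// compiles reports whether the pattern is accepted by the parser.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	fmt.Println(compiles(`a{1000}`))   // true: at the limit
	fmt.Println(compiles(`a{1001}`))   // false: count above 1000
	fmt.Println(compiles(`a{2,1001}`)) // false: maximum above 1000
	fmt.Println(compiles(`a{1001,}`))  // false: minimum above 1000
	fmt.Println(compiles(`a{1000,}`))  // true
	fmt.Println(compiles(`a*`))        // true: unlimited forms are unaffected
}
```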
Possessive repetitions:
x*+       zero or more «x», possessive  NOT SUPPORTED
x++       one or more «x», possessive  NOT SUPPORTED
x?+       zero or one «x», possessive  NOT SUPPORTED
x{n,m}+   «n» or ... or «m» «x», possessive  NOT SUPPORTED
x{n,}+    «n» or more «x», possessive  NOT SUPPORTED
x{n}+     exactly «n» «x», possessive  NOT SUPPORTED

Grouping:
(re)           numbered capturing group (submatch)
(?P<name>re)   named & numbered capturing group (submatch)
(?<name>re)    named & numbered capturing group (submatch)  NOT SUPPORTED
(?'name're)    named & numbered capturing group (submatch)  NOT SUPPORTED
(?:re)         non-capturing group
(?flags)       set flags within current group; non-capturing
(?flags:re)    set flags during re; non-capturing
(?#text)       comment  NOT SUPPORTED
(?|x|y|z)      branch numbering reset  NOT SUPPORTED
(?>re)         possessive match of «re»  NOT SUPPORTED
re@>           possessive match of «re»  NOT SUPPORTED vim
%(re)          non-capturing group  NOT SUPPORTED vim

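The supported grouping forms can be combined in one pattern. The following is a minimal sketch, assuming Go's standard regexp package (documented as accepting the RE2 syntax except «\C»); datePattern and namedGroups are names invented for the example.

```go
package main

import (
	"fmt"
	"regexp"
)

// datePattern mixes two named groups, one unnamed capturing group,
// and a non-capturing group that wraps the optional day part.
var datePattern = regexp.MustCompile(`(?P<year>\d{4})-(?P<month>\d{2})(?:-(\d{2}))?`)

// namedGroups returns the named submatches of the first match in text,
// or nil when the pattern does not match.
func namedGroups(re *regexp.Regexp, text string) map[string]string {
	m := re.FindStringSubmatch(text)
	if m == nil {
		return nil
	}
	out := make(map[string]string)
	for i, name := range re.SubexpNames() {
		if name != "" { // index 0 and unnamed groups have an empty name
			out[name] = m[i]
		}
	}
	return out
}

func main() {
	g := namedGroups(datePattern, "released 2024-05-17")
	fmt.Println(g["year"], g["month"])   // 2024 05
	fmt.Println(datePattern.NumSubexp()) // 3: the (?:...) group is not counted
}
```

Named groups are still numbered, so the year is also available as submatch 1 and the unnamed day group as submatch 3.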
Flags:
i   case-insensitive (default false)
m   multi-line mode: «^» and «$» match begin/end line in addition to begin/end text (default false)
s   let «.» match «\n» (default false)
U   ungreedy: swap meaning of «x*» and «x*?», «x+» and «x+?», etc (default false)
Flag syntax is «xyz» (set) or «-xyz» (clear) or «xy-z» (set «xy», clear «z»).

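Each flag can be exercised with the «(?flags)» and «(?flags:re)» group forms. The sketch below assumes Go's standard regexp package, which is documented as accepting the RE2 syntax (except «\C»); matches and first are hypothetical helper names.

```go
package main

import (
	"fmt"
	"regexp"
)

// matches reports whether pattern matches anywhere in text.
func matches(pattern, text string) bool {
	return regexp.MustCompile(pattern).MatchString(text)
}

// first returns the leftmost match of pattern in text.
func first(pattern, text string) string {
	return regexp.MustCompile(pattern).FindString(text)
}

func main() {
	// i: case-insensitive
	fmt.Println(matches(`hello`, "HeLLo"), matches(`(?i)hello`, "HeLLo")) // false true

	// s: let . match \n
	fmt.Println(matches(`a.b`, "a\nb"), matches(`(?s)a.b`, "a\nb")) // false true

	// m: ^ and $ also match at line boundaries
	fmt.Println(matches(`^b$`, "a\nb\nc"), matches(`(?m)^b$`, "a\nb\nc")) // false true

	// U: swap greedy and non-greedy
	fmt.Printf("%q %q\n", first(`(?U)a+`, "aaa"), first(`(?U)a+?`, "aaa")) // "a" "aaa"

	// Scoped and cleared flags
	fmt.Println(matches(`^(?i:a)b$`, "Ab"), matches(`^(?i:a)b$`, "AB"))       // true false
	fmt.Println(matches(`^(?i)a(?-i)b$`, "Ab"), matches(`^(?i)a(?-i)b$`, "AB")) // true false
}
```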
Empty strings:
^         at beginning of text or line («m»=true)
$         at end of text (like «\z» not «\Z») or line («m»=true)
\A        at beginning of text
\b        at ASCII word boundary («\w» on one side and «\W», «\A», or «\z» on the other)
\B        not at ASCII word boundary
\G        at beginning of subtext being searched  NOT SUPPORTED pcre
\G        at end of last match  NOT SUPPORTED perl
\Z        at end of text, or before newline at end of text  NOT SUPPORTED
\z        at end of text
(?=re)    before text matching «re»  NOT SUPPORTED
(?!re)    before text not matching «re»  NOT SUPPORTED
(?<=re)   after text matching «re»  NOT SUPPORTED
(?<!re)   after text not matching «re»  NOT SUPPORTED
re&       before text matching «re»  NOT SUPPORTED vim
re@=      before text matching «re»  NOT SUPPORTED vim
re@!      before text not matching «re»  NOT SUPPORTED vim
re@<=     after text matching «re»  NOT SUPPORTED vim
re@<!     after text not matching «re»  NOT SUPPORTED vim
\zs       sets start of match (= \K)  NOT SUPPORTED vim
\ze       sets end of match  NOT SUPPORTED vim
\%^       beginning of file  NOT SUPPORTED vim
\%$       end of file  NOT SUPPORTED vim
\%V       on screen  NOT SUPPORTED vim
\%#       cursor position  NOT SUPPORTED vim
\%'m      mark «m» position  NOT SUPPORTED vim
\%23l     in line 23  NOT SUPPORTED vim
\%23c     in column 23  NOT SUPPORTED vim
\%23v     in virtual column 23  NOT SUPPORTED vim

Escape sequences:
\a           bell (== \007)
\f           form feed (== \014)
\t           horizontal tab (== \011)
\n           newline (== \012)
\r           carriage return (== \015)
\v           vertical tab character (== \013)
\*           literal «*», for any punctuation character «*»
\123         octal character code (up to three digits)
\x7F         hex character code (exactly two digits)
\x{10FFFF}   hex character code
\C           match a single byte even in UTF-8 mode
\Q...\E      literal text «...» even if «...» has punctuation

\1           backreference  NOT SUPPORTED
\b           backspace  NOT SUPPORTED (use «\010»)
\cK          control char ^K  NOT SUPPORTED (use «\001» etc)
\e           escape  NOT SUPPORTED (use «\033»)
\g1          backreference  NOT SUPPORTED
\g{1}        backreference  NOT SUPPORTED
\g{+1}       backreference  NOT SUPPORTED
\g{-1}       backreference  NOT SUPPORTED
\g{name}     named backreference  NOT SUPPORTED
\g<name>     subroutine call  NOT SUPPORTED
\g'name'     subroutine call  NOT SUPPORTED
\k<name>     named backreference  NOT SUPPORTED
\k'name'     named backreference  NOT SUPPORTED
\lX          lowercase «X»  NOT SUPPORTED
\ux          uppercase «x»  NOT SUPPORTED
\L...\E      lowercase text «...»  NOT SUPPORTED
\K           reset beginning of «$0»  NOT SUPPORTED
\N{name}     named Unicode character  NOT SUPPORTED
\R           line break  NOT SUPPORTED
\U...\E      upper case text «...»  NOT SUPPORTED
\X           extended Unicode sequence  NOT SUPPORTED

\%d123       decimal character 123  NOT SUPPORTED vim
\%xFF        hex character FF  NOT SUPPORTED vim
\%o123       octal character 123  NOT SUPPORTED vim
\%u1234      Unicode character 0x1234  NOT SUPPORTED vim
\%U12345678  Unicode character 0x12345678  NOT SUPPORTED vim

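The supported escapes, and the substitutes suggested for unsupported ones, can be checked with whole-string matches. This is a sketch assuming Go's standard regexp package, which is documented as accepting the RE2 syntax except «\C» (so «\C» itself is left out here); fullMatch and parses are hypothetical helper names.

```go
package main

import (
	"fmt"
	"regexp"
)

// fullMatch reports whether pattern matches the whole of text.
func fullMatch(pattern, text string) bool {
	return regexp.MustCompile(`^(?:` + pattern + `)$`).MatchString(text)
}

// parses reports whether the pattern compiles.
func parses(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	// \Q...\E quotes punctuation, so the dot is literal.
	fmt.Println(fullMatch(`\Qa.b\E`, "a.b"), fullMatch(`\Qa.b\E`, "axb")) // true false

	// Two-digit hex, octal, and braced hex character codes.
	fmt.Println(fullMatch(`\x41\101`, "AA"))           // true
	fmt.Println(fullMatch(`\x{1F600}`, "\U0001F600")) // true

	// Backslash before punctuation yields the literal character.
	fmt.Println(fullMatch(`\.\*`, ".*")) // true

	// \e is not supported; the octal form \033 is the suggested substitute.
	fmt.Println(parses(`\e`), fullMatch(`\033`, "\x1b")) // false true

	// Backreferences are rejected when the pattern is compiled.
	fmt.Println(parses(`(a)\1`)) // false
}
```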
Character class elements:
x         single character
A-Z       character range (inclusive)
\d        Perl character class
[:foo:]   ASCII character class «foo»
\p{Foo}   Unicode character class «Foo»
\pF       Unicode character class «F» (one-letter name)

Named character classes as character class elements:
[\d]          digits (== \d)
[^\d]         not digits (== \D)
[\D]          not digits (== \D)
[^\D]         not not digits (== \d)
[[:name:]]    named ASCII class inside character class (== [:name:])
[^[:name:]]   named ASCII class inside negated character class (== [:^name:])
[\p{Name}]    named Unicode property inside character class (== \p{Name})
[^\p{Name}]   named Unicode property inside negated character class (== \P{Name})

Perl character classes (all ASCII-only):
\d   digits (== [0-9])
\D   not digits (== [^0-9])
\s   whitespace (== [\t\n\f\r ])
\S   not whitespace (== [^\t\n\f\r ])
\w   word characters (== [0-9A-Za-z_])
\W   not word characters (== [^0-9A-Za-z_])

\h   horizontal space  NOT SUPPORTED
\H   not horizontal space  NOT SUPPORTED
\v   vertical space  NOT SUPPORTED
\V   not vertical space  NOT SUPPORTED

ASCII character classes:
[[:alnum:]]    alphanumeric (== [0-9A-Za-z])
[[:alpha:]]    alphabetic (== [A-Za-z])
[[:ascii:]]    ASCII (== [\x00-\x7F])
[[:blank:]]    blank (== [\t ])
[[:cntrl:]]    control (== [\x00-\x1F\x7F])
[[:digit:]]    digits (== [0-9])
[[:graph:]]    graphical (== [!-~] == [A-Za-z0-9!"#$%&'()*+,\-./:;<=>?@[\\\]^_`{|}~])
[[:lower:]]    lower case (== [a-z])
[[:print:]]    printable (== [ -~] == [ [:graph:]])
[[:punct:]]    punctuation (== [!-/:-@[-`{-~])
[[:space:]]    whitespace (== [\t\n\v\f\r ])
[[:upper:]]    upper case (== [A-Z])
[[:word:]]     word characters (== [0-9A-Za-z_])
[[:xdigit:]]   hex digit (== [0-9A-Fa-f])

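Because the Perl and ASCII classes are ASCII-only, they differ from Unicode-aware engines in a few visible ways, and «\s» and «[[:space:]]» differ from each other on vertical tab. The sketch below assumes Go's standard regexp package (documented as accepting the RE2 syntax except «\C»); classMatches is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"regexp"
)

// classMatches reports whether s, taken as a whole, matches the class.
func classMatches(class, s string) bool {
	return regexp.MustCompile(`^` + class + `$`).MatchString(s)
}

func main() {
	arabicThree := "\u0663" // ARABIC-INDIC DIGIT THREE, general category Nd

	// \d and [[:digit:]] cover only 0-9; \p{Nd} covers Unicode decimal digits.
	fmt.Println(classMatches(`\d`, "3"), classMatches(`\d`, arabicThree)) // true false
	fmt.Println(classMatches(`\p{Nd}`, arabicThree))                      // true

	// \w is ASCII-only, so an accented letter is not a word character.
	fmt.Println(classMatches(`\w`, "\u00e9")) // false

	// \s omits vertical tab, while [[:space:]] includes it.
	fmt.Println(classMatches(`\s`, "\v"), classMatches(`[[:space:]]`, "\v")) // false true

	// Negation inside the brackets and outside them are equivalent.
	fmt.Println(classMatches(`[[:^alpha:]]`, "1"), classMatches(`[^[:alpha:]]`, "1")) // true true
	fmt.Println(classMatches(`[[:^alpha:]]`, "a"))                                    // false
}
```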
Unicode character class names--general category:
C    other
Cc   control
Cf   format
Cn   unassigned code points  NOT SUPPORTED
Co   private use
Cs   surrogate
L    letter
LC   cased letter  NOT SUPPORTED
L&   cased letter  NOT SUPPORTED
Ll   lowercase letter
Lm   modifier letter
Lo   other letter
Lt   titlecase letter
Lu   uppercase letter
M    mark
Mc   spacing mark
Me   enclosing mark
Mn   non-spacing mark
N    number
Nd   decimal number
Nl   letter number
No   other number
P    punctuation
Pc   connector punctuation
Pd   dash punctuation
Pe   close punctuation
Pf   final punctuation
Pi   initial punctuation
Po   other punctuation
Ps   open punctuation
S    symbol
Sc   currency symbol
Sk   modifier symbol
Sm   math symbol
So   other symbol
Z    separator
Zl   line separator
Zp   paragraph separator
Zs   space separator

Unicode character class names--scripts:
Adlam
Ahom
Anatolian_Hieroglyphs
Arabic
Armenian
Avestan
Balinese
Bamum
Bassa_Vah
Batak
Bengali
Bhaiksuki
Bopomofo
Brahmi
Braille
Buginese
Buhid
Canadian_Aboriginal
Carian
Caucasian_Albanian
Chakma
Cham
Cherokee
Chorasmian
Common
Coptic
Cuneiform
Cypriot
Cyrillic
Deseret
Devanagari
Dives_Akuru
Dogra
Duployan
Egyptian_Hieroglyphs
Elbasan
Elymaic
Ethiopic
Georgian
Glagolitic
Gothic
Grantha
Greek
Gujarati
Gunjala_Gondi
Gurmukhi
Han
Hangul
Hanifi_Rohingya
Hanunoo
Hatran
Hebrew
Hiragana
Imperial_Aramaic
Inherited
Inscriptional_Pahlavi
Inscriptional_Parthian
Javanese
Kaithi
Kannada
Katakana
Kayah_Li
Kharoshthi
Khitan_Small_Script
Khmer
Khojki
Khudawadi
Lao
Latin
Lepcha
Limbu
Linear_A
Linear_B
Lisu
Lycian
Lydian
Mahajani
Makasar
Malayalam
Mandaic
Manichaean
Marchen
Masaram_Gondi
Medefaidrin
Meetei_Mayek
Mende_Kikakui
Meroitic_Cursive
Meroitic_Hieroglyphs
Miao
Modi
Mongolian
Mro
Multani
Myanmar
Nabataean
Nandinagari
New_Tai_Lue
Newa
Nko
Nushu
Nyiakeng_Puachue_Hmong
Ogham
Ol_Chiki
Old_Hungarian
Old_Italic
Old_North_Arabian
Old_Permic
Old_Persian
Old_Sogdian
Old_South_Arabian
Old_Turkic
Oriya
Osage
Osmanya
Pahawh_Hmong
Palmyrene
Pau_Cin_Hau
Phags_Pa
Phoenician
Psalter_Pahlavi
Rejang
Runic
Samaritan
Saurashtra
Sharada
Shavian
Siddham
SignWriting
Sinhala
Sogdian
Sora_Sompeng
Soyombo
Sundanese
Syloti_Nagri
Syriac
Tagalog
Tagbanwa
Tai_Le
Tai_Tham
Tai_Viet
Takri
Tamil
Tangut
Telugu
Thaana
Thai
Tibetan
Tifinagh
Tirhuta
Ugaritic
Vai
Wancho
Warang_Citi
Yezidi
Yi
Zanabazar_Square

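General-category and script names are used through «\p{...}» and «\P{...}», alone or inside a bracketed class. The sketch below assumes Go's standard regexp package (documented as accepting the RE2 syntax except «\C»); which script names are recognized depends on the Unicode tables shipped with the engine, and only long-established names (Greek, Latin, Han) are used here. findAll is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"regexp"
)

// findAll returns every non-overlapping match of pattern in text.
func findAll(pattern, text string) []string {
	return regexp.MustCompile(pattern).FindAllString(text, -1)
}

func main() {
	// One-letter general category: runs of letters in any script.
	fmt.Printf("%q\n", findAll(`\pL+`, "h\u00e9llo, w\u00f6rld 42"))

	// Two-letter general category: uppercase letters only.
	fmt.Printf("%q\n", findAll(`\p{Lu}`, "Hello World")) // ["H" "W"]

	// Script name: runs of Greek characters.
	fmt.Printf("%q\n", findAll(`\p{Greek}+`, "abc \u03b1\u03b2\u03b3 def"))

	// Negated script: ASCII digits belong to Common, not Latin.
	fmt.Printf("%q\n", findAll(`\P{Latin}+`, "ab12cd")) // ["12"]

	// Properties can be combined inside a bracketed class.
	fmt.Printf("%q\n", findAll(`[\p{Han}\p{Nd}]+`, "v2 \u6f22\u5b57 ok"))
}
```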
Vim character classes:
\i   identifier character  NOT SUPPORTED vim
\I   «\i» except digits  NOT SUPPORTED vim
\k   keyword character  NOT SUPPORTED vim
\K   «\k» except digits  NOT SUPPORTED vim
\f   file name character  NOT SUPPORTED vim
\F   «\f» except digits  NOT SUPPORTED vim
\p   printable character  NOT SUPPORTED vim
\P   «\p» except digits  NOT SUPPORTED vim
\s   whitespace character (== [ \t])  NOT SUPPORTED vim
\S   non-white space character (== [^ \t])  NOT SUPPORTED vim
\d   digits (== [0-9])  vim
\D   not «\d»  vim
\x   hex digits (== [0-9A-Fa-f])  NOT SUPPORTED vim
\X   not «\x»  NOT SUPPORTED vim
\o   octal digits (== [0-7])  NOT SUPPORTED vim
\O   not «\o»  NOT SUPPORTED vim
\w   word character  vim
\W   not «\w»  vim
\h   head of word character  NOT SUPPORTED vim
\H   not «\h»  NOT SUPPORTED vim
\a   alphabetic  NOT SUPPORTED vim
\A   not «\a»  NOT SUPPORTED vim
\l   lowercase  NOT SUPPORTED vim
\L   not lowercase  NOT SUPPORTED vim
\u   uppercase  NOT SUPPORTED vim
\U   not uppercase  NOT SUPPORTED vim
\_x  «\x» plus newline, for any «x»  NOT SUPPORTED vim

Vim flags:
\c   ignore case  NOT SUPPORTED vim
\C   match case  NOT SUPPORTED vim
\m   magic  NOT SUPPORTED vim
\M   nomagic  NOT SUPPORTED vim
\v   verymagic  NOT SUPPORTED vim
\V   verynomagic  NOT SUPPORTED vim
\Z   ignore differences in Unicode combining characters  NOT SUPPORTED vim

Magic:
(?{code})             arbitrary Perl code  NOT SUPPORTED perl
(??{code})            postponed arbitrary Perl code  NOT SUPPORTED perl
(?n)                  recursive call to regexp capturing group «n»  NOT SUPPORTED
(?+n)                 recursive call to relative group «+n»  NOT SUPPORTED
(?-n)                 recursive call to relative group «-n»  NOT SUPPORTED
(?C)                  PCRE callout  NOT SUPPORTED pcre
(?R)                  recursive call to entire regexp (== (?0))  NOT SUPPORTED
(?&name)              recursive call to named group  NOT SUPPORTED
(?P=name)             named backreference  NOT SUPPORTED
(?P>name)             recursive call to named group  NOT SUPPORTED
(?(cond)true|false)   conditional branch  NOT SUPPORTED
(?(cond)true)         conditional branch  NOT SUPPORTED
(*ACCEPT)             make regexps more like Prolog  NOT SUPPORTED
(*COMMIT)             NOT SUPPORTED
(*F)                  NOT SUPPORTED
(*FAIL)               NOT SUPPORTED
(*MARK)               NOT SUPPORTED
(*PRUNE)              NOT SUPPORTED
(*SKIP)               NOT SUPPORTED
(*THEN)               NOT SUPPORTED
(*ANY)                set newline convention  NOT SUPPORTED
(*ANYCRLF)            NOT SUPPORTED
(*CR)                 NOT SUPPORTED
(*CRLF)               NOT SUPPORTED
(*LF)                 NOT SUPPORTED
(*BSR_ANYCRLF)        set \R convention  NOT SUPPORTED pcre
(*BSR_UNICODE)        NOT SUPPORTED pcre