Additional Syntax Specifiers in Basic Regular Expressions

As well as the basic matching rules and bracket expressions, you can use the following syntax within a Basic Regular Expression to control what it matches:

Syntax Specifier
    Description

\(expression\)
    Matches whatever expression matches. You need to enclose an expression in these delimiters only to apply an operator (such as * or \{m\}) to it or to mark it as a subexpression for backreferencing (explained below).

\n
    Matches the same string that was matched by the nth preceding expression enclosed in \( \). This is called a backreference. n can be 1 through 9. For example, \(ab\)\1 matches abab, but does not match ac. If fewer than n subexpressions precede \n, the backreference is not valid.

expression*
    Matches zero or more consecutive occurrences of what expression matches. expression can be a single character or collating symbol, a subexpression, or a backreference. For example, [ab]* matches ab and ababab; b*cd matches characters 3 through 7 of cabbbcdeb.

expression\{m\}
    Matches exactly m occurrences of what expression matches. expression can be a single character or collating symbol, a subexpression, or a backreference. For example, c\{3\} matches characters 5 through 7 of ababccccd (the first 3 c characters only).

expression\{m,\}
    Matches at least m occurrences of what expression matches. expression can be a single character or collating symbol, a subexpression, or a backreference. For example, \(ab\)\{3,\} matches abababab, but does not match ababac.

expression\{m,u\}
    Matches any number of occurrences, between m and u inclusive, of what expression matches. expression can be a single character or collating symbol, a subexpression, or a backreference. For example, bc\{1,3\} matches characters 2 through 4 of abccd and characters 3 through 6 of abbcccccd.

^expression
    Matches only sequences that match expression and that start at the first character of a string, or after a new-line character if the REG_NEWLINE flag was specified for the regcomp function. For example, ^ab matches ab in the string abcdef, but does not match it in the string cdefab. The expression can be the entire RE or any subexpression of it.
    When ^ is the first character of a subexpression, other implementations could interpret it as a literal character. To ensure portability, avoid using ^ at the beginning of a subexpression; to use it as a literal character, precede it with a backslash.

expression$
    Matches only sequences that match expression and that end the string, or that precede a new-line character if the REG_NEWLINE flag was specified for the regcomp function. For example, ab$ matches ab in the string cdefab, but does not match it in the string abcdef. The expression must be the entire RE.
    When $ is the last character of a subexpression, it is treated as a literal character. Other implementations could interpret it as an anchor, as described above. To ensure portability, avoid using $ at the end of a subexpression; to use it as a literal character, precede it with a backslash.

^expression$
    Matches only an entire string, or an entire line if the REG_NEWLINE flag was specified for the regcomp function. For example, ^abcde$ matches only abcde.
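
The character positions quoted in the examples above can be checked with a short C sketch built on the POSIX regcomp, regexec, and regfree functions. The helper name bre_find and its 1-based position convention are assumptions made for this illustration; they are not part of any library. The sketch also assumes single-byte characters, so that the byte offsets reported by regexec correspond to character positions.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper (illustration only, not a library function):
 * compile `pattern` as a Basic Regular Expression and search `text`
 * for the first match.
 *
 * cflags is passed to regcomp(); use 0 or REG_NEWLINE. Do not pass
 * REG_EXTENDED (the pattern would no longer be a BRE) or REG_NOSUB
 * (match offsets would not be reported).
 *
 * Returns  1 on a match, storing 1-based inclusive positions in
 *            *first and *last (when those pointers are non-NULL),
 *          0 when nothing matches,
 *         -1 when regcomp() rejects the pattern.
 */
int bre_find(const char *pattern, const char *text, int cflags,
             int *first, int *last)
{
    regex_t re;
    regmatch_t m[1];
    int rc;

    assert(pattern != NULL && text != NULL);

    /* Without REG_EXTENDED, regcomp() uses Basic Regular Expression syntax. */
    if (regcomp(&re, pattern, cflags) != 0)
        return -1;

    rc = regexec(&re, text, 1, m, 0);
    regfree(&re);
    if (rc != 0)
        return 0;

    /* rm_so is the 0-based start offset; rm_eo is one past the end. */
    if (first != NULL)
        *first = (int)m[0].rm_so + 1;
    if (last != NULL)
        *last = (int)m[0].rm_eo;
    return 1;
}
```

With this helper, a call such as bre_find("b*cd", "cabbbcdeb", 0, &first, &last) should report positions 3 and 7, and bre_find("^ab", "cd\nab", REG_NEWLINE, &first, &last) should report positions 4 and 5, because REG_NEWLINE lets ^ match after the new-line character. Note that each backslash in a pattern must be doubled inside a C string literal, so \(ab\)\1 is written "\\(ab\\)\\1".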


