Tools.h++ Manual

21-68 104011 Tandem Computers Incorporated
RWCRegexp
Synopsis
#include <rw/regexp.h>
RWCRegexp re(".*\\.doc"); // Matches filename with suffix ".doc"
Description
Class RWCRegexp represents a regular expression. The constructor "compiles"
the expression into a form that can be used more efficiently. The results can
then be used for string searches using class RWCString.
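The compile-once, search-many pattern described above can be sketched with the standard library. This is only an illustration under stated assumptions: std::regex stands in for RWCRegexp, std::string stands in for RWCString, and count_doc_files is a hypothetical helper, not part of Tools.h++. Note that the backslash must be doubled in C++ source so that the pattern itself contains an escaped period.

```cpp
#include <regex>
#include <string>
#include <vector>

// Assumption: std::regex is used as a stand-in for RWCRegexp.
// count_doc_files is a hypothetical helper written for this sketch.
int count_doc_files(const std::vector<std::string>& names) {
    // "\\." in C++ source puts the two characters \. into the pattern,
    // which is an escaped (literal) period. The pattern is compiled
    // once and reused for every name.
    static const std::regex re(".*\\.doc");

    int count = 0;
    for (const std::string& name : names) {
        // regex_match requires the whole name to match the pattern.
        if (std::regex_match(name, re)) {
            ++count;
        }
    }
    return count;
}
```

With the names "report.doc", "readme.txt", "notes.doc", and "xdoc", the helper counts two matches. "xdoc" is rejected because the escaped period must match a literal ".".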
The regular expression (RE) is constructed as follows:
The following rules determine one-character REs that match a single character:
1.1 Any character that is not a special character (to be defined) matches
itself.
1.2 A backslash (\) followed by any special character matches the literal
character itself (i.e., this "escapes" the special character).
1.3 The "special characters" are:
+*?.[]^$
1.4 The period (.) matches any character except the newline. E.g., ".umpty"
matches either "Humpty" or "Dumpty".
1.5 A set of characters enclosed in brackets ([ ]) is a one-character RE that
matches any of the characters in that set. E.g., "[akm]" matches either an
"a", "k", or "m". A range of characters can be indicated with a dash. E.g.,
"[a-z]" matches any lower-case letter. However, if the first character of
the set is the caret (^), then the RE matches any character except those in
the set. It does not match the empty string. Example: "[^akm]" matches
any character except "a", "k", or "m". The caret loses its special meaning if
it is not the first character of the set.
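The one-character rules above can be checked with a short sketch. It assumes std::regex (default ECMAScript grammar) as a stand-in for RWCRegexp, since the two agree on these particular rules. matches_whole is a hypothetical helper introduced only for illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns true if the whole of `text` matches `pattern`.
// Assumption: std::regex is used in place of RWCRegexp for this sketch.
bool matches_whole(const std::string& pattern, const std::string& text) {
    // The pattern is compiled on every call, which is fine for a demo.
    return std::regex_match(text, std::regex(pattern));
}
```

Under these assumptions, matches_whole(".umpty", "Humpty") and matches_whole("[akm]", "k") are true, matches_whole("[^akm]", "a") and matches_whole("[^akm]", "") are false, and matches_whole("[a^]", "^") is true because the caret is not the first character of the set. An escaped special character such as "\\+" (written with a doubled backslash in C++ source) matches the literal "+".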
The following rules can be used to build a multicharacter RE.
2.1 A one-character RE followed by an asterisk (*) matches zero or more
occurrences of the RE. Hence, "[a-z]*" matches zero or more lower-case
characters.