HTML and CSS Reference
Quantifiers
A period or a literal character by itself always matches exactly one character. However, you can append a
quantifier to it to indicate that the character may appear a variable number of times.
Zero or One: ?
A normal character suffixed with a question mark is optional: it appears zero times or once. For example,
the regular expression a?b matches ab and b. The a is optional. The regular expression a?b?c? matches
abc, ab, bc, ac, a, b, and c, as well as the empty string.
You can suffix a period with a question mark to indicate that any character may or may not appear. For
example, the regular expression 200.? matches 200 as well as 2000, 2001, 200Z, and 200!.
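The optional-character examples above can be checked with a short sketch. It assumes Python's re module and uses re.fullmatch so that a pattern must account for the whole string, which is how "matches" is used in this section. The sample strings and variable names are illustrative choices, not part of the original text.

```python
import re

# a?b: the a is optional, the b is required.
optional_a = [s for s in ["ab", "b", "aab", "a"] if re.fullmatch(r"a?b", s)]
# -> ['ab', 'b']

# a?b?c?: every character is optional, so the empty string matches too.
candidates = ["abc", "ab", "bc", "ac", "a", "b", "c", "", "ba", "abcc"]
optional_abc = [s for s in candidates if re.fullmatch(r"a?b?c?", s)]
# -> everything except 'ba' (wrong order) and 'abcc' (two c's)

# 200.?: at most one extra character of any kind after 200.
optional_any = [s for s in ["200", "2000", "200Z", "200!", "20011"]
                if re.fullmatch(r"200.?", s)]
# -> ['200', '2000', '200Z', '200!']
```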
Zero or More: *
An asterisk (*) suffix indicates that the preceding character appears zero or more times. For example,
a*b matches ab, aaab, aaaaab, and b. However, it does not match abb or acb.
You can put an asterisk after a period to indicate that any number of any characters may appear. For example,
a.*b matches ab, aaab, aaaaab, abb, acb, a123b, and "a quick brown fox jumped into the tub".
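As a rough check of the two asterisk patterns, the sketch below runs both against one list of sample strings. It assumes Python's re module with whole-string matching through re.fullmatch. The list name samples is an illustrative choice.

```python
import re

samples = ["b", "ab", "aaab", "abb", "acb", "a123b",
           "a quick brown fox jumped into the tub"]

# a*b: any number of a's (including none), then exactly one b.
star = [s for s in samples if re.fullmatch(r"a*b", s)]
# -> ['b', 'ab', 'aaab']

# a.*b: an a, then any run of characters (possibly empty), then a b.
dot_star = [s for s in samples if re.fullmatch(r"a.*b", s)]
# -> every sample except 'b', which lacks the leading a
```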
Unlike the asterisk in a UNIX shell glob, a regular-expression asterisk does not match anything by itself.
It must follow something else. For example, to list all the HTML files in the current working directory,
you'd usually type something such as this:
$ ls *html
However, in most regular-expression dialects, the regular expression that matches all strings ending in
html is .*html. Without the initial period, *html is a syntax error.
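The difference between a glob and a regular expression can be sketched in Python, assuming its fnmatch module as a stand-in for shell globbing and its re module as one example of a regular-expression dialect. The file names are made up for illustration. Other dialects may treat a leading asterisk differently.

```python
import fnmatch
import re

names = ["index.html", "style.css", "html", "notes.htm"]

# Glob: the asterisk by itself stands for any run of characters.
glob_hits = [n for n in names if fnmatch.fnmatchcase(n, "*html")]

# Regular expression: the asterisk must follow something, here a period.
regex_hits = [n for n in names if re.fullmatch(r".*html", n)]

# A leading asterisk has nothing to repeat, so Python's re rejects it.
try:
    re.compile(r"*html")
    leading_star_ok = True
except re.error:
    leading_star_ok = False

print(glob_hits, regex_hits, leading_star_ok)
```

Both lists contain index.html and html, and the bare *html pattern fails to compile.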
One or More: +
A plus sign (+) suffix indicates that a character appears one or more times. For example, a+b matches
ab, aaab, and aaaaab. However, it does not match a single b, abb, or acb.
Of course, a plus sign after a period indicates that one or more of any character is required. The regular
expression a.+b requires at least one character between the a and the b, so it matches aaab, aaaaab,
abb, acb, and a123b but not a simple ab.
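The plus-sign examples can be verified the same way. This is a minimal sketch assuming Python's re module and whole-string matching with re.fullmatch. The list name samples is illustrative.

```python
import re

samples = ["b", "ab", "aaab", "aaaaab", "abb", "acb", "a123b"]

# a+b: at least one a, then exactly one b.
plus = [s for s in samples if re.fullmatch(r"a+b", s)]
# -> ['ab', 'aaab', 'aaaaab']

# a.+b: at least one character of any kind between the a and the b.
dot_plus = [s for s in samples if re.fullmatch(r"a.+b", s)]
# -> ['aaab', 'aaaaab', 'abb', 'acb', 'a123b']
```

Compared with a*b and a.*b, the only strings lost are those that relied on zero repetitions: a single b for a+b, and ab for a.+b.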
A Specific Number of Times: {}
You can specify that a character must appear an exact number of times using curly braces. For example,
a{3} is the same as the pattern aaa. It stands for exactly three a's in a row.
You can also specify a range of possible occurrences using a comma. A{3,5} allows three to five A's in a row.
That is, it matches AAA, AAAA, and AAAAA but not AA or AAAAAA.
You can omit the second, maximum value to indicate that at least a certain number of repetitions is required
but more are allowed. For example, a{3,} matches aaa, aaaa, aaaaa, aaaaaa, and any larger sequence of a's.
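The three curly-brace forms can be compared side by side. The sketch below assumes Python's re module and whole-string matching, and uses a lowercase a for all three patterns. It records the lengths of the runs each pattern accepts. The variable names are illustrative.

```python
import re

# Runs of one to seven a's: 'a', 'aa', ..., 'aaaaaaa'.
runs = ["a" * n for n in range(1, 8)]

# a{3}: exactly three repetitions.
exactly_three = [len(s) for s in runs if re.fullmatch(r"a{3}", s)]
# -> [3]

# a{3,5}: between three and five repetitions, inclusive.
three_to_five = [len(s) for s in runs if re.fullmatch(r"a{3,5}", s)]
# -> [3, 4, 5]

# a{3,}: three or more repetitions, with no upper bound.
at_least_three = [len(s) for s in runs if re.fullmatch(r"a{3,}", s)]
# -> [3, 4, 5, 6, 7]
```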
Table A.4 shows some more examples.