Character Classes
Bracketed expressions enable you to define your own character classes that match some characters and not
others. Just place the characters you want to match inside square brackets.
For example, suppose you want to search for all hexadecimal digits. You can easily enumerate those characters
as [0123456789abcdefABCDEF]. This matches any one of those characters. Then, to match any potential
hexadecimal number, which will include one or more of these characters, you suffix the brackets with a plus sign
to form this regular expression:
[0123456789abcdefABCDEF]+
Of course, this also matches words composed purely of these 22 characters, such as Decaf and fed.
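As a minimal sketch of this behavior, the following uses Python's re module (an assumption; the text does not name a regex engine). The word list and the names HEX_RUN and hex_like are illustrative only.

```python
import re

# The enumerated character class from the text, repeated one or more times.
HEX_RUN = re.compile(r"[0123456789abcdefABCDEF]+")

# Hypothetical sample words; fullmatch requires the whole word to match.
words = ["FF8800", "Decaf", "fed", "0x1F", "coffee"]
hex_like = [w for w in words if HEX_RUN.fullmatch(w)]

print(hex_like)  # ['FF8800', 'Decaf', 'fed']
```

"0x1F" and "coffee" are rejected because x and o are not in the class, while Decaf and fed pass even though they are ordinary words.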
This is simple enough, but enumerating all the characters can be tedious. Sometimes what you want is a range.
Use a hyphen between two characters to indicate every character whose code lies between them, inclusive. For
example, [a-z] matches any lowercase ASCII letter, and [A-Z] matches any uppercase ASCII letter.
You can combine ranges. [a-zA-Z] matches any upper- or lowercase letter. For example, we can match
hexadecimal numbers a little more simply as:
[0-9a-fA-F]+
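The sketch below checks that the ranged form accepts the same ASCII characters as the enumerated form. It assumes Python's re module, and the sample strings are made up for illustration.

```python
import re

ENUMERATED = re.compile(r"[0123456789abcdefABCDEF]+")
RANGED = re.compile(r"[0-9a-fA-F]+")

# Hypothetical samples, including an empty string (the + needs at least one character).
samples = ["7f", "CAFE", "12ab", "xyz", "G7", ""]
enumerated_hits = [s for s in samples if ENUMERATED.fullmatch(s)]
ranged_hits = [s for s in samples if RANGED.fullmatch(s)]

# Compare the two classes character by character over the ASCII range.
same_on_ascii = all(
    bool(ENUMERATED.fullmatch(chr(code))) == bool(RANGED.fullmatch(chr(code)))
    for code in range(128)
)

print(ranged_hits)    # ['7f', 'CAFE', '12ab']
print(same_on_ascii)  # True
```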
You can negate a character set or range by placing a caret, ^, immediately after the opening bracket. For
example, [^a-z] matches any character except a lowercase ASCII letter. [^a-zA-Z] matches any character
except a lower- or uppercase ASCII letter.
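A short illustration of negation, again assuming Python's re module; the sample text is hypothetical. Note that a negated class still matches exactly one character, and that it also matches characters such as spaces, digits, and non-ASCII letters.

```python
import re

NOT_LOWER = re.compile(r"[^a-z]")      # anything except a lowercase ASCII letter
NOT_LETTER = re.compile(r"[^a-zA-Z]")  # anything except an ASCII letter

text = "Room 101b!"
not_lower = NOT_LOWER.findall(text)
not_letter = NOT_LETTER.findall(text)

print(not_lower)   # ['R', ' ', '1', '0', '1', '!']
print(not_letter)  # [' ', '1', '0', '1', '!']

# A non-ASCII lowercase letter (e with acute accent) lies outside a-z,
# so the negated class matches it.
matches_accented = NOT_LOWER.fullmatch("\u00e9") is not None
print(matches_accented)  # True
```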
Warning
Ranges are determined by character value, as measured in ASCII or Unicode. This works pretty much as
you expect within any one obvious range. However, beware of ranges that cross script, case, or type
boundaries, such as [a-Z], [0-F], or [A- O ]. These almost certainly don't do what you want or expect.
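To make the warning concrete, this sketch lists what [0-F] actually matches in ASCII and shows how one engine treats [a-Z]. It assumes Python's re module; other regex engines may handle a reversed range differently.

```python
import re

ZERO_TO_F = re.compile(r"[0-F]")

# Every ASCII character the range matches, in code order.
in_range = [chr(code) for code in range(128) if ZERO_TO_F.fullmatch(chr(code))]

# Characters that are in the range but are not uppercase hex digits.
unexpected = [c for c in in_range if c not in "0123456789ABCDEF"]
print(unexpected)  # [':', ';', '<', '=', '>', '?', '@']

# In ASCII, 'a' (97) comes after 'Z' (90), so [a-Z] is a reversed range.
# Python's re module rejects it when the pattern is compiled.
try:
    re.compile(r"[a-Z]")
    reversed_range_rejected = False
except re.error:
    reversed_range_rejected = True
print(reversed_range_rejected)  # True
```

The range [0-F] also misses lowercase a through f entirely, so it is both too broad and too narrow for hexadecimal digits.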
Table A.5 shows some more examples.
Table A.5. Character Classes (a.k.a. bracketed expressions)
Pattern            Matches              Example
</[a-zA-Z1-6]+>    All HTML end-tags    </p>, </TABLE>, </Span>
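The end-tag pattern from Table A.5 can be tried out as follows. This is a sketch assuming Python's re module; the HTML fragment and the names END_TAG and end_tags are illustrative.

```python
import re

# End-tag pattern from Table A.5: "</", one or more letters or digits 1-6, then ">".
END_TAG = re.compile(r"</[a-zA-Z1-6]+>")

# Hypothetical fragment mixing lowercase, uppercase, and mixed-case tags.
html = "<p>Hi</p><TABLE></TABLE><Span>x</Span><h2>T</h2>"
end_tags = END_TAG.findall(html)

print(end_tags)  # ['</p>', '</TABLE>', '</Span>', '</h2>']

# The pattern makes no allowance for whitespace before the closing ">".
spaced_tag_found = END_TAG.search("</p >") is not None
print(spaced_tag_found)  # False
```

Start-tags are skipped because the pattern requires the slash, and the digits 1-6 are what let heading end-tags such as </h2> match.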
 