Regular Expressions: Special Characters
You will be looking at three types of special characters in this section.
Text, Numbers, and Punctuation
The first group of special characters you'll look at is the character classes. A character class
stands for a whole category of characters, such as digits, letters, or whitespace characters. The
special characters are displayed in the following table.
Character Class  Characters It Matches                     Example
---------------  ----------------------------------------  -------------------------------------------
\d               Any digit from 0 to 9                     \d\d matches 72, but not aa or 7a
\D               Any character that is not a digit         \D\D\D matches abc, but not 123 or 8ef
\w               Any word character; that is, A-Z, a-z,    \w\w\w\w matches Ab_2, but not £$%* or Ab_@
                 0-9, and the underscore character (_)
\W               Any non-word character                    \W matches @, but not a
\s               Any whitespace character                  \s matches tab, return, formfeed, and
                                                           vertical tab
\S               Any non-whitespace character              \S matches A, but not the tab character
.                Any single character other than the       . matches a or 4 or @
                 newline character (\n)
[...]            Any one of the characters between the     [abc] will match a or b or c, but nothing
                 brackets                                  else
                                                           [a-z] will match any character in the
                                                           range a to z
[^...]           Any one character, but not one of those   [^abc] will match any character except
                 inside the brackets                       a or b or c
                                                           [^a-z] will match any character that is
                                                           not in the range a to z
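As a rough illustration of the table, the following sketch checks a few of its examples with Java's java.util.regex package. The class and method names (CharacterClassDemo, fullyMatches) are made up for this example. Note that inside a Java string literal each backslash in a pattern has to be doubled.

```java
import java.util.regex.Pattern;

final class CharacterClassDemo {

    // Returns true only when the whole input matches the regular expression.
    static boolean fullyMatches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(fullyMatches("\\d\\d", "72"));         // true: two digits
        System.out.println(fullyMatches("\\d\\d", "7a"));         // false: a is not a digit
        System.out.println(fullyMatches("\\D\\D\\D", "abc"));     // true: three non-digits
        System.out.println(fullyMatches("\\w\\w\\w\\w", "Ab_2")); // true: four word characters
        System.out.println(fullyMatches("\\w\\w\\w\\w", "Ab_@")); // false: @ is not a word character
        System.out.println(fullyMatches("\\s", "\t"));            // true: tab is whitespace
        System.out.println(fullyMatches(".", "\n"));              // false: . skips the newline
        System.out.println(fullyMatches("[abc]", "b"));           // true: b is in the set
        System.out.println(fullyMatches("[^a-z]", "q"));          // false: q is in the excluded range
    }
}
```

Pattern.matches tests the entire input, which suits these single-token checks. To search for a pattern inside a longer string, Matcher.find would be the usual choice.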
Note that the uppercase and lowercase forms of a character class have opposite meanings: \d matches
a digit, whereas \D matches anything that is not a digit. You therefore need to be extra careful
with case when using regular expressions.
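To make the case distinction concrete, this small Java sketch tests single characters against a lowercase class and its uppercase counterpart. The class name CaseMattersDemo and its helper exactlyOneMatches are hypothetical, chosen only for this illustration. For any one character, exactly one of the pair should match.

```java
import java.util.regex.Pattern;

final class CaseMattersDemo {

    // True when exactly one of the two patterns matches the single-character input.
    static boolean exactlyOneMatches(String lower, String upper, String ch) {
        return Pattern.matches(lower, ch) ^ Pattern.matches(upper, ch);
    }

    public static void main(String[] args) {
        String[] samples = {"7", "a", "_", "@", " "};
        for (String ch : samples) {
            // Print how the same character fares against \d and against \D.
            System.out.println("'" + ch + "'"
                    + "  digit: " + Pattern.matches("\\d", ch)
                    + "  non-digit: " + Pattern.matches("\\D", ch));
        }
    }
}
```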
Let's look at an example. To match a telephone number in the format 1-800-888-5474, the regular
expression would be as follows:
\d-\d\d\d-\d\d\d-\d\d\d\d
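A minimal sketch of how this expression might be used in Java, assuming the whole input should be a telephone number and nothing else. The names PhoneNumberDemo and isPhoneNumber are invented for illustration. The pattern itself is the one above, with each backslash doubled as Java string literals require.

```java
import java.util.regex.Pattern;

final class PhoneNumberDemo {

    // The expression from the text, compiled once and reused.
    private static final Pattern PHONE =
            Pattern.compile("\\d-\\d\\d\\d-\\d\\d\\d-\\d\\d\\d\\d");

    // matches() requires the entire input to fit the pattern.
    static boolean isPhoneNumber(String input) {
        return PHONE.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isPhoneNumber("1-800-888-5474")); // true
        System.out.println(isPhoneNumber("1-800-888-547"));  // false: last group has only three digits
        System.out.println(isPhoneNumber("1 800 888 5474")); // false: spaces instead of hyphens
    }
}
```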
You can see that there's a lot of repetition of characters here, which makes the expression quite unwieldy.
To make this simpler, regular expressions have a way of defining repetition. You'll see this a little later
in the chapter, but first let's look at another example.