Java Reference
In-Depth Information
Regular Expressions: Special Characters
You look at three types of special characters in this section.
Text, Numbers, and Punctuation
The first group of special characters consists of the character classes. A character class stands
for a category of characters, such as digits, letters, or whitespace characters. The character
classes are listed in the following table:
Character class   Characters it matches                          Example
\d                Any digit from 0 to 9                          \d\d matches 72, but not aa or 7a.
\D                Any character that is not a digit              \D\D\D matches abc, but not 123 or 8ef.
\w                Any word character; that is, A-Z, a-z, 0-9,    \w\w\w\w matches Ab_2, but not £$%* or Ab_@.
                  and the underscore character (_)
\W                Any non-word character                         \W matches @, but not a.
\s                Any whitespace character                       \s matches tab, return, formfeed, and vertical tab.
\S                Any non-whitespace character                   \S matches A, but not the tab character.
.                 Any single character other than the newline    . matches a or 4 or @.
                  character (\n)
[...]             Any one of the characters between the          [abc] matches a or b or c, but nothing else.
                  brackets                                       [a-z] matches any character in the range a to z.
[^...]            Any one character, but not one of those        [^abc] matches any character except a or b or c.
                  inside the brackets                            [^a-z] matches any character that is not in the range a to z.
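The examples in the table can be checked with a short program. The following is a minimal sketch that assumes Java's java.util.regex package; the class name CharacterClassDemo and the helper fullMatch are made up for illustration. Note that in a Java string literal each backslash has to be doubled, so \d is written as "\\d".

```java
import java.util.regex.Pattern;

class CharacterClassDemo {
    // Hypothetical helper: true only when the whole input matches the regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(fullMatch("\\d\\d", "72"));          // true
        System.out.println(fullMatch("\\d\\d", "7a"));          // false
        System.out.println(fullMatch("\\D\\D\\D", "abc"));      // true
        System.out.println(fullMatch("\\w\\w\\w\\w", "Ab_2"));  // true
        System.out.println(fullMatch("\\w\\w\\w\\w", "Ab_@"));  // false
        System.out.println(fullMatch("\\W", "@"));              // true
        System.out.println(fullMatch("\\s", "\t"));             // true
        System.out.println(fullMatch("\\S", "\t"));             // false
        System.out.println(fullMatch(".", "@"));                // true
        System.out.println(fullMatch("[abc]", "b"));            // true
        System.out.println(fullMatch("[^a-z]", "q"));           // false
    }
}
```

Pattern.matches tests the entire input against the pattern, which is why 7a fails \d\d even though it contains a digit.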
Note that the uppercase and lowercase forms of a character class have opposite meanings (\d matches
a digit, whereas \D matches anything that is not a digit), so you need to be extra careful with case
when using regular expressions.
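To see how much the case of a character class matters, here is a small sketch, again assuming Java's java.util.regex package. The class name CaseDemo and the helper oppositeResults are hypothetical. For a single character, the lowercase class and its uppercase counterpart give opposite answers.

```java
import java.util.regex.Pattern;

class CaseDemo {
    // Hypothetical helper: true when exactly one of the two classes matches the input.
    static boolean oppositeResults(String lowerClass, String upperClass, String ch) {
        return Pattern.matches(lowerClass, ch) != Pattern.matches(upperClass, ch);
    }

    public static void main(String[] args) {
        System.out.println(Pattern.matches("\\d", "7"));         // true
        System.out.println(Pattern.matches("\\D", "7"));         // false
        System.out.println(oppositeResults("\\d", "\\D", "a"));  // true
        System.out.println(oppositeResults("\\w", "\\W", "_"));  // true
        System.out.println(oppositeResults("\\s", "\\S", " "));  // true
    }
}
```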
Let's look at an example. To match a telephone number in the format 1-800-888-5474, the regular
expression would be as follows:
\d-\d\d\d-\d\d\d-\d\d\d\d
You can see that there's a lot of repetition of characters here, which makes the expression quite
unwieldy. To make this simpler, regular expressions have a way of defining repetition. You see this a
little later in the chapter, but first let's look at another example.
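The telephone number pattern shown earlier can be tried out as written. This is a minimal sketch assuming Java's java.util.regex package; the class name PhoneNumberDemo and the method isPhoneNumber are invented for the example, and the backslashes are doubled because the pattern sits inside a Java string literal.

```java
import java.util.regex.Pattern;

class PhoneNumberDemo {
    // The pattern \d-\d\d\d-\d\d\d-\d\d\d\d, compiled once and reused.
    private static final Pattern PHONE =
            Pattern.compile("\\d-\\d\\d\\d-\\d\\d\\d-\\d\\d\\d\\d");

    // Hypothetical helper: true when the whole string has the form 1-800-888-5474.
    static boolean isPhoneNumber(String text) {
        return PHONE.matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(isPhoneNumber("1-800-888-5474"));  // true
        System.out.println(isPhoneNumber("1-800-888-547"));   // false (last group too short)
        System.out.println(isPhoneNumber("1-800-88a-5474"));  // false (letter where a digit is expected)
    }
}
```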
 