Regexes match anyplace possible in the string. Patterns followed by greedy quantifiers (the
only type that existed in traditional Unix regexes) consume (match) as much as possible
without compromising any subexpressions that follow; patterns followed by possessive
quantifiers match as much as possible without regard to following subexpressions; patterns
followed by reluctant quantifiers consume as few characters as possible to still get a match.
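
To make the three quantifier types concrete, here is a minimal sketch that applies each one to the same input. The class name QuantifierDemo and the helper firstMatch are hypothetical names chosen for this illustration; only java.util.regex.Pattern and Matcher come from the standard library.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of REDemo or the darwinsys-api code.
class QuantifierDemo {

    /** Returns the first substring of input that regex matches, or null if there is none. */
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "<b>bold</b>";

        // Greedy: .+ takes as much as it can, then backs off just enough
        // for the final > to match.
        System.out.println(firstMatch("<.+>", input));   // <b>bold</b>

        // Reluctant: .+? takes as little as it can while still allowing a match.
        System.out.println(firstMatch("<.+?>", input));  // <b>

        // Possessive: .++ takes everything and never gives any back,
        // so the final > has nothing left to match.
        System.out.println(firstMatch("<.++>", input));  // null
    }
}
```

The possessive case finds nothing because the quantifier swallows the closing > and refuses to backtrack, which is exactly the "without regard to following subexpressions" behavior described above.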
Also, unlike regex packages in some other languages, the Java regex package was designed
to handle Unicode characters from the beginning. The standard Java escape sequence
\unnnn is used to specify a Unicode character in the pattern. We use methods of
java.lang.Character to determine Unicode character properties, such as whether a given
character is a space. Again, note that the backslash must be doubled if this is in a Java string
that is being compiled and you want the regex package to see the escape, because the compiler
would otherwise translate the “backslash-u” sequence and its four hex digits into the
character itself before the pattern is ever compiled.
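
As a small illustration of the Unicode points above, the following sketch passes a doubled-backslash Unicode escape through to the regex package and queries a couple of character properties with java.lang.Character. The class name UnicodeEscapeDemo and the helper matchesWhole are hypothetical, and the sample characters (U+00E9 and U+2003) are simply ones I picked for the demonstration.

```java
import java.util.regex.Pattern;

// Hypothetical demo class for illustration only.
class UnicodeEscapeDemo {

    /** Returns true if the entire input matches regex. */
    static boolean matchesWhole(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Build the input without any escape: "caf" plus U+00E9 (e with acute accent).
        String input = "caf" + (char) 0x00E9;

        // The doubled backslash survives compilation as a single backslash,
        // so the regex package receives the escape and interprets it itself.
        System.out.println(matchesWhole("caf\\u00e9", input));   // true
        System.out.println(matchesWhole("caf\\u00e9", "cafe"));  // false

        // Character properties come from java.lang.Character.
        System.out.println(Character.isWhitespace((char) 0x2003)); // true: EM SPACE
        System.out.println(Character.isLetter((char) 0x00E9));     // true
    }
}
```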
To help you learn how regexes work, I provide a little program called REDemo. [18] The code
for REDemo is too long to include here; in the online directory regex of the darwinsys-api
repo, you will find REDemo.java, which you can run to explore how regexes work.
In the uppermost text box (see Figure 4-1), type the regex pattern you want to test. Note that
as you type each character, the regex is checked for syntax; if the syntax is OK, you see a
checkmark beside it. You can then select Match, Find, or Find All. Match means that the
entire string must match the regex, and Find means the regex must be found somewhere in the
string (Find All counts the number of occurrences that are found). Below that, you type a
string that the regex is to match against. Experiment to your heart's content. When you have
the regex the way you want it, you can paste it into your Java program. You'll need to escape
(backslash) any characters that are treated specially by both the Java compiler and the Java
regex package, such as the backslash itself, double quotes, and others (see the following
sidebar). Once you get a regex the way you want it, there is a “Copy” button (not shown in these
screenshots) to export the regex to the clipboard, with or without backslash doubling
depending on how you want to use it.
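
The three modes correspond naturally to calls in java.util.regex. The sketch below is my assumption about that mapping, not REDemo's actual code: Match is modeled with Matcher.matches(), Find with Matcher.find(), and Find All with a loop that counts successful find() calls. The class name MatchFindDemo and its method names are hypothetical. The pattern also shows the backslash doubling needed once a regex is pasted into a Java string literal.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; an assumed mapping of the three modes, not REDemo itself.
class MatchFindDemo {

    /** "Match": the entire input must match the regex. */
    static boolean match(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    /** "Find": the regex must match somewhere in the input. */
    static boolean find(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    /** "Find All": counts the non-overlapping occurrences in the input. */
    static int findAll(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // In a regex tester you would type \d+ ; in Java source the backslash is doubled.
        String regex = "\\d+";
        String input = "order 66 shipped 2 days ago";

        System.out.println(match(regex, input));    // false: the input is not all digits
        System.out.println(find(regex, input));     // true: "66" is found
        System.out.println(findAll(regex, input));  // 2: "66" and "2"
        System.out.println(match(regex, "12345"));  // true
    }
}
```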
 