Java Reference
In-Depth Information
in the match, which is at m.start()-1. The method then appends the replacement string, "goat", to
newJoke.
After the loop finishes, the appendTail() method copies characters from joke to newJoke, starting with
the character following the last match, at m.end(), through to the end of joke. Thus, you end up with a
new string that is the same as the original except that each instance of "dog" is replaced by "goat".
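The loop described above can be sketched roughly as follows. This is a minimal illustration rather than
the original listing: the class name ReplaceDemo, the helper replaceEach(), and the text assigned to joke
are stand-ins chosen for this sketch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReplaceDemo {
    // Hypothetical helper: replaces each match of regEx in text, one match at a time.
    static String replaceEach(String text, String regEx, String replacement) {
        Matcher m = Pattern.compile(regEx).matcher(text);
        StringBuffer newText = new StringBuffer();
        while (m.find()) {
            // Copies the characters between the previous match and this one,
            // up to index m.start()-1, and then appends the replacement.
            m.appendReplacement(newText, replacement);
        }
        // Copies the remaining characters, from the end of the last match onward.
        m.appendTail(newText);
        return newText.toString();
    }

    public static void main(String[] args) {
        // Stand-in text; any string containing "dog" works here.
        String joke = "My dog hasn't got any nose. How does your dog smell then?";
        String newJoke = replaceEach(joke, "dog", "goat");
        System.out.println(newJoke);
    }
}
```

Note that appendReplacement() gives the characters $ and \ a special meaning in the replacement string, so
a replacement containing either of them would need Matcher.quoteReplacement() to be taken literally.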
You can use the search and replace capability to solve some string manipulation problems very easily. For
example, if you want to make sure that any sequence of one or more whitespace characters is replaced
by a single space, you can define the regular expression as "\\s+" and the replacement string as a single
space, " ". To eliminate all whitespace at the beginning of each line, you can use the expression "^\\s+"
and define the replacement string as empty, "". You must specify Pattern.MULTILINE as the flag for the
compile() method for this to work, so that ^ matches at the start of every line rather than only at the
start of the input.
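As a brief sketch of these two replacements, the following uses the replaceAll() method of the Matcher
class. The class name WhitespaceCleanup and its two method names are invented for this illustration.

```java
import java.util.regex.Pattern;

class WhitespaceCleanup {
    // Replaces every run of one or more whitespace characters with a single space.
    static String collapseWhitespace(String text) {
        return Pattern.compile("\\s+").matcher(text).replaceAll(" ");
    }

    // Removes whitespace at the start of each line; MULTILINE lets ^ match after line terminators.
    static String stripLeadingWhitespace(String text) {
        return Pattern.compile("^\\s+", Pattern.MULTILINE).matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(collapseWhitespace("too   many \t spaces"));
        System.out.println(stripLeadingWhitespace("   first\n\t second\nthird"));
    }
}
```

Because \s also matches line terminators, "^\\s+" removes blank lines along with the leading whitespace
that follows them. If blank lines should be kept, an expression such as "^[ \\t]+" is one alternative.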
Using Capturing Groups
Earlier you used the group() method for a Matcher object to retrieve the subsequence matched by the entire
pattern defined by the regular expression. The entire pattern represents what is called a capturing group
because the Matcher object captures the subsequence corresponding to the pattern match. Regular expressions
can also define other capturing groups that correspond to parts of the pattern. Each pair of plain
parentheses in a regular expression defines a separate capturing group in addition to the group that the
whole expression defines. In the earlier example, you defined the regular expression by the following
statement:
String regEx = "[+|-]?(\\d+(\\.\\d*)?)|(\\.\\d+)";
This defines three capturing groups other than the whole expression: one for the subexpression
(\\d+(\\.\\d*)?), one for the subexpression (\\.\\d*), and one for the subexpression (\\.\\d+). The
Matcher object stores the subsequence that matches the pattern defined by each capturing group, and what's
more, you can retrieve them.
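To see what the Matcher object stores for this expression, you could try something like the following
sketch. It relies on the groupCount() method and the group(int) method of the Matcher class; the class
name GroupDemo, the helper groupsOfFirstMatch(), and the sample input are assumptions made for this
illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
    // Hypothetical helper: returns the text captured by every group for the first match,
    // with the whole match at index 0, or an empty array when nothing matches.
    static String[] groupsOfFirstMatch(String regEx, String text) {
        Matcher m = Pattern.compile(regEx).matcher(text);
        if (!m.find()) {
            return new String[0];
        }
        // groupCount() does not count the group for the whole expression.
        String[] groups = new String[m.groupCount() + 1];
        for (int i = 0; i <= m.groupCount(); i++) {
            groups[i] = m.group(i);   // null if the group took no part in the match
        }
        return groups;
    }

    public static void main(String[] args) {
        String regEx = "[+|-]?(\\d+(\\.\\d*)?)|(\\.\\d+)";
        String[] groups = groupsOfFirstMatch(regEx, "-12.5");
        for (int i = 0; i < groups.length; i++) {
            System.out.println("Group " + i + ": " + groups[i]);
        }
    }
}
```

For the input "-12.5" the last group is null, because the match comes from the first alternative in the
expression and the subexpression (\\.\\d+) is never used.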
To retrieve the text matching a particular capturing group, you need a way to identify the capturing group
that you are interested in. To this end, capturing groups are numbered. The capturing group for the whole
regular expression is always number 0. The other groups are numbered by counting their opening parentheses
from the left in the regular expression. Thus, the first opening parenthesis from the left corresponds to
capturing group 1, the second corresponds to capturing group 2, and so on for as many opening parentheses
as there are in the whole expression. Figure 15-5 illustrates how the groups are numbered in an arbitrary
regular expression.
FIGURE 15-5
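The numbering rule can also be checked with a small experiment. The expression and input below are
arbitrary choices for this sketch and are not the ones shown in the figure; the class name GroupNumbering
and the method describeGroups() are likewise invented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupNumbering {
    // Hypothetical helper: lists each group number with the text it captured,
    // provided the whole of text matches regEx.
    static List<String> describeGroups(String regEx, String text) {
        List<String> lines = new ArrayList<>();
        Matcher m = Pattern.compile(regEx).matcher(text);
        if (m.matches()) {
            for (int i = 0; i <= m.groupCount(); i++) {
                lines.add("group " + i + ": " + m.group(i));
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // Opening parentheses, counted from the left, give groups 1, 2, 3, and 4.
        String regEx = "((\\w+)@(\\w+))\\.(\\w+)";
        for (String line : describeGroups(regEx, "alice@example.com")) {
            System.out.println(line);
        }
    }
}
```

Here group 1 is the outer pair of parentheses, so it captures "alice@example", while groups 2 and 3 are
nested inside it and capture "alice" and "example". Group 4 captures "com".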
 
 