A Collection of Useful Classes - Beginning Java


in the match, which is at m.start()-1. The method then appends the replacement string, "goat", to newJoke.

After the loop finishes, the appendTail() method copies characters from joke to newJoke, starting with the character following the last match at m.end() through to the end of joke. Thus, you end up with a new string similar to the original, but which has each instance of "dog" replaced by "goat".
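The loop described above can be sketched roughly as follows. The names joke, newJoke, and m come from the text; the class name, the helper method, and the sample sentence are assumptions made for illustration. A StringBuffer is used here because appendReplacement() has accepted one in every Java version.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReplaceDogs {
    // Replaces every match of regEx in joke using the appendReplacement()/appendTail() loop.
    static String replaceEach(String joke, String regEx, String replacement) {
        Matcher m = Pattern.compile(regEx).matcher(joke);
        StringBuffer newJoke = new StringBuffer();
        while (m.find()) {
            // Copies the text between the previous match and this one,
            // then appends the replacement string.
            m.appendReplacement(newJoke, replacement);
        }
        // Copies everything from m.end() of the last match to the end of joke.
        m.appendTail(newJoke);
        return newJoke.toString();
    }

    public static void main(String[] args) {
        // Sample text is an assumption, not taken from the original example.
        String joke = "My dog hasn't got any nose. How does your dog smell?";
        System.out.println(replaceEach(joke, "dog", "goat"));
    }
}
```

Note that appendReplacement() treats $ and \ in the replacement string specially, so a replacement containing those characters would need Matcher.quoteReplacement().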

You can use the search and replace capability to solve some string manipulation problems very easily. For example, if you want to make sure that any sequence of one or more whitespace characters is replaced by a single space, you can define the regular expression as "\\s+" and the replacement string as a single space, " ". To eliminate all whitespace at the beginning of each line, you can use the expression "^\\s+" and define the replacement string as empty, "". You must specify Pattern.MULTILINE as the flag for the compile() method for this to work, because otherwise ^ matches only at the start of the whole input.
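As a minimal sketch of these two cleanups, the following uses the expressions from the text with Matcher.replaceAll(). The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class WhitespaceCleanup {
    // Replaces each run of one or more whitespace characters with a single space.
    static String collapseWhitespace(String text) {
        return Pattern.compile("\\s+").matcher(text).replaceAll(" ");
    }

    // Removes whitespace at the start of every line; MULTILINE makes ^ match
    // after each line terminator as well as at the start of the input.
    static String stripLeadingWhitespace(String text) {
        return Pattern.compile("^\\s+", Pattern.MULTILINE).matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(collapseWhitespace("a  \t b\n\nc"));
        System.out.println(stripLeadingWhitespace("  one\n\t two\nthree"));
    }
}
```

Because \s also matches line terminators, "^\\s+" absorbs blank lines along with the indentation of the line that follows them. If blank lines must survive, a narrower class such as "^[ \\t]+" may be a better fit.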

Using Capturing Groups

Earlier you used the group() method for a Matcher object to retrieve the subsequence matched by the entire pattern defined by the regular expression. The entire pattern represents what is called a capturing group because the Matcher object captures the subsequence corresponding to the pattern match. Regular expressions can also define other capturing groups that correspond to parts of the pattern. Each pair of parentheses in a regular expression defines a separate capturing group in addition to the group that the whole expression defines. In the earlier example, you defined the regular expression by the following statement:

String regEx = "[+|-]?(\\d+(\\.\\d*)?)|(\\.\\d+)";

This defines three capturing groups other than the whole expression: one for the subexpression (\\d+(\\.\\d*)?), one for the subexpression (\\.\\d*), and one for the subexpression (\\.\\d+). The Matcher object stores the subsequence that matches the pattern defined by each capturing group, and what's more, you can retrieve them.
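To see what the Matcher stores for each of these groups, a small sketch such as the following can help. It uses the regular expression from the statement above unchanged; the class name, helper method, and sample inputs are assumptions. The overloaded group(int) method returns null for a group that took no part in the match.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
    static final Pattern NUMBER = Pattern.compile("[+|-]?(\\d+(\\.\\d*)?)|(\\.\\d+)");

    // Returns groups 0 through groupCount() for the first match in text,
    // or null if the pattern is not found at all.
    static String[] groupsOfFirstMatch(String text) {
        Matcher m = NUMBER.matcher(text);
        if (!m.find()) {
            return null;
        }
        String[] groups = new String[m.groupCount() + 1];
        for (int i = 0; i <= m.groupCount(); i++) {
            groups[i] = m.group(i); // null when group i did not participate
        }
        return groups;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(groupsOfFirstMatch("-12.5")));
        System.out.println(Arrays.toString(groupsOfFirstMatch(".75")));
    }
}
```

For "-12.5" the first alternative matches, so the third group is null; for ".75" only the third group captures anything.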

To retrieve the text matching a particular capturing group, you need a way to identify the capturing group that you are interested in. To this end, capturing groups are numbered. The capturing group for the whole regular expression is always number 0. The other groups are numbered by counting their opening parentheses from the left in the regular expression. Thus, the first opening parenthesis from the left corresponds to capturing group 1, the second corresponds to capturing group 2, and so on for as many opening parentheses as there are in the whole expression. Figure 15-5 illustrates how the groups are numbered in an arbitrary regular expression.

FIGURE 15-5
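The numbering rule can also be checked in code. The pattern below is a hypothetical one chosen only because it nests groups; it is not the expression shown in the figure, and the class and method names are likewise assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupNumbering {
    // Opening parentheses counted from the left give the group numbers:
    //   1: (([a-z]+)(\d+))   2: ([a-z]+)   3: (\d+)   4: (-(\w+))   5: (\w+)
    static final Pattern P = Pattern.compile("(([a-z]+)(\\d+))(-(\\w+))?");

    // Returns capturing group n when the whole of text matches, otherwise null.
    static String group(String text, int n) {
        Matcher m = P.matcher(text);
        return m.matches() ? m.group(n) : null;
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 5; n++) {
            System.out.println(n + ": " + group("abc123-xyz", n));
        }
    }
}
```

An outer group always receives a lower number than the groups nested inside it, because its opening parenthesis comes first.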
