An advantage of Java's implementation of regular expressions is that you can reuse a Pattern object to create Matcher objects to search for the pattern in a variety of strings. To use the same pattern to search another string, you just call the matcher() method for the Pattern object with the new string as the argument. You then have a new Matcher object that you can use to search the new string.
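As a minimal sketch of this reuse, the following compiles a pattern once and then calls matcher() for each string to be searched. The class name PatternReuseSketch, the helper containsHad(), and the regular expression "had" are assumptions chosen for illustration; they are not taken from the surrounding text.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PatternReuseSketch {
    // Compiled once; the same Pattern can create any number of Matcher objects.
    // The regular expression "had" is an illustrative assumption.
    static final Pattern HAD = Pattern.compile("had");

    // Returns true if the pattern occurs anywhere in the given text.
    static boolean containsHad(String text) {
        Matcher m = HAD.matcher(text);   // a new Matcher for this particular string
        return m.find();
    }

    public static void main(String[] args) {
        System.out.println(containsHad("Smith had a haddock."));
        System.out.println(containsHad("Jones ate a kipper."));
    }
}
```

Each call to matcher() produces an independent Matcher object, so the searches do not interfere with one another.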
You can also change the string that a Matcher object is to search by calling its reset() method with a new string as the argument. For example:
matchHad.reset("Had I known, I would not have eaten the haddock.");
This will replace the previous string, sentence, in the Matcher object so it is now capable of searching the new string. Like the matcher() method in the Pattern class, the parameter type for the reset() method is CharSequence, so you can pass a reference of type String, StringBuffer, or java.nio.CharBuffer to it.
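The sketch below illustrates one way a single Matcher object might be pointed at several different CharSequence implementations in turn by calling reset(). The class name ResetSketch, the helper countIn(), and the regular expression "had" are assumptions made for the example. It relies on the find() method, which is described under Searching a String.

```java
import java.nio.CharBuffer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ResetSketch {
    // One Matcher, created with an empty string and reused for every search.
    // The regular expression "had" is an illustrative assumption.
    static final Matcher MATCH_HAD = Pattern.compile("had").matcher("");

    // Replaces the text being searched and counts the matches found in it.
    static int countIn(CharSequence text) {
        MATCH_HAD.reset(text);           // the Matcher now searches the new text
        int count = 0;
        while (MATCH_HAD.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Any CharSequence is acceptable: String, StringBuffer, CharBuffer...
        System.out.println(countIn("Had I known, I would not have eaten the haddock."));
        System.out.println(countIn(new StringBuffer("had had had")));
        System.out.println(countIn(CharBuffer.wrap("no match here")));
    }
}
```

Note that matching is case sensitive by default, so "Had" at the start of the first sentence is not counted. A Matcher object is not safe for use by multiple threads, so sharing one in a static field as shown here is only suitable for single-threaded code.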
Searching a String
Now that we have a Matcher object, we can use it to search the string. Calling the find() method for the Matcher object will search the string for the next occurrence of the pattern. If it is found, the method stores information about where it was found in the Matcher object and returns true. If it is not found, it returns false. When the pattern has been found, calling the start() method for the Matcher object returns the index position in the string where the first character in the pattern was found. Calling the end() method returns the index position following the last character in the pattern. Both index values are returned as type int. You could therefore search for the first occurrence of the pattern like this:
if (m.find()) {
  System.out.println("Pattern found. Start: " + m.start() + " End: " + m.end());
} else {
  System.out.println("Pattern not found.");
}
Note that you must not call start() or end() for the Matcher object before you have succeeded in finding the pattern. Until a pattern has been matched, the Matcher object is in an undefined state and calling either of these methods will result in an exception of type IllegalStateException being thrown.
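To make the rule concrete, here is a minimal sketch that calls start() only after find() has returned true, alongside a second method that deliberately calls start() too early in order to observe the exception. The class name MatchStateSketch and both helper methods are hypothetical names introduced for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchStateSketch {
    // Returns the start index of the first match, or -1 if there is none.
    // start() is only called after find() has returned true.
    static int firstStart(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.start() : -1;
    }

    // Returns true if calling start() before any match has been found
    // throws IllegalStateException.
    static boolean startBeforeFindThrows(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        try {
            m.start();                   // no successful find() yet
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(firstStart("had", "Smith had a haddock."));
        System.out.println(startBeforeFindThrows("had", "Smith had a haddock."));
    }
}
```

The same exception is thrown if you call start() or end() after a call to find() that returned false, so testing the return value of find() first is the safe approach.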
You will usually want to find all occurrences of a pattern in a string. When you call the find() method, searching starts at an index position in the string called the append position and stops either when the pattern is found and the value true is returned, or when the end of the string is reached, in which case the return value is false. The append position is initially zero, corresponding to the beginning of the string, but it gets updated if the pattern is found. Each time the pattern is found, the new append position will be the index position of the character immediately following the last character in the text that matched the pattern. The next call to find() will start searching at this new append position. Thus you can easily find all occurrences of the pattern by searching in a loop like this:
while (m.find()) {
  System.out.println(" Start: " + m.start() + " End: " + m.end());
}
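A self-contained version of this loop might look like the following sketch, which collects the start and end index of every match rather than just printing them. The class name FindAllSketch, the helper findAll(), the regular expression "had", and the sample sentence are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindAllSketch {
    // Returns a {start, end} pair for every non-overlapping match of regex in text.
    static List<int[]> findAll(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        List<int[]> spans = new ArrayList<>();
        while (m.find()) {               // each call resumes after the previous match
            spans.add(new int[] { m.start(), m.end() });
        }
        return spans;
    }

    public static void main(String[] args) {
        String sentence = "Smith had a haddock that had a fishy smell.";
        for (int[] span : findAll("had", sentence)) {
            System.out.println(" Start: " + span[0] + " End: " + span[1]
                + " Text: " + sentence.substring(span[0], span[1]));
        }
    }
}
```

Because end() returns the index following the last matched character, the pair can be passed directly to substring() to extract the matched text.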
At the end of this loop the append position will be at the index position of the character following the last occurrence of the pattern in the string. If you want to reset the append position back to zero, you just call the overloaded version of reset() that has no arguments for the Matcher object: