13.3.4. Regions
A Matcher looks for matches in the character sequence that it is given
as input. By default, the entire character sequence is considered when
looking for a match. You can restrict matching to a region of the
character sequence with the method region, which takes a starting index
(inclusive) and an ending index (exclusive) that define the subsequence
of the input to search. The methods regionStart and regionEnd return,
respectively, the current start index and the current end index.
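As a minimal sketch of restricting a search to a region, the following
looks for the first run of digits inside a given index range. The class
name RegionDemo, the helper firstNumberIn, and the sample input are
illustrative assumptions, not part of the original text.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegionDemo {
    // Returns the first run of digits found between start (inclusive)
    // and end (exclusive) of input, or null if the region contains none.
    static String firstNumberIn(String input, int start, int end) {
        Matcher m = Pattern.compile("\\d+").matcher(input);
        m.region(start, end); // only input[start, end) is searched
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "id=12, qty=345, sku=6789";

        // Default region: the whole input, so the first number is found.
        System.out.println(firstNumberIn(input, 0, input.length())); // 12

        // Region covering only "qty=345" (indices 7 up to, not including, 14).
        System.out.println(firstNumberIn(input, 7, 14)); // 345

        // regionStart and regionEnd report the bounds currently in effect.
        Matcher m = Pattern.compile("\\d+").matcher(input).region(7, 14);
        System.out.println(m.regionStart() + ".." + m.regionEnd()); // 7..14
    }
}
```

Note that group, start, and end still report positions relative to the
whole input, not relative to the region.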
You can control whether the boundaries of a region are treated as the
true start and end of the input, so that anchors such as ^ and $ will
match there, by invoking useAnchoringBounds with an argument of true
(the default). If you don't want the region boundaries to match the
anchors, pass false. The method hasAnchoringBounds returns the current
setting.
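The effect of anchoring bounds can be sketched with an anchored pattern
applied to a region in the middle of a string. The class name
AnchoringBoundsDemo, the helper anchoredFind, and the sample input are
assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AnchoringBoundsDemo {
    // Reports whether "^world$" is found inside the region [start, end)
    // of input, with anchoring bounds switched on or off.
    static boolean anchoredFind(String input, int start, int end,
                                boolean anchoring) {
        Matcher m = Pattern.compile("^world$").matcher(input);
        m.region(start, end);
        m.useAnchoringBounds(anchoring);
        return m.find();
    }

    public static void main(String[] args) {
        String input = "hello world!";

        // Region [6, 11) is exactly "world".
        // With anchoring bounds, ^ and $ match at the region boundaries.
        System.out.println(anchoredFind(input, 6, 11, true));  // true

        // Without anchoring bounds, ^ and $ match only at the real start
        // and end of the input, which lie outside this region.
        System.out.println(anchoredFind(input, 6, 11, false)); // false
    }
}
```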
Similarly, you can control whether the bounds of the region are
transparent to matching constructs that look ahead, look behind, or
detect a boundary. By default bounds are opaque (that is, they appear
to be hard bounds on the input sequence), but you can change that by
passing true to useTransparentBounds. The hasTransparentBounds method
returns the current setting.
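One way to see the difference between opaque and transparent bounds is
with a word-boundary pattern whose region starts in the middle of a
word. This is only a sketch; the class name TransparentBoundsDemo, the
helper findsWord, and the sample inputs are assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TransparentBoundsDemo {
    // Reports whether "cat" is found as a whole word inside the region
    // [start, end) of input, with transparent bounds switched on or off.
    static boolean findsWord(String input, int start, int end,
                             boolean transparent) {
        Matcher m = Pattern.compile("\\bcat\\b").matcher(input);
        m.region(start, end);
        m.useTransparentBounds(transparent);
        return m.find();
    }

    public static void main(String[] args) {
        String input = "concat";

        // Region [3, 6) is "cat". With opaque bounds (the default) the
        // matcher cannot see the 'n' before the region, so \b succeeds.
        System.out.println(findsWord(input, 3, 6, false)); // true

        // With transparent bounds \b can look at the 'n' just outside
        // the region, sees no word boundary there, and the match fails.
        System.out.println(findsWord(input, 3, 6, true));  // false
    }
}
```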
13.3.5. Efficiency
Suppose you want to parse a string into two parts that are separated
by a comma. The pattern (.*),(.*) is clear and straightforward, but it is
not necessarily the most efficient way to do this. The first .* will attempt
to consume the entire input. The matcher then has to back up to the
last comma before the second .* can match the rest. You could
help this along by being clear that a comma is not part of the group:
([^,]*),([^,]*) . Now it is clear that the matcher should only go so far
as the first comma and stop, which needs no backing up. On the other
hand, the second expression is somewhat less clear to the casual user
of regular expressions.
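The two patterns can be compared side by side, as in the sketch below.
The class name CommaSplitDemo and the helper split are illustrative
assumptions. The sketch shows only what each pattern captures, not how
fast it runs; note that the two patterns agree on input with a single
comma but behave differently when the input contains more than one.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CommaSplitDemo {
    static final Pattern GREEDY = Pattern.compile("(.*),(.*)");
    static final Pattern NEGATED = Pattern.compile("([^,]*),([^,]*)");

    // Returns the two captured parts, or null if the pattern does not
    // match the whole input.
    static String[] split(Pattern p, String input) {
        Matcher m = p.matcher(input);
        return m.matches() ? new String[] { m.group(1), m.group(2) } : null;
    }

    public static void main(String[] args) {
        // With exactly one comma, both patterns give the same two parts.
        System.out.println(Arrays.toString(split(GREEDY, "left,right")));
        System.out.println(Arrays.toString(split(NEGATED, "left,right")));

        // With two commas, the greedy pattern splits at the last comma,
        // while the negated class cannot match the whole input at all.
        System.out.println(Arrays.toString(split(GREEDY, "a,b,c")));  // [a,b, c]
        System.out.println(Arrays.toString(split(NEGATED, "a,b,c"))); // null
    }
}
```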
 