Note that a quantifier must follow the character or character class whose quantity it
specifies. A first attempt at a regular expression to match any integer is /\d+/, which
means “match one or more digits”. Is this solution for matching integers correct? No, it
is not. Suppose that the text is “This is text123 which contains 10 and 120”. Matching the
pattern /\d+/ against this text finds 123, 10, and 120. The following code demonstrates
this:
var pattern = /\d+/g;
var text = "This is text123 which contains 10 and 120";
var result;
while ((result = pattern.exec(text)) !== null) {
    print("Matched '" + result[0] + "' at " + result.index +
          ". Next match will begin at " + pattern.lastIndex);
}
Matched '123' at 12. Next match will begin at 15
Matched '10' at 31. Next match will begin at 33
Matched '120' at 38. Next match will begin at 41
Notice that 123 is not used as an integer here; rather, it is part of the word text123. If
you are looking for integers in the text, the 123 in text123 does not qualify. You want to
match only those integers that form a whole word in the text, so you need to specify that
the match be performed only at word boundaries, not inside words that contain embedded
digits. This is necessary to exclude 123 from the result. Table 4-21 lists metacharacters
that match boundaries in a regular expression.
Table 4-21. List of Boundary Matchers Inside Regular Expressions

Boundary Matchers    Meaning
^                    The beginning of a line
$                    The end of a line
\b                   A word boundary
\B                   A nonword boundary
\A                   The beginning of the input
\G                   The end of previous match
\Z                   The end of the input but for the final terminator, if any
\z                   The end of the input
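As a minimal sketch of the word-boundary idea, the earlier pattern can be wrapped in \b on both sides so that digits embedded in a word such as text123 are skipped. The helper name findWholeWordIntegers and the log fallback are assumptions introduced for illustration; they do not come from the original text.

```javascript
// Hypothetical helper (not from the original text): collects integers that
// stand alone as whole words by requiring a word boundary (\b) on both
// sides of \d+.
function findWholeWordIntegers(text) {
    var pattern = /\b\d+\b/g;
    var matches = [];
    var result;
    // With the g flag, exec() resumes from pattern.lastIndex on each call
    // and returns null when no further match exists.
    while ((result = pattern.exec(text)) !== null) {
        matches.push({ value: result[0], index: result.index });
    }
    return matches;
}

// Assumption: use print() where it exists (as in the example above) and
// fall back to console.log in engines such as Node.js.
var log = (typeof print === "function") ? print : console.log;

var text = "This is text123 which contains 10 and 120";
findWholeWordIntegers(text).forEach(function (m) {
    log("Matched '" + m.value + "' at " + m.index);
});
// Matched '10' at 31
// Matched '120' at 38
```

The 123 in text123 is no longer reported because there is no word boundary between t and 1: both are word characters. Keep in mind that \b treats letters, digits, and the underscore as word characters, so this sketch would still report 3 and 14 separately for an input such as 3.14.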
 
 