Java Reference
Matching Boundaries
Until now, you did not care about the location of the pattern match in the text. Sometimes, you may be interested in
knowing whether the match occurred at the beginning of a line. You may also want to find and replace a particular
match only if it occurs as a standalone word, not as part of another word. For example, you may want to replace the word
“apple” inside a string with the word “orange.” Suppose your string is “I have an apple and five pineapples”. Certainly,
you do not want to replace all occurrences of “apple” with “orange” in this string. If you did, your new string would be
“I have an orange and five pineoranges”. In fact, you want the new string to be “I have an orange and five pineapples”.
You want to match the word “apple” as a standalone word, not as part of any other word.
Table 14-5 lists all boundary matchers that can be used in a regular expression.
Table 14-5. List of Boundary Matchers Inside Regular Expressions
Boundary Matcher    Meaning
^                   The beginning of a line
$                   The end of a line
\b                  A word boundary
\B                  A non-word boundary
\A                  The beginning of the input
\G                  The end of previous match
\Z                  The end of the input but for the final terminator, if any
\z                  The end of the input
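The following is a minimal sketch, not taken from the original text, that contrasts a few of the matchers in the table. The class name LineBoundaryDemo and the helper method found are hypothetical names chosen for illustration. The sketch also uses the Pattern.MULTILINE flag, which the table does not mention: without it, ^ matches only at the beginning of the input.

```java
import java.util.regex.Pattern;

class LineBoundaryDemo {
    // Returns true if the regex is found anywhere in the input.
    static boolean found(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).find();
    }

    public static void main(String[] args) {
        // Two lines; the input ends with a line terminator.
        String input = "one\ntwo\n";

        // Without MULTILINE, ^ matches only at the beginning of the input.
        System.out.println(found("^two", 0, input));                   // false
        // With MULTILINE, ^ also matches just after a line terminator.
        System.out.println(found("^two", Pattern.MULTILINE, input));   // true
        // \A always means the beginning of the input, whatever the flags.
        System.out.println(found("\\Atwo", Pattern.MULTILINE, input)); // false
        // \Z ignores the final line terminator; \z does not.
        System.out.println(found("two\\Z", 0, input));                 // true
        System.out.println(found("two\\z", 0, input));                 // false
    }
}
```

The difference between \Z and \z shows up only when the input ends with a line terminator, which is why the sample input ends with "\n".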
In Java, a word character is defined by [a-zA-Z_0-9]. A word boundary is a zero-width match that can match
the following:
• Between a word character and a non-word character
• Between the start of the string and a word character
• Between a word character and the end of the string
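To make the three cases above concrete, the sketch below lists the index of every word boundary in a short string. It is an illustration only; the class name WordBoundaryPositions and the method positions are hypothetical. Because \b is zero-width, each match has the same start and end index.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordBoundaryPositions {
    // Collects the index of every match of the given zero-width regex.
    static List<Integer> positions(String regex, String input) {
        List<Integer> result = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            result.add(m.start());
        }
        return result;
    }

    public static void main(String[] args) {
        // Indexes: a=0 n=1 ' '=2 a=3 p=4 p=5 l=6 e=7 !=8
        System.out.println(positions("\\b", "an apple!")); // [0, 2, 3, 8]
    }
}
```

Index 0 is the start of the string followed by a word character; indexes 2, 3, and 8 lie between a word character and a non-word character. Index 9 (after the "!") is not reported because a non-word character followed by the end of the string is not a word boundary.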
A non-word boundary is also a zero-width match and it is the opposite of the word boundary. It matches
the following:
• The empty string
• Between two word characters
• Between two non-word characters
• Between a non-word character and the start or end of the string
The regular expression to match the word apple would be \bapple\b, which means the following: a word
boundary, the word apple, and a word boundary. Listing 14-3 demonstrates how to match a word boundary using a
regular expression.
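The sketch below is not the Listing 14-3 referred to above; it is a short illustration, under assumed names (AppleReplaceDemo, replaceEverywhere, replaceWholeWord, markEmbedded), of how \bapple\b behaves on the example string from this section. The last method uses \B to show the opposite case, matching "apple" only where it is preceded by another word character.

```java
class AppleReplaceDemo {
    // Plain text replacement: every occurrence of "apple" is replaced.
    static String replaceEverywhere(String text) {
        return text.replace("apple", "orange");
    }

    // Regex replacement: "apple" is replaced only as a standalone word.
    static String replaceWholeWord(String text) {
        return text.replaceAll("\\bapple\\b", "orange");
    }

    // \B requires a non-word boundary before "apple", so only the
    // occurrence embedded in a longer word is changed.
    static String markEmbedded(String text) {
        return text.replaceAll("\\Bapple", "APPLE");
    }

    public static void main(String[] args) {
        String text = "I have an apple and five pineapples";
        System.out.println(replaceEverywhere(text)); // I have an orange and five pineoranges
        System.out.println(replaceWholeWord(text));  // I have an orange and five pineapples
        System.out.println(markEmbedded(text));      // I have an apple and five pineAPPLEs
    }
}
```

Note that the backslash must be doubled inside a Java string literal, so the regular expression \bapple\b is written as "\\bapple\\b" in source code.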
 