Listing 14-3. Matching a Word Boundary
// MatchBoundary.java
package com.jdojo.regex;

public class MatchBoundary {
    public static void main(String[] args) {
        // Prepare regular expression. Use \\b to get \b inside the string literal.
        String regex = "\\bapple\\b";
        String replacementStr = "orange";
        String inputStr = "I have an apple and five pineapples";
        String newStr = inputStr.replaceAll(regex, replacementStr);

        System.out.println("Regular Expression: " + regex);
        System.out.println("Input String: " + inputStr);
        System.out.println("Replacement String: " + replacementStr);
        System.out.println("New String: " + newStr);
    }
}
Regular Expression: \bapple\b
Input String: I have an apple and five pineapples
Replacement String: orange
New String: I have an orange and five pineapples
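Two boundary matchers are easy to confuse: ^ (beginning of a line) and \A (beginning of the input). An input string may consist of multiple lines. In that case, \A matches only at the beginning of the entire input string, whereas ^ matches at the beginning of each line in the input, provided the pattern is compiled with the Pattern.MULTILINE flag; without that flag, ^ also matches only at the beginning of the input. For example, in multiline mode the regular expression "^The" matches every occurrence of The that appears at the beginning of any line in the input string.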
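The difference between ^ and \A can be checked with a short program. The following is a minimal sketch, not part of the original listings; the class name LineVersusInputStart, the helper countMatches, and the sample input are assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LineVersusInputStart {
    // Counts how many times the regex is found in the input when compiled with the given flags.
    public static int countMatches(String regex, int flags, String input) {
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Hypothetical three-line input; two of the lines start with "The".
        String input = "The cat\nThe dog\nA bird";

        // Default mode: ^ matches only at the beginning of the input.
        System.out.println(countMatches("^The", 0, input));                   // 1

        // Multiline mode: ^ matches at the beginning of each line.
        System.out.println(countMatches("^The", Pattern.MULTILINE, input));   // 2

        // \A matches only at the beginning of the input, even in multiline mode.
        System.out.println(countMatches("\\AThe", Pattern.MULTILINE, input)); // 1
    }
}
```

Passing 0 as the flags argument compiles the pattern in its default mode, which makes the effect of Pattern.MULTILINE easy to compare side by side.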
Groups and Back Referencing
You can treat multiple characters as a unit by using them as a group. A group is created inside a regular expression by enclosing one or more characters inside parentheses. (ab), ab(z), ab(ab)(xyz), and (the((is)(is))) are examples of regular expressions that contain groups. Each group in a regular expression has a group number. The group number starts at 1. The Matcher class has a method groupCount() that returns the number of groups in the pattern associated with the Matcher instance. There is a special group called group 0 (zero). It refers to the entire regular expression. Group 0 is not reported by the groupCount() method.
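The behavior of groupCount() can be illustrated with a small sketch that uses the example expressions from the paragraph above. The class name GroupCountDemo and the helper countGroups are assumptions made for illustration; the last pattern, ab, is added only to show a regular expression with no explicit groups.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupCountDemo {
    // Returns the number of groups in the regex; group 0 is not included in the count.
    public static int countGroups(String regex) {
        // groupCount() depends only on the pattern, so an empty input is enough here.
        Matcher m = Pattern.compile(regex).matcher("");
        return m.groupCount();
    }

    public static void main(String[] args) {
        String[] patterns = {"(ab)", "ab(z)", "ab(ab)(xyz)", "(the((is)(is)))", "ab"};
        for (String p : patterns) {
            System.out.println(p + " has " + countGroups(p) + " group(s)");
        }
    }
}
```

The program reports 1, 1, 2, 4, and 0 groups, respectively, which confirms that group 0 is never counted.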
How is each group numbered? Each left parenthesis inside a regular expression marks the start of a new group, and groups are numbered in the order in which their left parentheses appear, from left to right. Table 14-6 lists some examples of group numbering in a regular expression. Note that I have also listed group 0 for all regular expressions although it is not reported by the groupCount() method of the Matcher class. The last example in the list shows that group 0 is present even if there are no explicit groups in the regular expression.
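The numbering rule can also be observed at runtime by printing the text matched by each group. This is a minimal sketch; the class name GroupNumberingDemo, the helper groupsOf, and the input string theisis are assumptions chosen so that the expression (the((is)(is))) matches the whole input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupNumberingDemo {
    // Returns the text matched by groups 0..groupCount(), or an empty array if the
    // regex does not match the entire input.
    public static String[] groupsOf(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.matches()) {
            return new String[0];
        }
        String[] groups = new String[m.groupCount() + 1];
        for (int i = 0; i <= m.groupCount(); i++) {
            groups[i] = m.group(i); // group(0) is the entire match
        }
        return groups;
    }

    public static void main(String[] args) {
        String[] groups = groupsOf("(the((is)(is)))", "theisis");
        for (int i = 0; i < groups.length; i++) {
            System.out.println("Group " + i + ": " + groups[i]);
        }
    }
}
```

For this input, groups 0 and 1 both contain theisis, group 2 contains isis, and groups 3 and 4 each contain is, following the left-to-right order of the left parentheses.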