FLAG: DESCRIPTION
UNICODE_CASE: When this is specified in addition to CASE_INSENSITIVE, case-insensitive matching is consistent with the Unicode standard.
DOTALL: Makes the expression . (a period, which you see shortly) match any character, including line terminators.
LITERAL: Causes the string specifying a pattern to be treated as a sequence of literal characters, so escape sequences, for example, are not recognized as such.
CANON_EQ: Matches taking account of canonical equivalence of combined characters. For example, some characters that have diacritics may be represented as a single character or as a base character followed by a combining diacritic character. This flag treats these as a match.
COMMENTS: Allows whitespace and comments in a pattern. Comments in a pattern start with #, so everything from the first # to the end of the line is ignored.
UNIX_LINES: Enables UNIX lines mode, where only '\n' is recognized as a line terminator.
UNICODE_CHARACTER_CLASS: Enables the Unicode version of predefined character classes.
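To make a few of these flags concrete, here is a minimal sketch that compiles small patterns with DOTALL, LITERAL, and COMMENTS and tests them against short strings. The class name FlagDemo and the helper method matches() are hypothetical names introduced only for this illustration; the flag constants, Pattern.compile(), and Matcher.matches() are part of java.util.regex.

```java
import java.util.regex.Pattern;

public class FlagDemo {
    // Hypothetical helper: true if the whole input matches the regex
    // when it is compiled with the given flags (0 means no flags).
    static boolean matches(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        // Without DOTALL, '.' does not match the line terminator '\n'.
        System.out.println(matches("a.b", 0, "a\nb"));
        // With DOTALL, '.' matches any character, including '\n'.
        System.out.println(matches("a.b", Pattern.DOTALL, "a\nb"));
        // With LITERAL, '.' is an ordinary period, so "a.b" matches but "axb" does not.
        System.out.println(matches("a.b", Pattern.LITERAL, "a.b"));
        System.out.println(matches("a.b", Pattern.LITERAL, "axb"));
        // With COMMENTS, whitespace and the text from '#' to the end of the line are ignored.
        System.out.println(matches("h a d  # the word had", Pattern.COMMENTS, "had"));
    }
}
```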
All these flags are unique single-bit values within a value of type int, so you can combine them by ORing them together or by simple addition. For example, you can specify the CASE_INSENSITIVE and the UNICODE_CASE flags with the following expression:
Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE
Or you can write this as:
Pattern.CASE_INSENSITIVE + Pattern.UNICODE_CASE
Beware of using addition when you want to add a flag to a variable representing an existing set of flags. If
the flag already exists, addition produces the wrong result because adding the two corresponding bits results
in a carry to the next bit. ORing always produces the correct result.
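The hazard described above can be seen in a short sketch. The class name FlagCombine and the methods addWithOr() and addWithPlus() are hypothetical names used only for this illustration. The sketch adds CASE_INSENSITIVE to a set of flags that already contains it, once with OR and once with addition.

```java
import java.util.regex.Pattern;

public class FlagCombine {
    // Hypothetical helper: adds a flag to an existing set using bitwise OR.
    static int addWithOr(int flags, int flag) {
        return flags | flag;
    }

    // Hypothetical helper: adds a flag to an existing set using arithmetic addition.
    static int addWithPlus(int flags, int flag) {
        return flags + flag;
    }

    public static void main(String[] args) {
        int flags = Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE;

        // OR leaves the set unchanged because the bit is already set.
        int ored = addWithOr(flags, Pattern.CASE_INSENSITIVE);
        // Addition carries into the next bit: the CASE_INSENSITIVE bit is
        // cleared and a different flag bit is set instead.
        int added = addWithPlus(flags, Pattern.CASE_INSENSITIVE);

        System.out.println("OR keeps the set intact: " + (ored == flags));
        System.out.println("Addition keeps CASE_INSENSITIVE: "
                + ((added & Pattern.CASE_INSENSITIVE) != 0));
    }
}
```

When the flag is not already present, the two operations give the same result, which is why the problem only shows up when you update an existing set.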
If you want to match "had" ignoring case, you could create the pattern with the following statement:
Pattern had = Pattern.compile("had", Pattern.CASE_INSENSITIVE);
In addition to the exception thrown by the first version of the method, this version throws an IllegalArgumentException if the second argument has bit values set that do not correspond to any of the flag constants defined in the Pattern class.
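As a quick check of the two-argument compile() call, the following sketch compiles "had" with CASE_INSENSITIVE, confirms through the flags() method that the flag was recorded, and tries the pattern on a few inputs. The class name CompileWithFlags and its two helper methods are hypothetical names chosen for this illustration. The sketch uses the matcher() method, which is introduced in the next section, and it does not attempt to trigger the IllegalArgumentException.

```java
import java.util.regex.Pattern;

public class CompileWithFlags {
    // The pattern from the text: "had", matched without regard to case.
    static final Pattern HAD = Pattern.compile("had", Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: true if the whole input is "had" in any mix of cases.
    static boolean matchesIgnoringCase(String input) {
        return HAD.matcher(input).matches();
    }

    // Hypothetical helper: true if the compiled pattern reports CASE_INSENSITIVE.
    static boolean caseInsensitiveFlagSet() {
        return (HAD.flags() & Pattern.CASE_INSENSITIVE) != 0;
    }

    public static void main(String[] args) {
        System.out.println(caseInsensitiveFlagSet());
        System.out.println(matchesIgnoringCase("HAD"));
        System.out.println(matchesIgnoringCase("Had"));
        System.out.println(matchesIgnoringCase("hat"));
    }
}
```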
Creating a Matcher
After you have a Pattern object, you can create a Matcher object that can search a specified string, like this:
String sentence = "Smith, where Jones had had 'had', had had 'had had'.";
Matcher matchHad = had.matcher(sentence);
The first statement defines the string sentence that you want to search. To create the Matcher object, you call the matcher() method for the Pattern object with the string to be analyzed as the argument. This returns a Matcher object that can analyze the string that was passed to it. The parameter for the matcher() method is actually of type CharSequence. This is an interface that is implemented by the String,