DOTALL
Makes the expression . (which we will see shortly) match any character, including line terminators.
CANON_EQ
Matches taking account of canonical equivalence of combined characters. For instance, some characters that have diacritics may be represented as a single character or as a base character followed by a combining diacritic character. This flag will treat these as a match.
COMMENTS
Allows whitespace and comments in a pattern. Comments in a pattern start with #, so everything from the first # to the end of the line will be ignored.
UNIX_LINES
Enables Unix lines mode, where only '\n' is recognized as a line terminator.
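The effect of some of these flags is easiest to see by compiling one pattern with and without them. The following is a minimal sketch; the class name FlagEffects and its helper methods are made up for illustration and are not part of the Pattern API.

```java
import java.util.regex.Pattern;

class FlagEffects {
    // Compiles "a.b" with the given flags and tests it against the whole input.
    static boolean dotMatches(String input, int flags) {
        return Pattern.compile("a.b", flags).matcher(input).matches();
    }

    // With COMMENTS, the spaces and the trailing # comment are not part of the pattern.
    static boolean commentedPatternMatches(String input) {
        Pattern p = Pattern.compile("h a d   # the word we are looking for",
                                    Pattern.COMMENTS);
        return p.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(dotMatches("a\nb", 0));                  // false
        System.out.println(dotMatches("a\nb", Pattern.DOTALL));     // true
        System.out.println(dotMatches("a\rb", 0));                  // false
        System.out.println(dotMatches("a\rb", Pattern.UNIX_LINES)); // true
        System.out.println(commentedPatternMatches("had"));         // true
    }
}
```

In Unix lines mode '\r' is no longer a line terminator, which is why the . expression matches it in the fourth call.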
All these flags are single-bit values, so you can combine them by ORing them together or by simple addition. For instance, you can specify the CASE_INSENSITIVE and the UNICODE_CASE flags with the expression:
Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE
Or you can write this as:
Pattern.CASE_INSENSITIVE + Pattern.UNICODE_CASE
Addition gives the same result only when no flag appears more than once in the sum, so the bitwise OR is the safer choice.
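Flags are combined with the bitwise OR operator (|); a bitwise AND of two different single-bit flags is always zero, which selects no flags at all. The sketch below checks this and shows the combined flags in use. The class name CombineFlags and the helper matches() are illustrative only, and the accented letters are written as Unicode escapes (\u00C9 is an uppercase E with an acute accent, \u00E9 its lowercase form).

```java
import java.util.regex.Pattern;

class CombineFlags {
    // Compiles the regex with the given flags and tests it against the whole input.
    static boolean matches(String regex, String input, int flags) {
        return Pattern.compile(regex, flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        int flags = Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE;

        // OR and addition agree when each flag is used exactly once.
        System.out.println(flags == Pattern.CASE_INSENSITIVE + Pattern.UNICODE_CASE); // true

        // AND of two distinct single-bit flags leaves no bits set.
        System.out.println(Pattern.CASE_INSENSITIVE & Pattern.UNICODE_CASE); // 0

        // Adding the same flag twice silently produces a different flag.
        System.out.println(Pattern.CASE_INSENSITIVE + Pattern.CASE_INSENSITIVE
                           == Pattern.COMMENTS); // true

        // CASE_INSENSITIVE alone only folds case for US-ASCII letters.
        System.out.println(matches("\u00C9", "\u00E9", Pattern.CASE_INSENSITIVE)); // false
        System.out.println(matches("\u00C9", "\u00E9", flags));                    // true
    }
}
```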
If we wanted to match "had" ignoring case, we could create the pattern with the statement:
Pattern had = Pattern.compile("had", Pattern.CASE_INSENSITIVE);
In addition to the exception thrown by the first version of the compile() method, this version will throw an exception of type IllegalArgumentException if the second argument has bit values set that do not correspond to one of the flag constants defined in the Pattern class.
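The two failure modes of compile() can be told apart in a try block. The following is a sketch under stated assumptions: the class CompileChecks and the helper tryCompile() are invented for illustration, and 0x4000 is simply a bit value that is assumed not to correspond to any Pattern flag. PatternSyntaxException is a subclass of IllegalArgumentException, so it has to be caught first. Whether a stray flag bit is actually rejected may depend on the JDK release in use, so no output is claimed for the last call.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class CompileChecks {
    // Hypothetical helper: reports how compilation of the regex went.
    static String tryCompile(String regex, int flags) {
        try {
            Pattern.compile(regex, flags);
            return "ok";
        } catch (PatternSyntaxException e) {
            // The regular expression itself is malformed.
            return "syntax error";
        } catch (IllegalArgumentException e) {
            // The flags argument contains bits that are not Pattern flags.
            return "bad flags";
        }
    }

    public static void main(String[] args) {
        System.out.println(tryCompile("had", Pattern.CASE_INSENSITIVE)); // ok
        System.out.println(tryCompile("(had", 0));                       // syntax error
        // Assumed-invalid flag bit; the result may vary between JDK versions.
        System.out.println(tryCompile("had", 0x4000));
    }
}
```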
Creating a Matcher
Once we have a Pattern object, we can create a Matcher object that can search a particular string, like this:
String sentence = "Smith, where Jones had had 'had', had had 'had had'.";
Matcher matchHad = had.matcher(sentence);
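To give an idea of what the Matcher object is for, here is a minimal sketch that counts the occurrences of the pattern in the sentence. It relies on the find() method of the Matcher class, which is not discussed in the text above; the class name CountHad and the helper countMatches() are illustrative only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CountHad {
    // Counts the non-overlapping matches of the pattern in the text.
    static int countMatches(Pattern pattern, CharSequence text) {
        Matcher m = pattern.matcher(text);
        int count = 0;
        while (m.find()) {   // each call advances to the next match, if there is one
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        Pattern had = Pattern.compile("had", Pattern.CASE_INSENSITIVE);
        String sentence = "Smith, where Jones had had 'had', had had 'had had'.";
        System.out.println(countMatches(had, sentence)); // 7
    }
}
```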
The first statement defines the string, sentence, that we want to search. To create the Matcher object, we call the matcher() method for the Pattern object with the string to be analyzed as the argument. This will return a Matcher object that can analyze the string that was passed to it. The parameter for the matcher() method is actually of type CharSequence. This is an interface that is implemented by both the String and StringBuffer classes, so you can pass either type of reference to the method. The java.nio.CharBuffer class also implements CharSequence, so you can pass the contents of a CharBuffer to the method too. This means that if you use a CharBuffer to hold character data you have read from a file, you can pass the data directly to the matcher() method to be searched.
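Because the parameter is a CharSequence, the same pattern can be applied to a String, a StringBuffer, or a CharBuffer without any conversion. The sketch below illustrates this; the class CharSequenceSources and the helper containsHad() are made-up names, and the CharBuffer is filled by wrapping a literal rather than by reading a file.

```java
import java.nio.CharBuffer;
import java.util.regex.Pattern;

class CharSequenceSources {
    static final Pattern HAD = Pattern.compile("had", Pattern.CASE_INSENSITIVE);

    // Accepts any CharSequence, so callers need not convert to String first.
    static boolean containsHad(CharSequence text) {
        return HAD.matcher(text).find();
    }

    public static void main(String[] args) {
        String string = "Jones had had 'had'";
        StringBuffer buffer = new StringBuffer("Smith HAD it too");
        CharBuffer chars = CharBuffer.wrap("no match in here");

        System.out.println(containsHad(string)); // true
        System.out.println(containsHad(buffer)); // true
        System.out.println(containsHad(chars));  // false
    }
}
```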