3. You call the find() method (or some other methods, as you later see) for the Matcher object to
search the string.
4. If the pattern is found, you query the Matcher object to discover the whereabouts of the pattern in
the string and other information relating to the match.
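As a minimal sketch of steps 3 and 4, the following class calls find() repeatedly and asks the Matcher where each match was found. The class name FindDemo and the helper findAll() are hypothetical names chosen for illustration; start(), end(), and group() are standard Matcher methods.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindDemo {
    // Hypothetical helper: returns one "start-end:text" entry per match of regex in input.
    static List<String> findAll(String regex, String input) {
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(input);
        List<String> hits = new ArrayList<>();
        // Step 3: each call to find() searches for the next occurrence of the pattern.
        while (matcher.find()) {
            // Step 4: query the Matcher for the position and text of the match.
            // end() is the index one past the last matched character.
            hits.add(matcher.start() + "-" + matcher.end() + ":" + matcher.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        for (String hit : findAll("had", "I had had it")) {
            System.out.println(hit);
        }
    }
}
```

For the input "I had had it" this reports two matches, one occupying index positions 2 up to 5 and one occupying 6 up to 9, where the end index is exclusive.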
Although this is a straightforward process that is easy to code, the hard work is in defining the pattern to
achieve the result that you want. This is an extensive topic because in their full glory regular expressions are
immensely powerful and can be very complicated. There are entire books devoted to this subject, so my aim is to
give you enough of a bare-bones understanding of how regular expressions work that you are in a position
to look into the subject in more depth if you need to. Although regular expressions can look quite fearsome,
don't be put off. They are always built step-by-step, so although the result may look complicated and obscure,
they are not necessarily difficult to put together. Regular expressions are a lot of fun and a sure way
to impress your friends and maybe confound your enemies.
Defining Regular Expressions
You may not have heard of regular expressions before reading this topic and, therefore, may think you have
never used them. If so, you are almost certainly wrong. Whenever you search a directory for files of a particular
type, "*.java", for example, you are using a form of regular expression. However, to say that regular
expressions can do much more than this is something of an understatement. To get an understanding of what
you can do with regular expressions, you start at the bottom with the simplest kind of operation and work
your way up to some of the more complex problems they can solve.
Creating a Pattern
In its most elementary form, a regular expression just does a simple search for a substring. For example, if
you want to search a string for the word had, the regular expression is exactly that. The string that defines
this particular regular expression is "had". Let's use this as a vehicle for understanding the programming
mechanism for using regular expressions. You create a Pattern object for the expression "had" like this:
Pattern had = Pattern.compile("had");
The static compile() method in the Pattern class returns a reference to a Pattern object that contains
the compiled regular expression. The method throws a PatternSyntaxException if the argument is invalid.
You don't have to catch this exception as it is a subclass of RuntimeException and therefore is unchecked,
but it is a good idea to do so to make sure the regular expression pattern is valid. The compilation process
stores the regular expression in a Pattern object in a form that is ready to be processed by a Matcher
state-machine.
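The following sketch shows one way to catch the exception when compiling a pattern. The class name CompileDemo and the helper compileOrNull() are hypothetical; getDescription() and getIndex() are standard methods of PatternSyntaxException. Returning null for an invalid expression is just one possible design, chosen here to keep the example short.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class CompileDemo {
    // Hypothetical helper: returns the compiled Pattern, or null if regex is not valid.
    static Pattern compileOrNull(String regex) {
        try {
            return Pattern.compile(regex);
        } catch (PatternSyntaxException e) {
            // The exception records what is wrong and roughly where in the expression.
            System.err.println("Invalid regular expression: " + e.getDescription()
                    + " near index " + e.getIndex());
            return null;
        }
    }

    public static void main(String[] args) {
        Pattern had = compileOrNull("had");    // a valid expression
        Pattern broken = compileOrNull("had("); // unclosed group, so compilation fails
        System.out.println("had compiled: " + (had != null));
        System.out.println("had( compiled: " + (broken != null));
    }
}
```

Catching the exception is most useful when the expression is not a literal in your own code, for example when it is typed in by a user.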
An overloaded version of the compile() method enables you to control more closely how the pattern is applied
when looking for a match. Its second argument is a value of type int that specifies one or more of the
following flags that are defined in the Pattern class (shown in Table 15-4):
TABLE 15-4: Flags Controlling Pattern Operation
FLAG                DESCRIPTION
CASE_INSENSITIVE    Matches ignoring case, but assumes only US-ASCII characters are being matched.
MULTILINE           Enables the beginning or end of lines to be matched anywhere. Without this flag only the
                    beginning and end of the entire sequence is matched.
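To see the effect of these two flags, here is a small sketch that compiles the same expression with and without them. The class name FlagsDemo and the helper found() are hypothetical. Because the flags are int values, more than one can be passed by combining them with the bitwise OR operator.

```java
import java.util.regex.Pattern;

class FlagsDemo {
    // Hypothetical helper: true if regex, compiled with the given flags, occurs in input.
    static boolean found(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).find();
    }

    public static void main(String[] args) {
        // Without flags the search is case-sensitive.
        System.out.println(found("had", 0, "HAD"));
        System.out.println(found("had", Pattern.CASE_INSENSITIVE, "HAD"));

        // ^ matches only at the start of the whole input unless MULTILINE is set.
        System.out.println(found("^had", 0, "I\nhad it"));
        System.out.println(found("^had", Pattern.MULTILINE, "I\nhad it"));

        // Flags combined with bitwise OR.
        int both = Pattern.CASE_INSENSITIVE | Pattern.MULTILINE;
        System.out.println(found("^had", both, "I\nHAD it"));
    }
}
```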
 
 