Regular expressions, or regexes for short, provide a concise and precise specification of
patterns to be matched in text.
As another example of the power of regular expressions, consider the problem of
bulk-updating hundreds of files. When I started with Java, the syntax for declaring array
references was baseType arrayVariableName[]. For example, a method with an array argument, such as
every program's main method, was commonly written as:
public static void main(String args[]) {
But as time went by, it became clear to the stewards of the Java language that it would be
better to write it as baseType[] arrayVariableName. For example:
public static void main(String[] args) {
This is better Java style because it associates the “array-ness” of the type with the type itself,
rather than with the local argument name. The compiler accepts both forms. I
wanted to change all occurrences of main written the old way to the new way. I used the
pattern main(String [a-z] with the grep utility described earlier to find the names of all the
files containing old-style main declarations (i.e., main(String followed by a space and a
name character rather than an open square bracket). I then used another regex-based Unix
tool, the stream editor sed, in a little shell script to change all occurrences in those files from
main(String *([a-z][a-z]*)[] to main(String[] $1 (the syntax used here is discussed
later in this chapter). Again, the regex-based approach was orders of magnitude faster than
doing it interactively, even using a reasonably powerful editor such as vi or emacs, let alone
trying to use a graphical word processor.
Historically, the syntax of regexes has changed as they have been incorporated into more tools
and more languages, so the syntax in the previous examples is not exactly what you'd use
in Java, but it does convey the conciseness and power of the regex mechanism. [17]
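For comparison, here is a minimal sketch of how the same find-and-rewrite might look in Java's own regex syntax, using java.util.regex. The class name MainArrayStyle and its two helper methods are hypothetical, invented for this illustration. Unlike the loose patterns quoted above, the literal parenthesis and square brackets are escaped, and the argument name is matched a little more generously than [a-z][a-z]*.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library: finds and rewrites
// old-style main declarations such as main(String args[]).
class MainArrayStyle {
    // "main(String", whitespace, an identifier captured as group 1,
    // then a literal "[]" (optional whitespace allowed around the brackets).
    private static final Pattern OLD_STYLE =
        Pattern.compile("main\\(String\\s+([A-Za-z_]\\w*)\\s*\\[\\s*\\]");

    /** True if the line contains an old-style main declaration. */
    static boolean isOldStyle(String line) {
        return OLD_STYLE.matcher(line).find();
    }

    /** Moves the brackets from the argument name to the type; $1 is the captured name. */
    static String modernize(String line) {
        return OLD_STYLE.matcher(line).replaceAll("main(String[] $1");
    }

    public static void main(String[] args) {
        // prints: public static void main(String[] argv) {
        System.out.println(modernize("public static void main(String argv[]) {"));
    }
}
```

Lines that already use the new style contain no match, so modernize returns them unchanged.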
As a third example, consider parsing an Apache web server logfile, where some fields are
delimited with quotes, others with square brackets, and others with spaces. Writing ad-hoc code
to parse this is messy in any language, but a well-crafted regex can break the line into all its
constituent fields in one operation (this example is developed in Program: Apache Logfile
Parsing).
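To give a flavor of the idea before the full program, the following is a rough sketch only. It assumes the common log layout of host, identity, user, a bracketed timestamp, a quoted request, a status code, and a byte count. The class name LogLineSketch, the parse method, and the sample line are all made up for illustration; a real logfile may carry extra fields that this pattern does not handle.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: splits one log line into seven fields with a single regex.
class LogLineSketch {
    // Assumed layout: host ident user [timestamp] "request" status bytes
    private static final Pattern LOG_LINE = Pattern.compile(
        "(\\S+) (\\S+) (\\S+) \\[([^\\]]+)\\] \"([^\"]*)\" (\\d{3}) (\\d+|-)");

    /** Returns the seven captured fields, or null if the whole line does not match. */
    static String[] parse(String line) {
        Matcher m = LOG_LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        String[] fields = new String[m.groupCount()];
        for (int i = 0; i < fields.length; i++) {
            fields[i] = m.group(i + 1); // capture groups are numbered from 1
        }
        return fields;
    }

    public static void main(String[] args) {
        // Invented sample line, using a documentation-only IP address.
        String sample = "192.0.2.1 - - [27/Oct/2000:09:27:09 -0400] "
            + "\"GET /index.html HTTP/1.0\" 200 10450";
        String[] f = parse(sample);
        System.out.println("timestamp = " + f[3]);
        System.out.println("request   = " + f[4]);
    }
}
```

The bracketed and quoted fields are matched with negated character classes ([^\]]+ and [^"]*), which is what lets them contain spaces while the other fields are split on spaces.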
 