number of characters to look-ahead through. This "horizon" value is treated as a transparent, non-anchoring bound (see the Matcher class for details: Section 13.3.4 on page 329). A horizon of zero means there is no look-ahead limit.
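As a minimal sketch of how the horizon affects a search, the following compares a horizon of zero with a small positive horizon. The class name HorizonDemo, the helper method find, and the sample input are hypothetical and chosen only for illustration.

```java
import java.util.Scanner;

class HorizonDemo {
    // Hypothetical helper: searches input for pattern, looking ahead through
    // at most `horizon` characters (0 means no look-ahead limit).
    static String find(String input, String pattern, int horizon) {
        Scanner in = new Scanner(input);
        // findWithinHorizon ignores delimiters and returns null on no match.
        return in.findWithinHorizon(pattern, horizon);
    }

    public static void main(String[] args) {
        String input = "name=duke; id=42";

        // Horizon 0: the whole input may be searched, so the match is found.
        System.out.println(find(input, "id=\\d+", 0));  // prints id=42

        // Horizon 8: only "name=duk" is searched, so there is no match.
        System.out.println(find(input, "id=\\d+", 8));  // prints null
    }
}
```

A bounded horizon is mainly useful when the input is large or unbounded and the text being sought is expected to appear near the current position.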
The skip method can be used to skip over the input that matches a given pattern. As with findInLine and findWithinHorizon, it ignores the scanner's delimiters when looking for the pattern. The skipped input is not returned; rather, skip returns the scanner itself so that invocations can be chained together.
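The chaining of skip invocations might look like the following sketch. The class name SkipDemo, the method readLabeledInt, and the "label: value" input format are assumptions made for this example.

```java
import java.util.Scanner;

class SkipDemo {
    // Hypothetical helper: skips a "label:" prefix and any spaces after it,
    // then reads the integer that follows.
    static int readLabeledInt(String input) {
        Scanner in = new Scanner(input);
        // skip returns the scanner itself, so the calls can be chained.
        // Each pattern must match at the current position; otherwise skip
        // throws NoSuchElementException. "\\s*" can match nothing, so the
        // second skip never fails.
        return in.skip("\\w+:").skip("\\s*").nextInt();
    }

    public static void main(String[] args) {
        System.out.println(readLabeledInt("count:   17 apples"));  // prints 17
    }
}
```

Because skip starts matching at the current position rather than searching ahead, a pattern that can match the empty string is a convenient way to skip optional input without risking an exception.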
Exercise 22.7: Rewrite readCSVTable so that the number of cells of data expected is passed as an argument.
Exercise 22.8: As it stands, readCSVTable is both too strict and too lenient on the input format it expects. It is too strict because an empty line at the end of the input will cause the IOException to be thrown. It is too lenient because a line of input with more than three commas will not cause an exception. Rectify both of these problems.
Exercise 22.9: Referring back to the discussion of efficiency of regular expressions on page 329, devise at least four patterns that will parse a line of comma-separated values. (Hint: In addition to the suggestion on page 329, also consider the use of greedy versus non-greedy quantifiers.) Write a benchmark program that compares the efficiency of each pattern, and be sure that you test with both short strings between commas and very long strings.
22.5.3. Using Scanner
Scanner and StreamTokenizer have some overlap in functionality, but they have quite different operational models. Scanner is based on regular expressions and so can match tokens based on whatever regular expressions you devise. However, some seemingly simple tasks can be difficult to express in terms of regular expression patterns. On the other hand, StreamTokenizer basically processes input a character at a time and uses the defined character classes to identify words, numbers, whitespace,