Figure 3.16: An example of double buffering.
length, then we can extend the buffer size, perhaps by using Java-style Vector objects rather than arrays to implement buffers.
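A minimal sketch of that idea, assuming a hypothetical GrowableTokenBuffer class (this is not the book's code): the token buffer doubles its capacity on demand, so an unusually long token never overflows a fixed-size array.

import java.util.Arrays;

class GrowableTokenBuffer {
    private char[] buf = new char[64]; // initial capacity; grows on demand
    private int length = 0;

    // Append a scanned character, doubling the buffer when it fills,
    // much as a java.util.Vector grows itself.
    void append(char c) {
        if (length == buf.length) {
            buf = Arrays.copyOf(buf, buf.length * 2);
        }
        buf[length++] = c;
    }

    // Return the accumulated token text and reset for the next token.
    String takeText() {
        String text = new String(buf, 0, length);
        length = 0;
        return text;
    }
}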
We can speed up a scanner not only by doing block reads, but also by avoiding unnecessary copying of characters. Because so many characters are scanned, moving them from one place to another can be costly. A block read enables direct reading into the scanning buffer rather than into an intermediate input buffer. As characters are scanned, we need not copy characters from the input buffer unless we recognize a token whose text must be saved or processed (an identifier or a literal). With care, we can process the token's text directly from the input buffer.
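As a rough illustration (the class and method names here are hypothetical, not the book's code), the sketch below reads a block directly into the scanning buffer and builds an identifier's text straight from that buffer, copying characters only once the token is recognized.

import java.io.IOException;
import java.io.Reader;

class BlockScanner {
    private final Reader input;
    private final char[] buffer = new char[4096]; // the scanning buffer itself
    private int limit = 0; // number of valid characters currently in buffer
    private int pos = 0;   // index of the next character to scan

    BlockScanner(Reader input) {
        this.input = input;
    }

    // Refill the scanning buffer with a single block read; no intermediate
    // input buffer is involved.
    boolean fill() throws IOException {
        limit = input.read(buffer, 0, buffer.length);
        pos = 0;
        return limit > 0;
    }

    // Scan an identifier whose first character is already in the buffer at
    // pos. Characters are examined in place; the only copy happens when the
    // token is recognized and its text must be saved. For simplicity this
    // sketch assumes the whole token lies within the current block; the
    // double-buffering scheme of Figure 3.16 removes that restriction.
    String scanIdentifier() {
        int tokenStart = pos;
        while (pos < limit && Character.isLetterOrDigit(buffer[pos])) {
            pos++;
        }
        return new String(buffer, tokenStart, pos - tokenStart);
    }
}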
At some point, using a profiling tool such as qpt, prof, gprof, or pixie may allow you to find unexpected performance bottlenecks in a scanner.
3.7.6 Lexical Error Recovery
A character sequence that cannot be scanned into any valid token results in a lexical error. Although uncommon, such errors must be handled by a scanner. It is unreasonable to stop compilation because of what is often a minor error, so usually we try some sort of lexical error recovery. Two approaches come to mind:
1. Delete the characters read so far and restart scanning at the next unread
character.
2. Delete the first character read by the scanner and resume scanning at the
character following it.
Both approaches are reasonable. The former can be done by resetting the scanner and beginning scanning anew. The latter is a bit harder to do but also is a bit safer (because fewer characters are immediately deleted). Non-deleted characters can be rescanned using the buffering mechanism described previously for scanner backup.
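A minimal sketch of the second approach, assuming a hypothetical RecoveringScanner whose buffer already supports the backup just described: only the offending first character is deleted, and scanning resumes at the character that followed it.

class RecoveringScanner {
    private final char[] buffer;  // characters available for (re)scanning
    private int tokenStart = 0;   // where the current token attempt began
    private int pos = 0;          // next character to scan

    RecoveringScanner(String input) {
        this.buffer = input.toCharArray();
    }

    // Invoked when the characters from tokenStart up to pos cannot form any
    // valid token. Only the first character is deleted; the scanner backs up
    // so that the remaining characters are rescanned rather than discarded.
    void recoverFromLexicalError() {
        System.err.println("lexical error: illegal character '"
                + buffer[tokenStart] + "'");
        pos = tokenStart + 1;
        tokenStart = pos;
    }
}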
In most cases, a lexical error is caused by the appearance of some illegal character, which usually appears as the beginning of a token. In this case, the two approaches work equally well. The effects of lexical error recovery might well create a syntax error, which will be detected and handled by the parser.
Consider ...for$tnight.... The $ would terminate scanning of for. Since no valid token begins with $, it would be deleted. Then tnight would be scanned as an identifier.