distributed Lex clone. It produces scanners that are considerably faster than
the ones produced by Lex. It also provides options that allow the tuning of the
scanner size versus its speed, as well as some features that Lex does not have
(such as support for 8-bit characters). If Flex is available on your system, you
should use it instead of Lex.
Lex has also been implemented in languages other than C. JFlex [KD] is a
Lex-like scanner generator written in Java that generates Java scanner classes.
It is of particular interest to people writing compilers in Java. Versions of Lex
are also available for Ada and ML.
An interesting alternative to Lex is GLA (Generator for Lexical Analyzers)
[Gra88]. GLA takes a description of a scanner based on regular expressions
and a library of common lexical idioms (such as “Pascal comment”) and produces
a directly executable (that is, not transition table-driven) scanner written
in C. GLA was designed with both ease of use and efficiency of the generated
scanner in mind. Experiments show it to be typically twice as fast as Flex
and only slightly slower than a trivial program that reads and “touches” each
character in an input file. The scanners it produces are more than competitive
with the best hand-coded scanners.
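The difference between a transition table-driven scanner and a directly executable one can be illustrated with a small sketch. The Java code below is not output from GLA, Flex, or any other tool mentioned here; the class name `ScannerStyles`, the two token kinds (identifiers and unsigned integers), and the `ID(...)`/`INT(...)` token format are all assumptions chosen for illustration. Both methods recognize the same tokens, but the first interprets a state table in a generic loop, while the second encodes each state directly as control flow.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: two scanners for the same tiny token set.
class ScannerStyles {
    // Character classes used to index the transition table.
    static final int LETTER = 0, DIGIT = 1, OTHER = 2;

    static int classify(char c) {
        if (Character.isLetter(c)) return LETTER;
        if (Character.isDigit(c)) return DIGIT;
        return OTHER;
    }

    // States: 0 = start, 1 = in identifier, 2 = in integer; -1 = no transition.
    static final int[][] NEXT = {
        /* start      */ {  1,  2, -1 },
        /* identifier */ {  1,  1, -1 },
        /* integer    */ { -1,  2, -1 },
    };

    // Table-driven style: a generic loop looks up every move in NEXT.
    static List<String> scanWithTable(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            int state = 0, start = pos;
            while (pos < input.length()) {
                int next = NEXT[state][classify(input.charAt(pos))];
                if (next < 0) break;          // no transition: token ends here
                state = next;
                pos++;
            }
            if (state == 1) tokens.add("ID(" + input.substring(start, pos) + ")");
            else if (state == 2) tokens.add("INT(" + input.substring(start, pos) + ")");
            else pos++;                       // still in start state: skip this character
        }
        return tokens;
    }

    // Directly executable style: each state becomes a piece of code.
    static List<String> scanDirect(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            int start = pos;
            char c = input.charAt(pos);
            if (Character.isLetter(c)) {
                do { pos++; } while (pos < input.length()
                        && Character.isLetterOrDigit(input.charAt(pos)));
                tokens.add("ID(" + input.substring(start, pos) + ")");
            } else if (Character.isDigit(c)) {
                do { pos++; } while (pos < input.length()
                        && Character.isDigit(input.charAt(pos)));
                tokens.add("INT(" + input.substring(start, pos) + ")");
            } else {
                pos++;                        // skip a character that starts no token
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String sample = "x1 42+y";
        System.out.println(scanWithTable(sample));
        System.out.println(scanDirect(sample));
    }
}
```

In the direct version there is no table lookup per character, which is one plausible reason such scanners can be faster; the sketch makes no attempt to measure this.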
Another tool that produces directly executable scanners is re2c [BC93].
The scanners it produces are easily adaptable to a variety of environments,
yet their scanning speed is excellent.
Scanner generators are usually included as parts of complete suites of
compiler development tools. Other than those already mentioned, some of the
most widely used and highly recommended scanner generators are DLG (part
of the PCCTS tools suite [Par97]), CoCo/R [Moe90], an integrated
scanner/parser generator, and Rex [GE91], part of the Karlsruhe/CoCoLab
Cocktail Toolbox.
3.7 Practical Considerations of Building Scanners
In this section, we discuss the practical considerations involved in building
real scanners for real programming languages. As one might expect, the finite
automaton model developed earlier in the chapter sometimes falls short and
must be supplemented. Efficiency concerns must be addressed. In addition,
some provision for error handling must be incorporated.
We discuss a number of potential problem areas. In each case, solutions are
weighed, particularly in conjunction with the Lex scanner generator discussed
in Section 3.5.
3.7.1 Processing Identifiers and Literals
In simple languages that have only global variables and declarations, the
scanner commonly will immediately enter an identifier into the symbol table,
 
 
 