call keyword. Nonetheless, opportunities for convoluted usage abound.
Keywords may be used as variable names, allowing the following:
if if then else = then;
The problem with reserved words is that if they are too numerous, they
may confuse inexperienced programmers, who may unknowingly choose an
identifier name that clashes with a reserved word. This usually causes a syntax
error in a program that “looks right” and in fact would be right if the symbol in
question was not reserved. COBOL is infamous for this problem because it has
several hundred reserved words. For example, in COBOL, zero is a reserved
word. So is zeros. So is zeroes!
In Section 3.5.1, we showed how to recognize reserved words by creating
distinct regular expressions for each. This approach was feasible because Lex
(and Flex) allows more than one regular expression to match a character
sequence, with the earliest expression that matches taking precedence. Creating
regular expressions for each reserved word increases the number of states in
the transition table that a scanner generator creates. In as simple a language
as Pascal (which has only 35 reserved words), the number of states increases
from 37 to 165 [Gra88]. With the transition table in uncompressed form and
having 127 columns for ASCII characters (excluding null), the number of
transition table entries increases from 4,699 to 20,955. This may not be a problem
with modern multimegabyte memories. Still, some scanner generators, such
as Flex, allow you to choose to optimize scanner size or scanner speed.
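
The table sizes quoted above follow from one multiplication: an uncompressed transition table has one row per state and one column per input character. A minimal sketch of that arithmetic, where the function name `table_entries` is purely illustrative:

```python
# Uncompressed transition table: one row per state, one column per character.
ASCII_COLUMNS = 127  # ASCII characters, excluding null (figure from the text)


def table_entries(states: int, columns: int = ASCII_COLUMNS) -> int:
    """Return the number of entries in an uncompressed transition table."""
    return states * columns


# State counts are the Pascal figures cited from [Gra88].
without_keyword_patterns = table_entries(37)
with_keyword_patterns = table_entries(165)

print(without_keyword_patterns, with_keyword_patterns)  # 4699 20955
```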
Exercise 18 establishes that any regular expression may be complemented
to obtain all strings not in the original regular expression. That is, $\overline{A}$, the
complement of $A$, is regular if $A$ is. Using complementation of regular expressions
we can write a regular expression for nonreserved identifiers:
$$\overline{(\overline{\textit{ident}} \mid \texttt{if} \mid \texttt{while} \mid \ldots)}$$
That is, if we take the complement of the set containing reserved words and all
nonidentifier strings, then we get all strings that are identifiers, excluding the
reserved words. Unfortunately, neither Lex nor Flex provides a complement
operator for regular expressions (^ works only on character sets).
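
One common way to see why the complement of a regular set is regular is to work on a complete DFA and swap its accepting and non-accepting states. The sketch below is an illustration only, not the construction Exercise 18 asks for. It assumes an alphabet of uppercase letters, a single reserved word, and hypothetical helper names (`build_keyword_dfa`, `complement`, `accepts`).

```python
import string

LETTERS = string.ascii_uppercase  # assumed alphabet for this sketch
DEAD = "dead"                     # trap state that makes the DFA complete


def build_keyword_dfa(keyword):
    """Build a complete DFA over LETTERS that accepts exactly `keyword`.

    State i means "the first i characters of keyword have been matched".
    """
    delta = {}
    for i in range(len(keyword) + 1):
        for ch in LETTERS:
            matches = i < len(keyword) and ch == keyword[i]
            delta[(i, ch)] = i + 1 if matches else DEAD
    for ch in LETTERS:
        delta[(DEAD, ch)] = DEAD
    states = set(range(len(keyword) + 1)) | {DEAD}
    return states, delta, 0, {len(keyword)}


def complement(dfa):
    """Swap accepting and non-accepting states of a complete DFA."""
    states, delta, start, accepting = dfa
    return states, delta, start, states - accepting


def accepts(dfa, text):
    """Run the DFA on text, which must contain only characters in LETTERS."""
    _, delta, state, accepting = dfa
    for ch in text:
        state = delta[(state, ch)]
    return state in accepting


end_dfa = build_keyword_dfa("END")
not_end = complement(end_dfa)
print(accepts(end_dfa, "END"), accepts(not_end, "END"), accepts(not_end, "ENDS"))
# True False True
```

The complemented machine also accepts the empty string, because the complement is taken relative to all strings over the alphabet. That is why the expression for nonreserved identifiers adds the nonidentifier strings to the reserved words before complementing.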
We could just write down a regular expression directly, but this is too
complex to consider seriously. Suppose END is the only reserved word and
identifiers contain only letters. Then
$$L \mid (LL) \mid ((LLL)L^{+}) \mid ((L - E)L^{*}) \mid (L(L - N)L^{*}) \mid (LL(L - D)L^{*})$$
defines identifiers that are shorter or longer than three letters, that do not start
with E, that are without N in position two, and so on.
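
The hand-written expression for END can be checked mechanically. The sketch below transcribes it into Python's `re` syntax under two assumptions: L means a single uppercase letter, and "L minus E" means any letter other than E, written here as the character class `[A-DF-Z]`. The names `NOT_END`, `is_nonreserved_identifier`, and `mismatches` are illustrative.

```python
import re
from itertools import product

L = "[A-Z]"          # any letter (uppercase only, an assumption of this sketch)
NOT_E = "[A-DF-Z]"   # any letter except E
NOT_N = "[A-MO-Z]"   # any letter except N
NOT_D = "[A-CE-Z]"   # any letter except D

NOT_END = re.compile(
    f"{L}|{L}{L}"            # one or two letters
    f"|{L}{L}{L}{L}+"        # four or more letters
    f"|{NOT_E}{L}*"          # does not start with E
    f"|{L}{NOT_N}{L}*"       # no N in position two
    f"|{L}{L}{NOT_D}{L}*"    # no D in position three
)


def is_nonreserved_identifier(text):
    """True if the whole of text matches the hand-written expression."""
    return NOT_END.fullmatch(text) is not None


def mismatches(alphabet="ENDX", max_len=5):
    """List words where the expression disagrees with 'anything but END'."""
    bad = []
    for n in range(1, max_len + 1):
        for chars in product(alphabet, repeat=n):
            word = "".join(chars)
            if is_nonreserved_identifier(word) != (word != "END"):
                bad.append(word)
    return bad


print(is_nonreserved_identifier("END"), is_nonreserved_identifier("ENDS"), mismatches())
# False True []
```

Even with one three-letter reserved word the pattern needs six alternatives, which illustrates why writing such expressions by hand for a full reserved-word list is impractical.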
 