%%
f    { return(FLOATDCL); }
i    { return(INTDCL); }
p    { return(PRINT); }
%%
Figure 3.7: A Lex definition for ac's reserved words.
declarations
%%
regular expression rules
%%
subroutine definitions
Figure 3.8: The structure of Lex definition files.
produced by the yacc parser generator (see Chapter 7), is often used to de-
fine shared token codes. A Lex token specification consists of three sections
delimited by the pair %%. The general form of a Lex specification is shown in
Figure 3.8.
In the simple example shown in Figure 3.7, we use only the second sec-
tion, in which regular expressions and corresponding C code are specified.
The regular expressions are simple single-character strings that match only
themselves. The code executed returns a constant value representing the ap-
propriate ac token.
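To make the effect of these three rules concrete, the following C sketch mimics what the generated scanner does for a single input character. It is an illustration only, not the code Lex actually emits: the function name reserved_word_token, the UNKNOWN code, and the numeric values of the token codes are assumptions made for this example, since in practice the token codes would come from a shared definition.

```c
/* Hypothetical token codes; real values would come from a shared header. */
enum ac_token { FLOATDCL = 1, INTDCL, PRINT, UNKNOWN };

/*
 * Map one input character to the token code that the matching rule
 * in Figure 3.7 returns. Characters matched by no rule yield UNKNOWN,
 * a placeholder added for this sketch.
 */
enum ac_token reserved_word_token(int c)
{
    switch (c) {
    case 'f': return FLOATDCL;  /* rule: f { return(FLOATDCL); } */
    case 'i': return INTDCL;    /* rule: i { return(INTDCL); }   */
    case 'p': return PRINT;     /* rule: p { return(PRINT); }    */
    default:  return UNKNOWN;   /* no reserved-word rule matches */
    }
}
```

Calling reserved_word_token('i') returns INTDCL, just as the second rule would when the scanner reads the character i.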
We could quote the strings representing the reserved keywords (f, i, or
p), but since these strings contain no delimiters or operators, quoting them
is unnecessary. If you want to quote such strings to avoid any chance of
misinterpretation, that is allowed in Lex.
3.5.2 The Character Class
Our specification so far is incomplete. None of the other tokens in ac have
been correctly handled, particularly identifiers and numbers. To do this, we
introduce a useful concept: the character class. A character class is a set of
characters treated identically in a token definition. Thus, in the definition of
an ac identifier, all letters (except f, i, and p) form a class since any of them
can be used to form an identifier. Similarly, in a number, any of the ten digit
characters can be used.
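The two classes just described can be sketched as simple membership tests in C. This is only an illustration of the idea of a character class, not how Lex represents one internally. The function names are hypothetical, and the letter test assumes that identifiers use lowercase letters only and that the character set is ASCII, where 'a' through 'z' are contiguous.

```c
#include <stdbool.h>

/* Class of the ten digit characters; C guarantees '0'..'9' are contiguous. */
bool is_ac_digit(int c)
{
    return c >= '0' && c <= '9';
}

/*
 * Class of letters that may form an ac identifier: every lowercase
 * letter except the reserved words f, i, and p.
 * Assumption: ASCII, so 'a'..'z' form one contiguous range.
 */
bool is_ac_identifier_letter(int c)
{
    if (c < 'a' || c > 'z') {
        return false;           /* not a lowercase letter at all */
    }
    return c != 'f' && c != 'i' && c != 'p';
}
```

Every character for which is_ac_identifier_letter returns true is handled the same way by the identifier rule, which is exactly what membership in a class means.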
 
 