Java Reference
In-Depth Information
/
Assume CurrentChar contains the first character to be scanned
/
State StartState
while true do
NextState
T[ State , CurrentChar ]
if NextState =
error
then break
State NextState
CurrentChar ← read
()
if State AcceptingStates
then
/
Return or process the valid token
/
else
/
Signal a lexical error
/
Figure 3.3: Scanner driver interpreting a transition table.
1. A Fortran-like real literal (which requires either digits on one or both
sides of a decimal point or just a string of digits) can be defined as
( D + (
( D . D + )
RealLit =
λ|.
))
|
which corresponds to the DFA shown in Figure 3.5(a).
2. Another form of identifier consists of letters, digits, and underscores. It
begins with a letter and allows no adjacent or trailing underscores. It
may be defined as
ID = L ( L | D ) ( ( L | D ) + )
This definition includes identifiers such as sum or unit cost but ex-
cludes one, two ,andgrand total. The corresponding DFA is shown
in Figure 3.5(b).
Transducers
The scanners shown in Figures 3.3 and 3.4 begin processing characters at some
point in the input stream. They finish either by accepting the token for which
they are programmed or by signaling a lexical error. It is often useful for a
scanner to process the input stream not only to recognize tokens but also to
associate a semantic value with the discovered tokens. For example, a scanner
can find that the input 431 is an integer constant, but it is useful to associate
the value of 431 with that token.
An FA that analyzes or transforms its input beyond simply accepting to-
kens is called a transducer . The FAs shown in Figure 3.5 recognize a particular
kind of constant and identifier. A transducer that recognizes constants might
 
Search WWH ::




Custom Search