Java Reference
In-Depth Information
The recursive-descent and table-driven LL(1) parsers constructed in this chap-
ter are based on Predict sets. These sets are, in turn, based on First and Follow
information that is computed globally for a grammar. In particular, recall that
the production A
is predicted by the symbols in Follow(A).
Suppose that A occurs in the productions V
→λ
→
vAband W
→
wAc,as
shown in Figure 5.25. For this grammar, the production A
→λ
is predicted
by the symbols in Follow(A)
. Examining the grammar in greater
detail, we see that the application of A
= {
b
,
c
}
should be followed only by b if
the derivation stems from V. However, if the derivation stems from W,then
A should be followed only by c. As described in this chapter, LL(1) parsing
cannot distinguish between contexts calling for application of A
→λ
→λ
. fthe
next input token is b or c, the production A
is applied, even though the
next input token may not be acceptable. If the wrong symbol is present,
the error is detected later, when matching the symbol after A in V
→λ
→
vAbor
W
wAc. Exercise 23 considers how such errors can be caught sooner by
full
LL(1) parsers, which are more powerful than the
strong
LL(1) parsers defined
in this chapter.
→
The LR(1) parsers described in Chapter 6 are formally more powerful than
the LL(1) parsers. However, the continued popularity of LL(1) parsers can be
attributed, in part, to their superior error diagnosis and error recovery. Be-
cause of the predictive nature of an LL(1) leftmost parse, the parser can easily
extend the parsed portion of a faulty program into a syntactically valid pro-
gram. When an error is detected, the parser can produce messages informing
the programmer of what tokens were
expected
so that the parse could have
continued.
A simple and uniform approach to error recovery in LL(1) parsers is dis-
cussed by Wirth [Wir76]. When applied to recursive-descent parsers, the
parsing procedures described in Section 5.3 are augmented with an extra pa-
rameter that receives a set of terminal symbols. Consider the parsingprocedure
A(
ts
,
termset
) associated with some nonterminal A. When A is called during
operation of the recursive-descent parser, any symbol passed via
termset
can
legitimately serve as the lookahead symbol when
this
instance of A returns. For
example, consider the grammar and Wirth-style parsing procedures shown in
Figure 5.26. Error recovery is placed in E so that if an a is not found, the input
is advanced until a symbol is found that
can
follow E. The set of symbols
passed to E includes those symbols passed to S as well as a closing bracket (if
called from Marker
18
) or a closing parenthesis (if called from Marker
19
).
If E detects an error, then the input is advanced until a symbol in
termset
is