Top-Down Parsing - Crafting a Compiler

Java Reference

In-Depth Information

function Predict( p : A

→X

...X m ): Set

1

ans ←

First(

X 1 ...X m )

1

if RuleDerivesEmpty( p )

then

2

ans ← ans ∪

Follow(A)

3

return ( ans )

end

Figure 5.1: Computation of Predict sets.

for all nonterminal symbols in a grammar, then a deterministic parser can be

constructed and the associated grammar is determined to be LL(1).

We next consider a rule p in greater detail and show how to compute

Predict( p ). Consider a production p : A

0, there

are no symbols on A's RHS, which is equivalent by convention to the rule

A

→X

...X m , m ≥

0. When m =

1

. As shown in Figure 5.1, the set of symbols that predict rule p is drawn

from one or both of the following:

→λ

•

The set of possible terminal symbols that are first produced in some

derivation from

X

...X m

1

•

Those terminal symbols that can follow A in some sentential form

At Marker 1 , the algorithm of Figure 5.1 initializes ans to First(

...X m ) ,

which is the set of terminal symbols that can appear first (leftmost) in any

derivation of

X

1

...X m . The algorithm for computing this set is given in Fig-

ure 4.8 on page 130. Marker 2 detects when

X

1

, using the results

of the algorithm presented in Figure 4.7 on page 129. RuleDerivesEmpty( p ) is

true if, and only if, production p can derive

X

1

...X m ⇒ λ

. In this case, Marker 3 includes

those symbols in Follow( A ) , as computed by the algorithm in Figure 4.11 on

page 135. Such symbols can follow A after A

λ

⇒ λ

. Thus, the function shown

in Figure 5.1 computes the set of length-1 token strings that predict rule p .By

convention,

λ

is not a terminal symbol, so it does not participate in any Predict

set.

In an LL(1) grammar, the productions for each nonterminal A must have

disjoint predict sets , as computed with one symbol of lookahead. Most pro-

gramming languages have LL(1) grammars, but there are some constructs that

requires special attention (Section 5.6). However, not all CFGss are LL(1). For

such grammars, the following may apply:

•

More lookahead may be needed, in which case the grammar is LL( k )for

some constant k >

1.

•

A more powerful parsing method may be required. Chapter 6 describes

such methods, but they also have limits in their applicability.

Crafting a Compiler

Search WWH ::

Custom Search

Home