Parsing - Introduction to Compiler Construction in a Java World

Java Reference

In-Depth Information

E ) E + T

) E + T * F

) E + T *id

) E + F *id

) E +id*id

) T +id*id

) F +id*id

) id+id*id

That it is in reverse makes sense because this is a bottom-up parse. The question arises:

How does the parser know when to shift and when to reduce? When reducing, how many

symbols on top of the stack play a role in the reduction? And, when reducing, by which rule

does it make its reduction?

For example, in the derivation above, when the stack contains # E + T and the next

incoming token is a * , how do we know that we are to shift (the * onto the stack) rather

than reduce either the E + T to an E or the T to an E?

Notice two things:

1. Ignoring the terminator # , the stack configuration combined with the unscanned input

stream represents a sentential form in a right-most derivation of the input.

2. The part of the sentential form that is reduced to a non-terminal is always on top of

the stack. So all actions take place at the top of the stack. We either shift a token

onto the stack, or we reduce what is already there.

We call the sequence of terminals on top of the stack that are reduced to a single non-

terminal at each reduction step the handle. More formally, in a right-most derivation,

S ) Y w ) w ) uw; where uw is the sentence,

the handle is the rule Y ::= and a position in the right sentential form w where

may be replaced by Y to produce the previous right sentential form Y w in a right-most

derivation from the start symbol S. Fortunately, there are a finite number of possible handles

that may appear on top of the stack.

So, when a handle appears on top of the stack,

Stack

Input

#

w

we reduce that handle ( to Y in this case).

Now if is the sequence X 1 ;X 2 ;:::;X n , then we call any subsequence, X 1 ;X 2 ;:::;X i ,

for i n a viable prex. Only viable prexes may appear on the top of the parse stack. If

there is not a handle on top of the stack and shifting the first unscanned input token from

the input to the stack results in a viable prefix, a shift is called for.

3.4.2 LR(1) Parsing

One way to drive the shift/reduce parser is by a kind of DFA that recognizes viable prefixes

and handles. The tables that drive our LR(1) parser are derived from this DFA.

Search WWH ::

Custom Search

Home