Parsing - Introduction to Compiler Construction in a Java World

Java Reference

In-Depth Information

faster running programs so it would be advantageous if we might be able to reduce the

number of states. LALR(1) is a parsing method that does just this.

If you look at the LR(1) canonical collection of states in Figure 3.8, which we computed

for our example grammar (3.31), you will nd that many states are virtually identical|they

dier only in their lookahead tokens. Their cores|the core of an item is just the rule and

position marker portion|are identical. For example, consider states s 2 and s 9 :

s 2 = f[E ::= T , +/# ],

[T ::= T * F, +/*/# ]g

s 9 = f[E ::= T , +/) ],

[T ::= T * F, +/*/) ]g

They differ only in the lookaheads # and ) . Their cores are the same:

s 2 = f[E ::= T ],

[T ::= T * F]g

s 9 = f[E ::= T ],

[T ::= T * F]g

What happens if we merge them, taking a union of the items, into a single state, s 2:9 ?

Because the cores are identical, taking a union of the items merges the lookaheads:

s 2:9 = f[E ::= T , +/)/# ],

[T ::= T * F, +/*/)/# ]g

Will this cause the parser to carry out actions that it is not supposed to? Notice that

the lookaheads play a role only in reductions and never in shifts. It is true that the new

state may call for a reduction that was not called for by the original LR(1) states; yet an

error will be detected before any progress is made in scanning the input.

Similarly, looking at the states in Figure 3.8, one can merge states s 3 and s 10 , s 2 and

s 9 , s 4 and s 11 , s 5 and s 12 , s 6 and s 16 , s 2 and s 17 , s 8 and s 18 , s 13 and s 19 , s 14 and s 20 , and

s 15 and s 21 . This allows us to reduce the number of states by ten. In general, for bona fide

programming languages, one can reduce the number of states by an order of magnitude.

LALR(1) Table Construction

There are two approaches to computing the LALR(1) states, and so the LALR(1) parsing

tables.

LALR(1)tableconstructionfromtheLR(1)states

In the first approach, we first compute the full LR(1) canonical collection of states, and

then perform the state merging operation illustrated above for producing what we call the

LALR(1) canonical collection of states.

Search WWH ::

Custom Search

Home