Parsing - Introduction to Compiler Construction in a Java World

Java Reference

In-Depth Information

Algorithm 3.11 Constructing the LR(1) Parsing Tables for a Context-Free Grammar

Input: a context-free grammar G = (N;T;S;P)

Output: the LF(1) tables Action and Goto

1. Compute the LR(1) canonical collection c = fs 0 ;s 1 ;:::;s n g. State i of the parser

corresponds to the item set s i . State 0, corresponding to the item set s 0 , which

contains the item [S 0 ::= S, # ], is the parser's initial state

2. The Action table is constructed as follows:

a. For each transition, goto(si, i , a ) = s j , where a is a terminal, set Action[i, a ] =

sj. The s stands for \shift"

b. If the item set s k contains the item [S 0 ::= S, # ], set Action[k, # ] = accept

c. For all item sets si i , if si i contains an item of the form [Y ::= , a ], set Action[i,

a ] = rp, where p is the number corresponding to the rule Y ::= . The r stands

for \reduce"

d. All undefined entries in Action are set to error

3. The Goto table is constructed as follows:

a. For each transition, goto(si, i , Y ) = s j , where Y is a non-terminal, set Goto[i,

Y ] = j.

b. All undefined entries in Goto are set to error

If all entries in the Action table are unique, then the grammar G is said to be LR(1).

Example. Let us say we are computing the Action and Goto tables for the arithmetic

expression grammar in (3.31). We apply Algorithm 3.10 for computing the LR(1) canon-

ical collection. This produces the twenty-two item sets shown in the table before. Adding

the extra production rule and enumerating the production rules gives us the augmented

grammar in (3.41).

0. E 0 ::= E

1. E ::= E + T

2. E ::= T

3. T ::= T * F

4. T ::= F

5. F ::= ( E )

6. F ::= id

(3.41)

We must now apply steps 2 and 3 of Algorithm 3.11 for constructing the tables Action

and Goto. Both tables will each have twenty-two rows for the twenty-two states, derived

from the twenty-two item sets in the LR(1) canonical collection: 0 to 21. The Action table

will have six columns, one for each terminal symbol: + , * , ( , ) , id , and the terminator # .

The Goto table will have three columns, one for each of the original non-terminal symbols:

E, T, and F. The newly added non-terminal E 0 does not play a role in the parsing process.

The tables are illustrated in Figure 3.7. To see how these tables are constructed, let us

derive the entries for several states.

First, let us consider the first four states of the Action table:

Search WWH ::

Custom Search

Home