Java Reference
In-Depth Information
Of course, this approach of first generating the LR(1) canonical collection of states and
then merging states to produce the LALR(1) collection consumes a great deal of space. But
once the tables are constructed, they are small and workable. An alternative approach, which
does not consume so much space, is to do the merging as the LR(1) states are produced.
Mergingthestatesastheyareconstructed
Our algorithm for computing the LALR(1) canonical collection is a slight variation on
Algorithm 3.10; it is Algorithm 3.13.
Algorithm 3.13 Computing the LALR(1) Collection of States
Input: a context-free grammar G = (N;T;S;P)
Output: the canonical LALR(1) collection of states c = fs 0 ;s 1 ;:::;s n g
Dene an augmented grammar G 0 which is G with the added non-terminal S 0 and added
production rule S 0 ::= S, where S is G's start symbol. The following steps apply to G 0 .
Enumerate the production rules beginning at 0 for the newly added production
c fs 0 g, where s 0 = closure(f[S 0 ::= S, # ]g)
repeat
for each s in c, and for each symbol X 2 T [N do
if goto(s;X) 6= ; and goto(s;X) 2 c then
Add s = goto(s;X) to c
Check to see if the cores of an existing state in c are equivalent to the cores of s
If so, merge s with that state
Otherwise, add s to the collection c
end if
end for
until no new states are added to c
There are other enhancements we can make to Algorithm 3.13 to conserve even more
space. For example, as the states are being constructed, it is enough to store their kernels.
The closures may be computed when necessary, and even these may be cached for each
non-terminal symbol.
LALR(1) Conflicts
There is the possibility that the LALR(1) table for a grammar may have conflicts where the
LR(1) table does not. Therefore, while it should be obvious that every LALR(1) grammar
is an LR(1) grammar, not every LR(1) grammar is an LALR(1) grammar.
How can these conflicts arise? A shift-reduce conflict cannot be introduced by merging
two states, because we merge two states only if they have the same core items. If the merged
state has an item that suggests a shift on a terminal a and another item that suggests a
reduce on the lookahead a , then at least one of the two original states must have contained
both items, and so caused a conflict.
On the other hand, merging states can introduce reduce-reduce conflicts. An example
arises in grammar given in Exercise 3.20. Even though LALR(1) grammars are not as pow-
erful as LR(1) grammars, they are suciently powerful to describe most programming lan-
guages. This, together with their small (relative to LR) table size, makes the LALR(1) fam-
ily of grammars an excellent candidate for the automatic generation of parsers. Stephen C.
Johnson's YACC, for \Yet Another Compiler-Compiler" [Johnson, 1975], based on LALR(1)
techniques, was probably the first practical bottom-up parser generator. GNU has developed
an open-source version called Bison [Donnelly and Stallman, 2011].
 
Search WWH ::




Custom Search