Java Reference
In-Depth Information
the simplest LR(0) parser, which lacks su
cient power for most applications.
After examining problems that arise in LR(0) construction we turn to the more
powerful LR(1) parsing method and its variants.
When LRparser construction fails, the associated grammar may be ambigu-
ous (as discussed in Section 6.4.1). For other grammars, the parser may require
more information about the unconsumed input string (lookahead). In fact,
some grammars require an unbounded amount of lookahead (Section 6.4.2). In
either case, the parser-generator identifies inadequate parsing states that are
useful for resolving the problem. While it can be shown that there can be
no algorithm to determine if a grammar is ambiguous, Section 6.4 describes
techniques that work well in practice.
6.3 LR(0) Table Construction
The table-construction methods discussed in this chapter analyze a grammar
to devise a parse table suitable for use in the generic parser presented in Fig-
ure 6.3. Each symbol in the terminal and nonterminal alphabets corresponds
to a column of the table. The analysis proceeds by exploring the state-space
of the parser. Each state corresponds to a row of the parse table. Because the
state-space is necessarily finite, this exploration must terminate at some point.
After the parse table's rows have been determined, analysis then attempts
to fill in the cells of the table. Because we are interested only in deterministic
parsers, each table cell can hold one entry. An important outcome of the LR
construction methods is the determination of inadequate states —states that
lack su
cient information to place at most one parsing action in each column.
We begin by considering LR(0) table construction for the grammar shown
in Figure 6.2. In constructing the parse table, we are required to consider the
parser's progress in recognizing a rule's right-hand side (RHS). For example,
consider the rule E
plus E E. Prior to reducing the RHS to E, each component
of the RHS must be found. A plus must be identified, then two Esmustbe
found. Once these three symbols are on top-of-stack, then it is possible for the
parser to apply the reduction and replace the three symbols with the left-hand
side (LHS) symbol E.
To keep track of the parser's progress, we introduce the notion of an LR (0)
item —a grammar production with a bookmark that indicates the current
progress through the production's RHS. The bookmark is analogous to the
“progress bar” present in many applications, which indicates the completed
fraction of a task. Figure 6.8 shows the progress of the bookmark symbol (
)
through all of the possible LR(0) items for the production E
plus E E. A
fresh item has its marker at the extreme left, as in E
→•
plus E E.When
the marker is at the extreme right, as in E
plus E E
, we say the item
is reducible . A rule of the form A
→λ
deserves special consideration. The
 
 
Search WWH ::




Custom Search