Java Reference
In-Depth Information
3.3.1 Parsing by Recursive Descent
Parsing by recursive descent involves writing a method (or procedure) for parsing each non-
terminal according to the production rules that define that non-terminal. Depending on
the next unscanned input symbol, the method decides which rule to apply and then scans
any terminals (tokens) in the right-hand side by calling upon the
Scanner
, and parses any
non-terminals by recursively invoking the methods that parse them.
This is a strategy we use in parsing j-- programs. We already saw the method
compilationUnit()
that parses a j--
compilationUnit
in Section 1.4.3.
As another example, consider the rule defining qualiedIdentier:
(3.18)
qualiedIdentier ::=
<identifier>
f
.<identifier>
g
As we saw above, parsing a qualiedIdentier such as
java.lang.Class
is straightfor-
ward:
1. One looks at the next incoming token and if it is an identifier, scans it. If it is not an
identifier, then one raises an error.
2. Repeatedly, as long as the next incoming token is a period:
a. One scans the period.
b. One looks at the next incoming token and if it is an identifier, scans it. Otherwise,
one raises an error.
The method for implementing this not only parses the input, but also constructs an
AST node for recording the fully qualified name as a string. The method is quite simple but
introduces two helper methods:
have()
and
mustBe()
, which we use frequently throughout
the parser.
privateTypeNamequalifiedIdentifier(){
intline=scanner.token().line();
mustBe(IDENTIFIER);
StringqualifiedIdentifier=scanner.previousToken().image();
while(have(DOT)){
mustBe(IDENTIFIER);
qualifiedIdentifier+="."
+scanner.previousToken().image();
}
returnnewTypeName(line,qualifiedIdentifier);
}
The method
have()
is a predicate. It looks at the next incoming token (supplied by the
Scanner
), and if that token matches its argument, then it scans the token in the input and
returns
true
. Otherwise, it scans nothing and returns
false
.
The method
mustBe()
requires that the next incoming token match its argument. It
looks at the next token and if it matches its argument, it scans that token in the input.
Otherwise, it raises an error
3
.
As another example, consider our syntax for statements.
statement ::= block
j
<identifier>:
statement
j
if
parExpression statement [
else
statement]
j
while
parExpression statement
j
return
[expression]
;
j
;
j statementExpression
;
(3.19)
3
As we shall see below,
mustBe()
provides a certain amount of error recovery.
Search WWH ::
Custom Search