Java Reference
In-Depth Information
3.3.1 Parsing by Recursive Descent
Parsing by recursive descent involves writing a method (or procedure) for parsing each non-
terminal according to the production rules that define that non-terminal. Depending on
the next unscanned input symbol, the method decides which rule to apply and then scans
any terminals (tokens) in the right-hand side by calling upon the Scanner , and parses any
non-terminals by recursively invoking the methods that parse them.
This is a strategy we use in parsing j-- programs. We already saw the method
compilationUnit() that parses a j-- compilationUnit in Section 1.4.3.
As another example, consider the rule defining qualiedIdentier:
(3.18)
qualiedIdentier ::= <identifier> f .<identifier> g
As we saw above, parsing a qualiedIdentier such as java.lang.Class is straightfor-
ward:
1. One looks at the next incoming token and if it is an identifier, scans it. If it is not an
identifier, then one raises an error.
2. Repeatedly, as long as the next incoming token is a period:
a. One scans the period.
b. One looks at the next incoming token and if it is an identifier, scans it. Otherwise,
one raises an error.
The method for implementing this not only parses the input, but also constructs an
AST node for recording the fully qualified name as a string. The method is quite simple but
introduces two helper methods: have() and mustBe() , which we use frequently throughout
the parser.
privateTypeNamequalifiedIdentifier(){
intline=scanner.token().line();
mustBe(IDENTIFIER);
StringqualifiedIdentifier=scanner.previousToken().image();
while(have(DOT)){
mustBe(IDENTIFIER);
qualifiedIdentifier+="."
+scanner.previousToken().image();
}
returnnewTypeName(line,qualifiedIdentifier);
}
The method have() is a predicate. It looks at the next incoming token (supplied by the
Scanner ), and if that token matches its argument, then it scans the token in the input and
returns true . Otherwise, it scans nothing and returns false .
The method mustBe() requires that the next incoming token match its argument. It
looks at the next token and if it matches its argument, it scans that token in the input.
Otherwise, it raises an error 3 .
As another example, consider our syntax for statements.
statement ::= block
j <identifier>: statement
j if parExpression statement [ else statement]
j while parExpression statement
j return [expression] ;
j ;
j statementExpression ;
(3.19)
3 As we shall see below, mustBe() provides a certain amount of error recovery.
 
Search WWH ::




Custom Search