Parsing - Introduction to Compiler Construction in a Java World - page 72

Java Reference

In-Depth Information

3.3.1 Parsing by Recursive Descent

Parsing by recursive descent involves writing a method (or procedure) for parsing each non-

terminal according to the production rules that define that non-terminal. Depending on

the next unscanned input symbol, the method decides which rule to apply and then scans

any terminals (tokens) in the right-hand side by calling upon the Scanner , and parses any

non-terminals by recursively invoking the methods that parse them.

This is a strategy we use in parsing j-- programs. We already saw the method

compilationUnit() that parses a j-- compilationUnit in Section 1.4.3.

As another example, consider the rule defining qualiedIdentier:

(3.18)

qualiedIdentier ::= <identifier> f .<identifier> g

As we saw above, parsing a qualiedIdentier such as java.lang.Class is straightfor-

ward:

1. One looks at the next incoming token and if it is an identifier, scans it. If it is not an

identifier, then one raises an error.

2. Repeatedly, as long as the next incoming token is a period:

a. One scans the period.

b. One looks at the next incoming token and if it is an identifier, scans it. Otherwise,

one raises an error.

The method for implementing this not only parses the input, but also constructs an

AST node for recording the fully qualified name as a string. The method is quite simple but

introduces two helper methods: have() and mustBe() , which we use frequently throughout

the parser.

privateTypeNamequalifiedIdentifier(){

intline=scanner.token().line();

mustBe(IDENTIFIER);

StringqualifiedIdentifier=scanner.previousToken().image();

while(have(DOT)){

mustBe(IDENTIFIER);

qualifiedIdentifier+="."

+scanner.previousToken().image();

}

returnnewTypeName(line,qualifiedIdentifier);

}

The method have() is a predicate. It looks at the next incoming token (supplied by the

Scanner ), and if that token matches its argument, then it scans the token in the input and

returns true . Otherwise, it scans nothing and returns false .

The method mustBe() requires that the next incoming token match its argument. It

looks at the next token and if it matches its argument, it scans that token in the input.

Otherwise, it raises an error 3 .

As another example, consider our syntax for statements.

statement ::= block

j <identifier>: statement

j if parExpression statement [ else statement]

j while parExpression statement

j return [expression] ;

j ;

j statementExpression ;

(3.19)

3 As we shall see below, mustBe() provides a certain amount of error recovery.

Next Page

Introduction to Compiler Construction in a Java World

Search WWH ::

Custom Search

Home