Introduction - Crafting a Compiler


Chapters 5 and 6. Parsers are typically driven by tables created from a CFG

by a parser generator.

The parser verifies correct syntax. If a syntax error is found, it issues a

suitable error message. Also, it may be able to repair the error (to form a

syntactically valid program) or to recover from the error (to allow parsing to

be resumed). In many cases, syntactic error recovery or repair can be done

automatically by consulting structures created by a suitable parser generator.
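
To make the idea of error recovery concrete, here is a minimal sketch of one common strategy, panic-mode recovery, written by hand rather than driven by generator-built tables as described above. The function name parse_statements and the toy statement form NAME = NUMBER ; are assumptions made for illustration only.

```python
def parse_statements(tokens):
    """Parse statements of the form  NAME '=' NUMBER ';'  from a token list.

    Returns (statements, errors). On a syntax error, a message is recorded
    and tokens are skipped up to the next ';' so that parsing can resume.
    """
    statements, errors = [], []
    i = 0
    while i < len(tokens):
        stmt = tokens[i:i + 4]
        ok = (
            len(stmt) == 4
            and stmt[0].isidentifier()
            and stmt[1] == "="
            and stmt[2].isdigit()
            and stmt[3] == ";"
        )
        if ok:
            statements.append((stmt[0], int(stmt[2])))
            i += 4
            continue
        errors.append(f"syntax error in statement starting at token {i}")
        # Panic-mode recovery: discard tokens up to and including the next ';'.
        while i < len(tokens) and tokens[i] != ";":
            i += 1
        i += 1
    return statements, errors
```

Given the tokens for x = 1 ; y = = 2 ; z = 3 ; this sketch records one error for the middle statement and still accepts the statements on either side of it, which is the point of recovery: one mistake does not stop the parser from checking the rest of the program.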

As syntactic structure is recognized, the parser usually builds an AST as

a concise representation of program structure. The AST then serves as a basis

for semantic processing. ASTs are discussed in Chapters 2 and 7.
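
As an illustration of how an AST can be built while syntax is being recognized, the following sketch parses simple arithmetic expressions. It is a hand-written recursive-descent parser, not the table-driven kind a parser generator would produce, and the names Num, BinOp, tokenize, and Parser are hypothetical.

```python
import re
from dataclasses import dataclass

@dataclass
class Num:
    value: int

@dataclass
class BinOp:
    op: str
    left: object
    right: object

def tokenize(text):
    # Simplification: characters that match no token are silently ignored.
    return re.findall(r"\d+|[-+*/()]", text)

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def parse(self):
        node = self.expr()
        if self.peek() is not None:
            raise SyntaxError(f"unexpected token {self.peek()!r}")
        return node

    def expr(self):    # expr -> term (('+' | '-') term)*
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            node = BinOp(op, node, self.term())
        return node

    def term(self):    # term -> factor (('*' | '/') factor)*
        node = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            node = BinOp(op, node, self.factor())
        return node

    def factor(self):  # factor -> NUMBER | '(' expr ')'
        tok = self.advance()
        if tok is not None and tok.isdigit():
            return Num(int(tok))
        if tok == "(":
            node = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {tok!r}")
```

Calling Parser(tokenize("(1 + 2) * 3")).parse() yields a BinOp for * whose left child is the BinOp for +. The parentheses guide the parse but do not appear in the tree, which is one sense in which the AST is a concise representation of program structure.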

The type checker checks the static semantics of each AST node. That is, it

verifies that the construct the node represents is legal and meaningful (that

all identifiers involved are declared, that types are correct, and so on). If the

construct is semantically correct, the type checker decorates the AST node by

adding type information to it. If a semantic error is discovered, a suitable error

message is issued.
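
The decoration step can be sketched as follows. This is a minimal example for a toy language with only int and bool types. The Node class, its static_type field, and the check function are assumptions made for illustration, not the book's design.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    kind: str                          # "int", "bool", "var", "add", or "and"
    value: object = None               # literal value or identifier name
    children: tuple = ()
    static_type: Optional[str] = None  # filled in ("decorated") by check()

def check(node, symbols, errors):
    """Decorate node with its type, or record an error message.

    symbols maps declared identifiers to their types. Returns the node's
    type, or None if the construct is not semantically correct.
    """
    if node.kind in ("int", "bool"):
        node.static_type = node.kind
    elif node.kind == "var":
        if node.value in symbols:
            node.static_type = symbols[node.value]
        else:
            errors.append(f"undeclared identifier {node.value!r}")
    elif node.kind in ("add", "and"):
        want = "int" if node.kind == "add" else "bool"
        operand_types = [check(c, symbols, errors) for c in node.children]
        if all(t == want for t in operand_types):
            node.static_type = want
        elif None not in operand_types:
            # Report only when the operands themselves were well typed,
            # so one mistake does not produce a cascade of messages.
            errors.append(
                f"operands of {node.kind!r} must be {want}, got {operand_types}"
            )
    return node.static_type
```

With symbols = {"x": "int"}, checking the tree for x + 1 decorates every node with "int" and leaves the error list empty, while checking y + 1 records a single message about the undeclared identifier y and leaves the add node undecorated.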

Type checking is purely dependent on the semantic rules of the source

language. It is independent of the compiler's target.

If an AST node is semantically correct, it can be translated into IR code that

correctly implements the meaning of the AST node. For example, an AST for

a while loop contains two subtrees, one representing the loop's expression and

the other representing the loop's body. However, nothing in the AST explicitly

captures the notion that a while loop loops! This meaning is captured when a

while loop's AST is translated to IR form. In the IR, the notion of testing the

value of the loop control expression and conditionally executing the loop body

is made explicit.
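
The while-loop example can be sketched in code. The AST node below holds only the two subtrees, and the looping behavior appears only in the generated instructions. The IR syntax (labels, if_false, goto) and the names While and translate are invented for this illustration and do not correspond to any particular compiler's IR.

```python
from dataclasses import dataclass
from itertools import count

@dataclass
class While:
    cond: str   # loop control expression, kept as text for brevity
    body: list  # statements: plain strings or nested While nodes

def translate(stmt, labels):
    """Return a list of IR instructions (strings) for one statement.

    labels is an iterator of fresh integers used to name jump targets.
    """
    if isinstance(stmt, While):
        top, done = f"L{next(labels)}", f"L{next(labels)}"
        # The test and the conditional exit are made explicit here.
        code = [f"{top}:", f"  if_false {stmt.cond} goto {done}"]
        for s in stmt.body:
            code += translate(s, labels)
        # The backward jump is what makes the loop actually loop.
        code += [f"  goto {top}", f"{done}:"]
        return code
    return [f"  {stmt}"]

if __name__ == "__main__":
    loop = While("i < n", ["i = i + 1"])
    print("\n".join(translate(loop, count(1))))
```

Nothing in the While node mentions repetition. The label at the top, the conditional jump out of the loop, and the unconditional jump back are all introduced by the translator.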

The translator is largely dictated by the semantics of the source language.

Little of the nature of the target machine needs to be made evident. As a

convenience during translation, some general aspects of the target machine

may be exploited (for example, that the machine is byte-addressable or that

it has a runtime stack). However, detailed information on the nature of the

target machine (operations available, addressing, register characteristics, etc.)

is reserved for the code-generation phase.

In simple, nonoptimizing compilers, the translator may generate target

code directly without using an explicit IR. This simplifies a compiler's design

by removing an entire phase. However, it also makes retargeting the compiler
