from variable names such as a and b. For languages of greater complexity,
the techniques presented in Chapter 3 automate much of this task.
2. The parser processes tokens produced by the scanner, determines the
syntactic validity of the token stream, and creates an abstract syntax
tree (AST) suitable for the compiler's subsequent activities. Given the
simplicity of ac, we write its parser ad hoc using the recursive-descent style
presented in Chapter 5. While such parsers work well in many cases,
Chapter 6 presents a more popular technique for generating parsers
automatically.
3. The AST created by the parsing task is next traversed to create a symbol
table. This table associates type and other contextual information with
variables used in an ac program. Most programming languages allow
the use of an unbounded number of variable names. Techniques for
processing symbols are discussed more generally in Chapter 8. This
task can be greatly simplified for ac, which allows the use of at most 23
variable names.
4. The AST is next traversed to perform semantic analysis. For ac, such
analysis is fairly minimal. For most programming languages, multiple
passes over the AST may be required to enforce programming language
rules that are difficult to check in the parsing task. Semantic analysis
often decorates or transforms portions of an AST as the actual meaning
of such portions becomes clearer. For example, an AST node for the
+ operator may be replaced with the actual meaning of +, which may
be floating-point or integer addition.
5. Finally, the AST is traversed to generate a translation of the original
program. Necessities such as register allocation and opportunities for
program optimization may be implemented as phases that precede code
generation. For ac, translation is sufficiently simple to be accommodated
in a single code-generation pass.
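
The symbol-table and semantic-analysis tasks (items 3 and 4 above) can be
illustrated with a minimal sketch. It is not the book's implementation. The
names SymbolTable, Var, Plus, and decorate are hypothetical, and two things
are assumed rather than taken from the text above: that ac variable names are
single lowercase letters with f, i, and p reserved (which would account for
the limit of 23), and that an addition with any float operand is treated as
floating-point addition.

```python
import string
from dataclasses import dataclass
from typing import Optional, Union

# Assumption: ac names are single lowercase letters, with f, i, p reserved.
RESERVED = set("fip")
VALID_NAMES = [c for c in string.ascii_lowercase if c not in RESERVED]


class SymbolTable:
    """Maps each declared variable name to its type ("int" or "float")."""

    def __init__(self) -> None:
        self._types = {}

    def declare(self, name: str, typ: str) -> None:
        if name not in VALID_NAMES:
            raise ValueError(f"invalid variable name: {name!r}")
        if name in self._types:
            raise ValueError(f"duplicate declaration: {name!r}")
        self._types[name] = typ

    def lookup(self, name: str) -> str:
        if name not in self._types:
            raise KeyError(f"undeclared variable: {name!r}")
        return self._types[name]


@dataclass
class Var:
    name: str


@dataclass
class Plus:
    left: "Node"
    right: "Node"
    op: Optional[str] = None  # filled in by semantic analysis


Node = Union[Var, Plus]


def decorate(node: Node, table: SymbolTable) -> str:
    """Return the node's type, annotating each Plus with its concrete meaning."""
    if isinstance(node, Var):
        return table.lookup(node.name)
    left_type = decorate(node.left, table)
    right_type = decorate(node.right, table)
    # Assumption: any float operand makes the whole addition floating point.
    if "float" in (left_type, right_type):
        node.op = "float_add"
        return "float"
    node.op = "int_add"
    return "int"
```

With a declared as an int and b as a float, decorating Plus(Var("a"),
Var("b")) sets the node's op field to "float_add" and reports the type
"float". Because the set of legal names is fixed and small, an ordinary
dictionary is enough to serve as the table here.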
2.4 Scanning
The scanner's job is to translate a stream of characters into a stream of tokens,
where each token represents an instance of some terminal symbol. Rigorous
methods for automatically constructing scanners based on regular expressions
(such as those shown in Figure 2.3) are covered in Chapter 3. Here, the
job at hand is sufficiently simple to undertake manually. Figure 2.5 shows
pseudocode for a basic, ad hoc scanner that finds tokens for the ac language.
Each token found by the scanner has the following two components:
 
 