System.out.println("Hello,World!");
}
}
The scanner breaks the program text into atomic tokens. For example, it recognizes each
of import , java , . , lang , . , System , and ; as being distinct tokens.
Some tokens, such as java, HelloWorld, and main, are identifiers. The scanner
categorizes these tokens as IDENTIFIER tokens. The parser uses these category names to
identify the kinds of incoming tokens. IDENTIFIER tokens carry along their images as
attributes; for example, the first IDENTIFIER in the above program has java as its image.
Such attributes are used in semantic analysis.
Some tokens are reserved words, each having its unique name in the code. For example,
import , public , and class are reserved word tokens having the names IMPORT , PUBLIC ,
and CLASS . Operators and separators also have distinct names. For example, the separators
. , ; , { , } , [ and ] have the token names DOT , SEMI , LCURLY , RCURLY , LBRACK , and RBRACK ,
respectively.
Others are literals; for example, the string literal Hello,World! comprises a single
token. The scanner calls this a STRING_LITERAL .
Comments are scanned and ignored altogether. As important as some comments are to
a person who is trying to understand a program 8 , they are irrelevant to the compiler.
The scanner does not first break down the input program text into a sequence of tokens.
Rather, it scans each token on demand; each time the parser needs a subsequent token, it
sends the nextToken() message to the scanner, which then returns the token id and any
image information.
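The on-demand behavior described above can be sketched in a few lines of Java. This is a minimal illustration and not the j-- scanner itself: the class names OnDemandScanner, Token, and TokenKind are assumptions made for the sketch, only the reserved words and separators named in the text are recognized, only // comments are skipped, and string literals are assumed to contain no escape sequences.

```java
/** Token kinds named in the text, plus EOF (a hypothetical subset). */
enum TokenKind {
    IMPORT, PUBLIC, CLASS, IDENTIFIER, STRING_LITERAL,
    DOT, SEMI, LCURLY, RCURLY, LBRACK, RBRACK, EOF
}

/** A token: its kind plus its image (the text it was scanned from). */
final class Token {
    final TokenKind kind;
    final String image;

    Token(TokenKind kind, String image) {
        this.kind = kind;
        this.image = image;
    }
}

/** Hypothetical scanner: no input is tokenized until nextToken() is called. */
class OnDemandScanner {
    private final String source;
    private int pos = 0;

    OnDemandScanner(String source) {
        this.source = source;
    }

    /** Scans and returns exactly one token, starting at the current position. */
    Token nextToken() {
        skipWhitespaceAndLineComments();
        if (pos >= source.length()) {
            return new Token(TokenKind.EOF, "");
        }
        char c = source.charAt(pos);
        switch (c) {
            case '.': pos++; return new Token(TokenKind.DOT, ".");
            case ';': pos++; return new Token(TokenKind.SEMI, ";");
            case '{': pos++; return new Token(TokenKind.LCURLY, "{");
            case '}': pos++; return new Token(TokenKind.RCURLY, "}");
            case '[': pos++; return new Token(TokenKind.LBRACK, "[");
            case ']': pos++; return new Token(TokenKind.RBRACK, "]");
            default: break;
        }
        if (c == '"') {
            // Assumption: no escape sequences inside the literal.
            int end = source.indexOf('"', pos + 1);
            if (end < 0) {
                throw new IllegalStateException("Unterminated string literal");
            }
            String image = source.substring(pos + 1, end);
            pos = end + 1;
            return new Token(TokenKind.STRING_LITERAL, image);
        }
        if (Character.isJavaIdentifierStart(c)) {
            int start = pos;
            while (pos < source.length()
                    && Character.isJavaIdentifierPart(source.charAt(pos))) {
                pos++;
            }
            String image = source.substring(start, pos);
            switch (image) {
                case "import": return new Token(TokenKind.IMPORT, image);
                case "public": return new Token(TokenKind.PUBLIC, image);
                case "class":  return new Token(TokenKind.CLASS, image);
                default:       return new Token(TokenKind.IDENTIFIER, image);
            }
        }
        throw new IllegalStateException(
                "Unexpected character '" + c + "' at offset " + pos);
    }

    /** Comments are scanned and dropped; they never reach the parser. */
    private void skipWhitespaceAndLineComments() {
        while (pos < source.length()) {
            if (Character.isWhitespace(source.charAt(pos))) {
                pos++;
            } else if (source.startsWith("//", pos)) {
                int newline = source.indexOf('\n', pos);
                pos = (newline < 0) ? source.length() : newline + 1;
            } else {
                break;
            }
        }
    }
}
```

A caller playing the role of the parser would invoke nextToken() repeatedly until it receives a token of kind EOF. Because the scanner keeps only a position in the source text, no token list is ever built in advance.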
The scanner is discussed in greater detail in Chapter 2.
1.4.3 Parser
The parsing of a j-- program and the construction of its abstract syntax tree (AST) are
driven by the language's syntax, and so are said to be syntax directed. In the first instance,
our parser is hand-crafted from the j-- grammar, to parse j-- programs by a technique known
as recursive descent.
For example, consider the following grammatical rule describing the syntax for a
compilation unit:
compilationUnit ::= [ package qualifiedIdentifier ; ]
                    { import qualifiedIdentifier ; }
                    { typeDeclaration } EOF
This rule says that a compilation unit consists of
An optional package clause (the brackets [] bracket optional clauses),
Followed by zero or more import statements (the curly brackets {} bracket clauses
that may appear zero or more times),
Followed by zero or more type declarations (in j--, these are only class declarations),
Followed by an end of file ( EOF ).
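To show how such a rule maps onto recursive descent, here is a minimal sketch in Java, with one parsing method per grammar rule. It is an illustration under stated assumptions, not the j-- parser: the class CompilationUnitParserSketch and the helpers peek, have, and mustBe are hypothetical names, tokens are represented simply as strings, qualifiedIdentifier is assumed to be a dot-separated sequence of identifiers, and typeDeclaration is reduced to a placeholder that records the class name and skips the class body.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical recursive descent parser for the compilationUnit rule. */
class CompilationUnitParserSketch {
    private static final String EOF = "<EOF>";

    private final List<String> tokens;
    private int pos = 0;

    String packageName = null;                        // null when no package clause
    final List<String> imports = new ArrayList<>();
    final List<String> typeNames = new ArrayList<>();

    CompilationUnitParserSketch(List<String> tokens) {
        this.tokens = tokens;
    }

    /** The next token, without consuming it. */
    private String peek() {
        return pos < tokens.size() ? tokens.get(pos) : EOF;
    }

    /** Consumes the next token if it matches; reports whether it did. */
    private boolean have(String expected) {
        if (peek().equals(expected)) {
            pos++;
            return true;
        }
        return false;
    }

    /** Consumes the next token, which is required to match. */
    private void mustBe(String expected) {
        if (!have(expected)) {
            throw new IllegalStateException(
                    "Expected " + expected + " but found " + peek());
        }
    }

    /** compilationUnit ::= [ package qI ; ] { import qI ; } { typeDeclaration } EOF */
    void compilationUnit() {
        if (have("package")) {            // [ ... ] becomes an if
            packageName = qualifiedIdentifier();
            mustBe(";");
        }
        while (have("import")) {          // { ... } becomes a while
            imports.add(qualifiedIdentifier());
            mustBe(";");
        }
        while (!peek().equals(EOF)) {
            typeDeclaration();
        }
        mustBe(EOF);
    }

    /** Assumed form: IDENTIFIER { . IDENTIFIER } */
    private String qualifiedIdentifier() {
        StringBuilder name = new StringBuilder(identifier());
        while (have(".")) {
            name.append('.').append(identifier());
        }
        return name.toString();
    }

    private String identifier() {
        String t = peek();
        if (t.isEmpty() || !Character.isJavaIdentifierStart(t.charAt(0))) {
            throw new IllegalStateException("Expected identifier but found " + t);
        }
        pos++;
        return t;
    }

    /** Placeholder: [ public ] class IDENTIFIER { ... }, skipping the body. */
    private void typeDeclaration() {
        have("public");                   // optional modifier
        mustBe("class");
        typeNames.add(identifier());
        mustBe("{");
        int depth = 1;
        while (depth > 0) {
            String t = peek();
            if (t.equals(EOF)) {
                throw new IllegalStateException("Unclosed class body");
            }
            pos++;
            if (t.equals("{")) depth++;
            else if (t.equals("}")) depth--;
        }
    }
}
```

The shape of compilationUnit() mirrors the rule directly: the optional package clause becomes an if, each repeated clause becomes a while loop, and the final EOF becomes a required match.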
8 But we know some who swear by the habit of stripping out all comments before reading a program
for fear that those comments might be misleading. When programmers modify code, they often forget to
update the accompanying comments.
 