System.out.println("Hello,World!");
}
}
The scanner breaks the program text into atomic tokens. For example, it recognizes each of import, java, ., lang, ., System, and ; as being distinct tokens.
Some tokens, such as java, HelloWorld, and main, are identifiers. The scanner categorizes these tokens as IDENTIFIER tokens. The parser uses these category names to identify the kinds of incoming tokens. IDENTIFIER tokens carry along their images as attributes; for example, the first IDENTIFIER in the above program has java as its image. Such attributes are used in semantic analysis.
Some tokens are reserved words, each having its unique name in the code. For example, import, public, and class are reserved word tokens having the names IMPORT, PUBLIC, and CLASS. Operators and separators also have distinct names. For example, the separators ., ;, {, }, [, and ] have the token names DOT, SEMI, LCURLY, RCURLY, LBRACK, and RBRACK, respectively.
Others are literals; for example, the string literal Hello,World! comprises a single token. The scanner calls this a STRING_LITERAL.
Comments are scanned and ignored altogether. As important as some comments are to a person who is trying to understand a program [8], they are irrelevant to the compiler.
The scanner does not first break down the input program text into a sequence of tokens. Rather, it scans each token on demand; each time the parser needs a subsequent token, it sends the nextToken() message to the scanner, which then returns the token id and any image information.
The scanner is discussed in greater detail in Chapter 2.
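The on-demand protocol described above can be sketched in a few lines of Java. This is only an illustration, not the scanner from the j-- code base: the class names TokenKind, Token, and DemandScanner are hypothetical, only the token names mentioned in this section are covered, string escapes are not handled, and only // comments are skipped.

```java
// Token categories named in the text, plus EOF to mark the end of input.
enum TokenKind {
    IMPORT, PUBLIC, CLASS, IDENTIFIER, STRING_LITERAL,
    DOT, SEMI, LCURLY, RCURLY, LBRACK, RBRACK, EOF
}

// A token pairs its category with its image (the matched source text).
final class Token {
    final TokenKind kind;
    final String image;

    Token(TokenKind kind, String image) {
        this.kind = kind;
        this.image = image;
    }
}

// Hypothetical on-demand scanner: no token list is built up front.
final class DemandScanner {
    private final String src;
    private int pos = 0;

    DemandScanner(String src) {
        this.src = src;
    }

    // Scans and returns exactly one token per call.
    Token nextToken() {
        skipWhitespaceAndComments();
        if (pos >= src.length()) {
            return new Token(TokenKind.EOF, "");
        }
        char c = src.charAt(pos);
        if (Character.isJavaIdentifierStart(c)) {
            int start = pos;
            while (pos < src.length()
                    && Character.isJavaIdentifierPart(src.charAt(pos))) {
                pos++;
            }
            String word = src.substring(start, pos);
            return new Token(reservedOrIdentifier(word), word);
        }
        if (c == '"') {
            int start = ++pos; // skip the opening quote
            while (pos < src.length() && src.charAt(pos) != '"') {
                pos++;
            }
            if (pos >= src.length()) {
                throw new IllegalArgumentException("Unterminated string literal");
            }
            String text = src.substring(start, pos);
            pos++; // consume the closing quote
            return new Token(TokenKind.STRING_LITERAL, text);
        }
        pos++;
        switch (c) {
            case '.': return new Token(TokenKind.DOT, ".");
            case ';': return new Token(TokenKind.SEMI, ";");
            case '{': return new Token(TokenKind.LCURLY, "{");
            case '}': return new Token(TokenKind.RCURLY, "}");
            case '[': return new Token(TokenKind.LBRACK, "[");
            case ']': return new Token(TokenKind.RBRACK, "]");
            default:
                throw new IllegalArgumentException("Unexpected character: " + c);
        }
    }

    // Reserved words get their own token names; everything else is an identifier.
    private static TokenKind reservedOrIdentifier(String word) {
        switch (word) {
            case "import": return TokenKind.IMPORT;
            case "public": return TokenKind.PUBLIC;
            case "class":  return TokenKind.CLASS;
            default:       return TokenKind.IDENTIFIER;
        }
    }

    // Whitespace and // comments never reach the parser.
    private void skipWhitespaceAndComments() {
        while (pos < src.length()) {
            if (Character.isWhitespace(src.charAt(pos))) {
                pos++;
            } else if (src.startsWith("//", pos)) {
                while (pos < src.length() && src.charAt(pos) != '\n') {
                    pos++;
                }
            } else {
                return;
            }
        }
    }
}
```

A caller standing in for the parser would create a DemandScanner over the program text and call nextToken() repeatedly until it receives a token whose kind is EOF. Because each call advances only as far as one token, the input is never tokenized as a whole before parsing begins.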
1.4.3 Parser
The parsing of a j-- program and the construction of its abstract syntax tree (AST) are driven by the language's syntax, and so are said to be syntax directed. In the first instance, our parser is hand-crafted from the j-- grammar, to parse j-- programs by a technique known as recursive descent.
For example, consider the following grammatical rule describing the syntax for a compilation unit:
compilationUnit ::= [package qualifiedIdentifier ;]
                    {import qualifiedIdentifier ;}
                    {typeDeclaration}
                    EOF
This rule says that a compilation unit consists of
• An optional package clause (the brackets [] bracket optional clauses),
• Followed by zero or more import statements (the curly brackets {} bracket clauses that may appear zero or more times),
• Followed by zero or more type declarations (in j--, these are only class declarations),
• Followed by an end of file (EOF).
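To make the link between the rule and recursive descent concrete, here is a minimal sketch of how such a rule might translate into a parsing method: an optional clause becomes an if, and a repeated clause becomes a while. It is not the j-- parser itself. The class MiniParser and its helpers see and mustBe are hypothetical names, tokens are supplied as plain strings rather than by a scanner, the qualifiedIdentifier rule (identifiers separated by dots) is assumed, typeDeclaration is a stand-in that only skips a brace-balanced class body, and the sketch merely recognizes the input instead of building an AST.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical recursive-descent recognizer for
//   compilationUnit ::= [package qualifiedIdentifier ;]
//                       {import qualifiedIdentifier ;}
//                       {typeDeclaration} EOF
final class MiniParser {
    private static final String EOF = "<EOF>";

    private final Iterator<String> tokens;
    private String current; // one token of lookahead

    boolean hasPackage = false;
    int importCount = 0;
    int typeCount = 0;

    MiniParser(String... tokenImages) {
        this.tokens = Arrays.asList(tokenImages).iterator();
        advance();
    }

    // Fetches the next token on demand; EOF once the input is exhausted.
    private void advance() {
        current = tokens.hasNext() ? tokens.next() : EOF;
    }

    private boolean see(String expected) {
        return current.equals(expected);
    }

    private void mustBe(String expected) {
        if (!see(expected)) {
            throw new IllegalStateException(
                    "Expected " + expected + " but found " + current);
        }
        advance();
    }

    // One method per grammar rule; the body mirrors the rule's right-hand side.
    void compilationUnit() {
        if (see("package")) {        // [ ... ] : at most once
            advance();
            qualifiedIdentifier();
            mustBe(";");
            hasPackage = true;
        }
        while (see("import")) {      // { ... } : zero or more times
            advance();
            qualifiedIdentifier();
            mustBe(";");
            importCount++;
        }
        while (!see(EOF)) {          // { typeDeclaration }
            typeDeclaration();
            typeCount++;
        }
        mustBe(EOF);
    }

    // Assumed rule: qualifiedIdentifier ::= identifier {. identifier}
    private void qualifiedIdentifier() {
        identifier();
        while (see(".")) {
            advance();
            identifier();
        }
    }

    // Simplification: any token that starts like a Java identifier is accepted.
    private void identifier() {
        if (current.isEmpty()
                || !Character.isJavaIdentifierStart(current.charAt(0))) {
            throw new IllegalStateException(
                    "Expected identifier but found " + current);
        }
        advance();
    }

    // Stand-in for the real rule: optional 'public', then 'class' and a name,
    // then a brace-balanced body that is skipped without being analyzed.
    private void typeDeclaration() {
        if (see("public")) {
            advance();
        }
        mustBe("class");
        identifier();
        mustBe("{");
        int depth = 1;
        while (depth > 0) {
            if (see(EOF)) {
                throw new IllegalStateException("Unclosed class body");
            }
            if (see("{")) {
                depth++;
            } else if (see("}")) {
                depth--;
            }
            advance();
        }
    }
}
```

For instance, constructing a MiniParser with the tokens package, demo, ;, import, java, ., lang, ., System, ;, public, class, HelloWorld, {, } and calling compilationUnit() would accept the input, whereas the tokens import, java alone would be rejected because the closing ; is missing.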
[8] But we know some who swear by the habit of stripping out all comments before reading a program for fear that those comments might be misleading. When programmers modify code, they often forget to update the accompanying comments.