Java Reference
In-Depth Information
If unambiguous grammars are so important, why do we see so many languages having
this ambiguous conditional statement? All of j--, Java, C, and C ++ include precisely this
same ambiguous conditional construct. One could easily modify the syntax of conditionals
to remove the ambiguity. For example, the programming languages S-Algol and Algol-S
define conditionals so:
S ::= if E do S
j if E then S else S
j s
E ::= e
(3.14)
Here, the do indicates the simpler conditional (having no else clause) while the then
indicates the presence of an else clause. A simple solution but it does change the language.
But programmers have become both accustomed to (and fond of) the ambiguous condi-
tional. So both programmers and compiler writers have learned to live with it. Programming
language reference manuals include extra explanations, such as In the conditional statement,
the else clause goes with the nearest preceding if .
And compiler writers handle the rule as a special case in the compiler's parser, making
sure that an else is grouped along with the closest preceding if .
The j-- grammar (and the Java grammar) have another ambiguity, which is even more
dicult. Consider the problem of parsing the expression
x.y.z.w
Clearly, w is a field; if it were a method expression, then that would be evident in the syntax:
x.y.z.w()
But, what about x.y.z ? There are several possibilities depending on the types of the
names x , y , and z .
If x is the name of a local variable, it might refer to an object with a field y , referring
to another object with a field z , referring to yet another object having our field w . In
this case, the expression would be parsed as a cascade of field selection expressions.
Alternatively, x.y might be the name of a package in which the class z is defined.
In this case, we would parse the expression by first parsing x.y.z as a fully qualified
class and then parse w as a static field selection of that class.
Other possibilities, parsing various permutations of (possibly qualified) class names
and field selection operations, also exist.
The parser cannot determine just how the expression x.y.z is parsed because types
are not decided until after it has parsed the program and constructed its abstract syntax
tree (AST). So the parser represents x.y.z in the AST by an AmbiguousName node. Later
on, after type declarations have been processed, the compiler rewrites this node as the
appropriate sub-tree.
These two examples of ambiguity in j-- (and in Java) make the point that in compiling
everyday programming languages, one must deal with messiness. One relies on formal tech-
niques where practical, because formal techniques lead to more correct compilers. But once
in awhile, formality fails and one must deal with special cases.
In general, there is no algorithm that can tell us whether or not an arbitrary grammar
is ambiguous. That is, ambiguity is not decidable. But there are large classes of grammars
(and so programming languages) that both can be shown to be decidable and for which
ecient parsers can be automatically constructed. These classes of grammars are large
 
Search WWH ::




Custom Search