from errors that cause a string to extend farther than was intended. A special
string token can be used to implement such warnings. A valid string token is
returned and an appropriate warning message is issued.
In languages such as C, C++, Java, and Pascal, which allow multiline
comments, improperly terminated (that is, runaway) comments present a
similar problem. A runaway comment is not detected until the scanner finds a
close comment symbol (possibly belonging to some other comment) or until
end-of-file is reached. Clearly, a special error message is required.
Consider the Pascal-style comments that begin with a { and end with a }.
(Comments that begin and end with a pair of characters, such as /* and */ in
Java, C, and C++, are a bit trickier to get right; see Exercise 6.)
Correct Pascal comments are defined quite simply:
{ Not( } )* }
To handle comments terminated by Eof, the error token approach can be
used:
{ Not( } )* Eof
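As a rough illustration, the two definitions above (the correct comment and the Eof-terminated error token) can be written with Python's re module. This is only a sketch: the token names COMMENT and RUNAWAY_COMMENT and the function scan_comment are hypothetical, and Eof is modeled as the end of the input string.

```python
import re

# { Not( } )* }   : a correctly terminated Pascal-style comment
COMMENT = re.compile(r"\{[^}]*\}")

# { Not( } )* Eof : a comment that runs to end of input (the error token);
# \Z stands in for Eof in this sketch
RUNAWAY = re.compile(r"\{[^}]*\Z")


def scan_comment(text, pos=0):
    """Return (kind, end) for a comment starting at text[pos], else None."""
    m = COMMENT.match(text, pos)
    if m:
        return ("COMMENT", m.end())
    m = RUNAWAY.match(text, pos)
    if m:
        # The scanner still returns a token, and the caller reports the error.
        return ("RUNAWAY_COMMENT", m.end())
    return None
```

Because the error token is an ordinary token, the scanner keeps running after a runaway comment, and the caller decides how to report it.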
To handle comments closed by a close comment belonging to another
comment (for example, { ...missing close comment... { normal comment }),
we issue a warning (but not an error message; this form of comment is lexically
legal). In particular, a comment containing an open comment symbol in its
body is most probably a symptom of the kind of omission depicted previously.
We therefore split the legal comment definition into two tokens. The one
that accepts an open comment in its body causes a warning message to be
printed ("Possible unclosed comment"). This results in the following token
definitions:

{ Not( { | } )* }
    matches correct comments that do not contain an open comment in
    their bodies

{ ( Not( { | } )* { Not( { | } )* )+ }
    matches correct, but suspect, comments that contain at least one open
    comment in their bodies

{ Not( } )* Eof
    matches a runaway comment terminated by end-of-file
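The three token definitions can be tried out with a small classifier. The sketch below assumes Python's re module; the token names, the function classify_comment, and the wording of the end-of-file message are hypothetical, while "Possible unclosed comment" is the warning named in the text. The suspect-comment pattern is written in an equivalent form, { Not({|})* ( { Not({|})* )+ }, which matches the same strings but avoids ambiguous splits during backtracking.

```python
import re

BODY = r"[^{}]*"  # Not( { | } )*

TOKENS = [
    # Correct comment with no open comment symbol in its body
    ("COMMENT", re.compile(r"\{" + BODY + r"\}")),
    # Correct but suspect: at least one open comment symbol in the body
    ("SUSPECT_COMMENT", re.compile(r"\{" + BODY + r"(?:\{" + BODY + r")+\}")),
    # Runaway comment terminated by end-of-file (\Z models Eof)
    ("RUNAWAY_COMMENT", re.compile(r"\{[^}]*\Z")),
]


def classify_comment(text, pos=0):
    """Return (kind, end, message) for the comment at text[pos], else None."""
    for kind, pattern in TOKENS:
        m = pattern.match(text, pos)
        if m:
            message = None
            if kind == "SUSPECT_COMMENT":
                message = "warning: Possible unclosed comment"
            elif kind == "RUNAWAY_COMMENT":
                # Hypothetical wording; the text only says an error is issued.
                message = "error: comment terminated by end-of-file"
            return (kind, m.end(), message)
    return None
```

A suspect comment is still returned as a legal token; only the attached message distinguishes it from an ordinary comment.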
Single-line comments, found in Java and C++, are always terminated by an
end-of-line character and so do not fall prey to the runaway comment problem.
They do, however, require that each line of a multiline comment contain an
open comment marker. Note, too, that we mentioned previously that balanced
brackets cannot be correctly scanned using regular expressions and finite
automata. A consequence of this limitation is that nested comments cannot be
properly scanned using conventional techniques. This limitation causes
problems when we want comments to nest, particularly when we “comment-out”
a piece of code (which itself may well contain comments). Conditional
compilation constructs, such as #if and #endif in C and C++, are designed to
safely disable the compilation of selected parts of a program.
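To see why nesting falls outside what a finite automaton can do, note that matching nested comments requires remembering how many open comment symbols are still pending, and that count is unbounded. A minimal hand-coded sketch in Python, assuming Pascal-style { and } delimiters and a hypothetical helper named skip_nested_comment, keeps that count in an explicit counter:

```python
def skip_nested_comment(text, pos=0):
    """Return the index just past the nested comment starting at text[pos].

    Returns -1 if end of input is reached before the comment is closed.
    """
    if pos >= len(text) or text[pos] != "{":
        raise ValueError("no comment starts at this position")
    depth = 0  # number of open comment symbols not yet closed
    for i in range(pos, len(text)):
        ch = text[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return i + 1
    return -1  # runaway: input ended while still inside a comment
```

The counter is exactly the extra state that a regular expression cannot express, which is why scanners that support nested comments typically handle them with hand-written code of this kind rather than with token definitions.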
 