TOKEN        DESCRIPTION
Words        Any sequence of letters or digits 0 to 9 beginning with a letter. A letter is defined as any of A to Z and a to z or \u00A0 to \u00FF. A word follows a whitespace character and is terminated by another whitespace character, or any character other than a letter or a digit.
Comments     Any sequence of characters beginning with a forward slash, /, and ending with the end-of-line character. Comments are ignored and not returned by the tokenizer.
Whitespace   All byte values from '\u0000' to '\u0020', which includes space, backspace, horizontal tab, vertical tab, line feed, form feed, and carriage return. Whitespace acts as a delimiter between tokens and is ignored (except within a quoted string).
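The word, comment, and whitespace rules in the table can be checked with a small experiment. The following is a minimal sketch, assuming a StreamTokenizer left in its default configuration and reading from a StringReader; the class name DefaultSyntaxDemo and the helper method words() are hypothetical names introduced only for this illustration.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class DefaultSyntaxDemo {
    // Hypothetical helper: returns every word token found in the text,
    // using the tokenizer's default rules.
    static List<String> words(String text) {
        StreamTokenizer tokenizer = new StreamTokenizer(new StringReader(text));
        List<String> result = new ArrayList<>();
        try {
            int tokenType;
            while ((tokenType = tokenizer.nextToken()) != StreamTokenizer.TT_EOF) {
                if (tokenType == StreamTokenizer.TT_WORD) {
                    result.add(tokenizer.sval);   // a word is delivered in sval
                }
            }
        } catch (IOException e) {
            // Not expected when reading from a String; rethrown unchecked.
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        // The tab and the newline act as delimiters; everything from '/' to
        // the end of the first line is treated as a comment and skipped.
        System.out.println(words("alpha\tbeta2 / ignored to end of line\ngamma"));
    }
}
```

With the default settings this prints [alpha, beta2, gamma]: the digit in beta2 is accepted because the word begins with a letter, and the text after the slash never reaches the caller.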
To retrieve a token from the stream, you call the nextToken() method for the StreamTokenizer object:
int tokenType = 0;
try {
  // TT_EOF is a static field, so refer to it through the class name
  while((tokenType = tokenizer.nextToken()) != StreamTokenizer.TT_EOF) {
    // Do something with the token...
  }
} catch (IOException e) {
  e.printStackTrace(System.err);
  System.exit(1);
}
The nextToken() method can throw an exception of type IOException, so we put the call in a try block.
The value returned indicates the type of token that was recognized, and from this value you can determine where to find the token itself. In the preceding fragment, you store the value returned in tokenType and compare it with the constant TT_EOF. This constant is a static field of type int in the StreamTokenizer class that is returned by the nextToken() method when the end of the stream has been read. Thus the while loop continues until the end of the stream is reached.

The token that was read from the stream is itself stored in one of two instance variables of the StreamTokenizer object. If the data item is a number, it is stored in the public data member nval, which is of type double. If the data item is a quoted string or a word, a reference to a String object that encapsulates the data item is stored in the public data member sval, which is of type String. The analysis that segments the stream into tokens is fairly simple, and the way in which an arbitrary stream is broken into tokens is illustrated in Figure 8-6.
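To make the roles of nval and sval concrete, here is a minimal sketch of a loop that looks at the value returned by nextToken() and picks up the token from the corresponding field. The class name TokenValueDemo, the method describe(), and the sample input are assumptions made for illustration. The sketch also assumes the default tokenizer settings, under which a double-quoted string is reported with the quote character itself as the token type.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class TokenValueDemo {
    // Hypothetical helper: labels each token with its kind and its value.
    static List<String> describe(String text) {
        StreamTokenizer tokenizer = new StreamTokenizer(new StringReader(text));
        List<String> result = new ArrayList<>();
        try {
            int tokenType;
            while ((tokenType = tokenizer.nextToken()) != StreamTokenizer.TT_EOF) {
                switch (tokenType) {
                    case StreamTokenizer.TT_NUMBER:
                        result.add("number " + tokenizer.nval);  // numbers arrive as double
                        break;
                    case StreamTokenizer.TT_WORD:
                        result.add("word " + tokenizer.sval);    // words arrive as String
                        break;
                    case '"':
                        result.add("string " + tokenizer.sval);  // body of a quoted string
                        break;
                    default:
                        // Any other single character is returned as its own code.
                        result.add("char " + (char) tokenType);
                }
            }
        } catch (IOException e) {
            // Not expected when reading from a String; rethrown unchecked.
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(describe("width 12.5 \"two words\" ="));
    }
}
```

Run as written, this prints [word width, number 12.5, string two words, char =]. Collecting the results in a list rather than printing inside the loop is only a convenience for checking the output; the dispatch on the returned value is the part that corresponds to the text.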