The tokenizer is a "greedy" tokenizer. It grabs as many characters as it can to build up the next token, not caring if this creates an invalid sequence of tokens. So because ++ is longer than +, the expression
j = i+++++i; // INVALID
is interpreted as the invalid expression
j = i++ ++ +i; // INVALID
instead of the valid
j = i++ + ++i;
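The greedy rule can be illustrated with a toy tokenizer. The sketch below is not the real Java lexer: the class name GreedyTokenizerDemo, the method tokenize, and the tiny operator table are assumptions made only for this illustration. It shows that always trying the longest operator first splits i+++++i into i, ++, ++, +, i.

```java
import java.util.ArrayList;
import java.util.List;

// Toy "maximal munch" tokenizer for illustration only; NOT the real Java lexer.
class GreedyTokenizerDemo {
    // Hypothetical, deliberately tiny operator table.
    // Longer operators come first so that "++" wins over "+".
    private static final String[] OPERATORS = {"++", "+", "=", ";"};

    static List<String> tokenize(String source) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < source.length()) {
            char c = source.charAt(pos);
            if (Character.isWhitespace(c)) {
                pos++; // white space only separates tokens
                continue;
            }
            if (Character.isJavaIdentifierStart(c)) {
                // Greedily consume the longest possible identifier.
                int end = pos + 1;
                while (end < source.length()
                        && Character.isJavaIdentifierPart(source.charAt(end))) {
                    end++;
                }
                tokens.add(source.substring(pos, end));
                pos = end;
                continue;
            }
            // Greedily pick the first (that is, longest) operator that matches here.
            String matched = null;
            for (String op : OPERATORS) {
                if (source.startsWith(op, pos)) {
                    matched = op;
                    break;
                }
            }
            if (matched == null) {
                throw new IllegalArgumentException("Unexpected character at index " + pos);
            }
            tokens.add(matched);
            pos += matched.length();
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("j = i+++++i;"));   // [j, =, i, ++, ++, +, i, ;]
        System.out.println(tokenize("j = i++ + ++i;")); // [j, =, i, ++, +, ++, i, ;]
    }
}
```

In the second call, the white space forces the token boundaries, which is why the spaced form is the one the compiler accepts.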
7.1.4. Identifiers
Identifiers, used for names of declared entities such as variables, constants, and labels, must start with a letter, followed by letters, digits, or both. The terms letter and digit are broad in Unicode: If something is considered a letter or digit in a human language, you can probably use it in identifiers. "Letters" can come from Armenian, Korean, Gurmukhi, Georgian, Devanagari, and almost any other script written in the world today. Thus, not only is kitty a valid identifier, but so are the words for "cat" or "kitty" written in Serbo-Croatian, Russian, Persian, Tamil, and Japanese.[4] Letters also include any currency symbol (such as $, ¥, and £) and connecting punctuation (such as _).
[4] These are the word "cat" or "kitty" in English, Serbo-Croatian, Russian, Persian, Tamil, and Japanese, respectively.
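The identifier rules can be probed with the standard methods Character.isJavaIdentifierStart and Character.isJavaIdentifierPart. The following is a minimal sketch: the class name IdentifierCheckDemo and the helper looksLikeIdentifier are hypothetical names chosen for this example, and the helper checks only the character rules, so it does not reject reserved words such as class.

```java
// Sketch: tests whether a string has the lexical shape of a Java identifier.
// It checks characters only; keywords and reserved literals are NOT rejected.
class IdentifierCheckDemo {
    // Hypothetical helper name, introduced for this illustration.
    static boolean looksLikeIdentifier(String s) {
        if (s.isEmpty()) {
            return false;
        }
        // Work with code points so characters outside the BMP are handled.
        int first = s.codePointAt(0);
        if (!Character.isJavaIdentifierStart(first)) {
            return false;
        }
        int i = Character.charCount(first);
        while (i < s.length()) {
            int cp = s.codePointAt(i);
            if (!Character.isJavaIdentifierPart(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }

    public static void main(String[] args) {
        // "\u00A5" is the yen sign, a currency symbol.
        String[] samples = {"kitty", "$amount", "\u00A5total", "_tmp", "2fast", "a-b"};
        for (String sample : samples) {
            System.out.println(sample + " -> " + looksLikeIdentifier(sample));
        }
    }
}
```

Under these rules the first four samples pass, while 2fast fails because it starts with a digit and a-b fails because the hyphen is not an identifier character.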
Any difference in characters within an identifier makes that identifier unique. Case is significant: A, a, á, À, Å, and so on are different identifiers. Characters that look the same, or nearly the same, can be con-