Java Reference
In-Depth Information
The result is a sequence of line terminators and input characters, which are the terminal
symbols for the third step in the tokenization process.
3.5. Input Elements and Tokens
The input characters and line terminators that result from escape processing (§3.3) and then
input line recognition (§3.4) are reduced to a sequence of input elements.
Input:
    InputElements_opt Sub_opt
InputElements:
    InputElement
    InputElements InputElement
InputElement:
    WhiteSpace
    Comment
    Token
Token:
    Identifier
    Keyword
    Literal
    Separator
    Operator
Sub:
    the ASCII SUB character, also known as "control-Z"
Those input elements that are not white space (§3.6) or comments (§3.7) are tokens. The
tokens are the terminal symbols of the syntactic grammar (§2.3).
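The reduction described above can be sketched in a few lines of Java. This is a simplified illustration, not the specification's algorithm: the class `InputElementSketch` and its methods are hypothetical names, identifiers, keywords, and numeric literals are lumped into one kind of run, and every other character is treated as a one-character token. It only shows how white space and comments are recognized as input elements and then discarded, leaving the tokens.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of any Java API.
final class InputElementSketch {
    enum Kind { WHITE_SPACE, COMMENT, TOKEN }

    static final class Element {
        final Kind kind;
        final String text;
        Element(Kind kind, String text) { this.kind = kind; this.text = text; }
    }

    /** Splits the input into input elements, left to right. */
    static List<Element> scan(String in) {
        List<Element> out = new ArrayList<>();
        int i = 0;
        while (i < in.length()) {
            int start = i;
            char c = in.charAt(i);
            Kind kind;
            if (c == ' ' || c == '\t' || c == '\f' || c == '\n' || c == '\r') {
                // WhiteSpace: one space, tab, form feed, or line-terminator character.
                i++;
                kind = Kind.WHITE_SPACE;
            } else if (in.startsWith("//", i)) {
                // End-of-line comment: runs up to, but not including, the line terminator.
                while (i < in.length() && in.charAt(i) != '\n' && in.charAt(i) != '\r') i++;
                kind = Kind.COMMENT;
            } else if (in.startsWith("/*", i)) {
                // Traditional comment: runs through the closing */ (or to end of input in this sketch).
                int end = in.indexOf("*/", i + 2);
                i = (end < 0) ? in.length() : end + 2;
                kind = Kind.COMMENT;
            } else if (Character.isJavaIdentifierStart(c) || Character.isDigit(c)) {
                // Simplification: identifiers, keywords, and integer literals as one run.
                while (i < in.length() && Character.isJavaIdentifierPart(in.charAt(i))) i++;
                kind = Kind.TOKEN;
            } else {
                // Simplification: any other character is a one-character token.
                i++;
                kind = Kind.TOKEN;
            }
            out.add(new Element(kind, in.substring(start, i)));
        }
        return out;
    }

    /** Keeps only the input elements that are tokens. */
    static List<String> tokens(String in) {
        List<String> result = new ArrayList<>();
        for (Element e : scan(in)) {
            if (e.kind == Kind.TOKEN) result.add(e.text);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokens("int x /* count */ = 1; // init"));
    }
}
```

In this sketch the comment `/* count */`, the trailing `// init`, and every space are scanned as input elements but never reach the token list, which is what would be handed to the syntactic grammar.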
White space (§3.6) and comments (§3.7) can serve to separate tokens that, if adjacent,
might be tokenized in another manner. For example, the ASCII characters - and = in the
input can form the operator token -= (§3.12) only if there is no intervening white space or
comment.
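The effect of an intervening space or comment on - and = can be demonstrated with a small sketch. It assumes a longest-match rule at each step and recognizes only a hypothetical subset of the operators; the class `AdjacentTokenSketch` and its method are illustrative names, and any character that is not one of the listed operators is simply treated as a one-character token.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of any Java API.
final class AdjacentTokenSketch {
    // Assumed subset of the operators, with longer spellings listed first
    // so that the first match found is also the longest one.
    private static final String[] OPERATORS = { "-=", "--", "-", "=" };

    static List<String> tokens(String in) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < in.length()) {
            if (in.startsWith("/*", i)) {
                // Skip a traditional comment; it separates tokens but is not one.
                int end = in.indexOf("*/", i + 2);
                i = (end < 0) ? in.length() : end + 2;
                continue;
            }
            if (Character.isWhitespace(in.charAt(i))) {
                // Skip white space; it separates tokens but is not one.
                i++;
                continue;
            }
            String match = null;
            for (String op : OPERATORS) {
                if (in.startsWith(op, i)) { match = op; break; }
            }
            if (match == null) {
                // Simplification: anything else is a one-character token.
                match = String.valueOf(in.charAt(i));
            }
            tokens.add(match);
            i += match.length();
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokens("a-=b"));          // - and = adjacent
        System.out.println(tokens("a - = b"));       // separated by white space
        System.out.println(tokens("a -/* c */= b")); // separated by a comment
    }
}
```

Only the first input yields the single token `-=`. In the other two, the scanner has already committed to the token `-` by the time it reaches `=`, so the two characters remain separate tokens.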
As a special concession for compatibility with certain operating systems, the ASCII SUB
character (\u001a, or control-Z) is ignored if it is the last character in the escaped input
stream.
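A minimal sketch of this concession follows, assuming the escaped input stream is already available as a `String`. The class `TrailingSubSketch` and its method are hypothetical names. Following the `Sub_opt` position at the end of the `Input` production, at most one SUB character is dropped, and only when it is the very last character.

```java
// Hypothetical helper for illustration; not part of any Java API.
final class TrailingSubSketch {
    // The ASCII SUB character (control-Z), code point 0x1A.
    static final char SUB = (char) 0x1A;

    /** Returns the input without a single trailing SUB character, if one is present. */
    static String ignoreTrailingSub(String input) {
        if (!input.isEmpty() && input.charAt(input.length() - 1) == SUB) {
            return input.substring(0, input.length() - 1);
        }
        // A SUB anywhere else is left in place for later stages to deal with.
        return input;
    }

    public static void main(String[] args) {
        String source = "class A {}" + SUB;
        System.out.println(ignoreTrailingSub(source).length());
    }
}
```

A SUB character in the middle of the input is deliberately left untouched here, since the concession applies only to the final character.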
Consider two tokens x and y in the resulting input stream. If x precedes y, then we say that
x is to the left of y and that y is to the right of x.