You can also ask whether a char value isLowSurrogate or isHighSurrogate, whether a specific int code point isSupplementaryCodePoint or isValidCodePoint, or whether a pair of char values isSurrogatePair.
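As a minimal sketch of these queries, the following uses the static methods of java.lang.Character on one supplementary code point. The class name SurrogateChecks and the helper encodes are hypothetical names chosen for illustration; the code point U+1D11E (MUSICAL SYMBOL G CLEF) is just a convenient sample.

```java
class SurrogateChecks {
    /** Hypothetical helper: true if (high, low) is a surrogate pair encoding codePoint. */
    static boolean encodes(char high, char low, int codePoint) {
        return Character.isSurrogatePair(high, low)
                && Character.toCodePoint(high, low) == codePoint;
    }

    public static void main(String[] args) {
        // U+1D11E lies outside the Basic Multilingual Plane, so it takes two chars.
        int codePoint = 0x1D11E;
        char[] units = Character.toChars(codePoint);
        char high = units[0];
        char low = units[1];

        System.out.println(Character.isHighSurrogate(high));               // true
        System.out.println(Character.isLowSurrogate(low));                 // true
        System.out.println(Character.isSurrogatePair(high, low));          // true
        System.out.println(Character.isSupplementaryCodePoint(codePoint)); // true
        System.out.println(Character.isValidCodePoint(codePoint));         // true
        // 0x10FFFF is the largest valid code point, so the next value is rejected.
        System.out.println(Character.isValidCodePoint(0x110000));          // false
        System.out.println(encodes(high, low, codePoint));                 // true
    }
}
```

Note that isSurrogatePair takes the high surrogate first; passing the two char values in the opposite order yields false.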
Unicode identifiers are defined by the Unicode standard. Unicode identifiers must start with a letter (connecting punctuation such as _ and currency symbols such as ¥ are not letters in Unicode, although they are in the Java programming language) and must contain only letters, connecting punctuation (such as _), digits, numeric letters (such as Roman numerals), combining marks, nonspacing marks, or ignorable control characters (such as text direction markers).
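The difference between Unicode identifiers and Java identifiers can be sketched with the Character methods isUnicodeIdentifierStart, isUnicodeIdentifierPart, and isJavaIdentifierStart. The class IdentifierChecks and its helper isUnicodeIdentifier are hypothetical names introduced here for illustration, not part of the library.

```java
class IdentifierChecks {
    /** Hypothetical helper: true if s is non-empty and every code point is allowed in its position. */
    static boolean isUnicodeIdentifier(String s) {
        if (s.isEmpty()) {
            return false;
        }
        int first = s.codePointAt(0);
        if (!Character.isUnicodeIdentifierStart(first)) {
            return false;
        }
        // Step by code point, not by char, so supplementary characters are handled.
        for (int i = Character.charCount(first); i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!Character.isUnicodeIdentifierPart(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }

    public static void main(String[] args) {
        // '_' may start a Java identifier but not a Unicode identifier.
        System.out.println(Character.isJavaIdentifierStart('_'));         // true
        System.out.println(Character.isUnicodeIdentifierStart('_'));      // false
        System.out.println(Character.isUnicodeIdentifierPart('_'));       // true

        // The yen sign (U+00A5) is a currency symbol: accepted by Java only.
        System.out.println(Character.isJavaIdentifierStart('\u00A5'));    // true
        System.out.println(Character.isUnicodeIdentifierStart('\u00A5')); // false

        System.out.println(isUnicodeIdentifier("total_1"));               // true
        System.out.println(isUnicodeIdentifier("_total"));                // false
    }
}
```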
All these types of characters, and several others, are defined by the Unicode standard. The static method getType returns an int that defines a character's Unicode type. The return value is one of the following constants:
COMBINING_SPACING_MARK
MODIFIER_LETTER
CONNECTOR_PUNCTUATION
MODIFIER_SYMBOL
CONTROL
NON_SPACING_MARK
CURRENCY_SYMBOL
OTHER_LETTER
DASH_PUNCTUATION
OTHER_NUMBER
DECIMAL_DIGIT_NUMBER
OTHER_PUNCTUATION