13.5. Working with UTF-16
In "Working with UTF-16" on page 196, we described a number of utility methods provided by the Character class to ease working with the supplementary Unicode characters (those greater in value than 0xFFFF that require encoding as a pair of char values in a CharSequence). Each of the String, StringBuilder, and StringBuffer classes provides these methods:
public int codePointAt(int index)
Returns the code point defined at the given index in this, taking into account that it may be a supplementary character represented by the pair this.charAt(index) and this.charAt(index+1).
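As an illustration of codePointAt, the following sketch walks a string one code point at a time. The class name CodePointAtDemo and the helper codePoints are hypothetical names chosen for this example; the sample text uses U+1D11E (MUSICAL SYMBOL G CLEF), a supplementary character.

```java
import java.util.ArrayList;
import java.util.List;

public class CodePointAtDemo {
    // Hypothetical helper: collects the code points of s in order,
    // advancing by one or two chars depending on each code point.
    static List<Integer> codePoints(String s) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            result.add(cp);
            i += Character.charCount(cp); // 2 for supplementary, else 1
        }
        return result;
    }

    public static void main(String[] args) {
        // "a", then U+1D11E as a surrogate pair, then "b": four chars in all.
        String s = "a" + new String(Character.toChars(0x1D11E)) + "b";
        System.out.println(Integer.toHexString(s.codePointAt(1))); // 1d11e
        // Starting in the middle of the pair yields the lone low surrogate.
        System.out.println(Integer.toHexString(s.codePointAt(2))); // dd1e
        System.out.println(codePoints(s).size());                  // 3
    }
}
```

Stepping by Character.charCount rather than by one keeps the loop from landing in the middle of a surrogate pair.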
public int codePointBefore(int index)
Returns the code point that precedes the given index in this, taking into account that it may be a supplementary character represented by the pair this.charAt(index-2) and this.charAt(index-1).
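Backward traversal is the typical use of codePointBefore. The sketch below reverses a string by code point; CodePointBeforeDemo and reverseByCodePoint are hypothetical names used only for illustration. (StringBuilder.reverse already treats surrogate pairs as single units, so the helper exists purely to show the method at work.)

```java
public class CodePointBeforeDemo {
    // Hypothetical helper: rebuilds s from its last code point to its first.
    static String reverseByCodePoint(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = s.length(); i > 0; ) {
            int cp = s.codePointBefore(i);  // code point ending just before i
            out.appendCodePoint(cp);
            i -= Character.charCount(cp);   // step back one or two chars
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String clef = new String(Character.toChars(0x1D11E));
        String s = "a" + clef + "b";
        System.out.println(Integer.toHexString(s.codePointBefore(3))); // 1d11e
        System.out.println(reverseByCodePoint(s).equals("b" + clef + "a")); // true
    }
}
```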
public int codePointCount(int start, int end)
Returns the number of code points in the range from this.charAt(start) up to, but not including, this.charAt(end), taking into account surrogate pairs. Any unpaired surrogate values count as one code point each.
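A short sketch of how codePointCount differs from length, assuming the same sample text as a string of one supplementary character between two ASCII letters. The class and helper names are hypothetical.

```java
public class CodePointCountDemo {
    // Hypothetical helper: number of code points in the whole string.
    static int lengthInCodePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String s = "a" + new String(Character.toChars(0x1D11E)) + "b";
        System.out.println(s.length());               // 4 char values
        System.out.println(lengthInCodePoints(s));    // 3 code points
        // A range that cuts the pair in half leaves an unpaired surrogate,
        // which still counts as one code point.
        System.out.println(s.codePointCount(1, 2));   // 1
    }
}
```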
public int offsetByCodePoints(int index, int numberOfCodePoints)
Returns the index into this that is numberOfCodePoints away from index, taking into account surrogate pairs.
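One possible use of offsetByCodePoints is to translate positions counted in code points into char indices before calling substring. The following is a minimal sketch under that assumption; OffsetByCodePointsDemo and codePointSubstring are hypothetical names, not part of the library.

```java
public class OffsetByCodePointsDemo {
    // Hypothetical helper: substring whose bounds are counted in code
    // points rather than chars. Throws IndexOutOfBoundsException if the
    // string has fewer code points than requested.
    static String codePointSubstring(String s, int cpStart, int cpEnd) {
        int from = s.offsetByCodePoints(0, cpStart);
        int to = s.offsetByCodePoints(from, cpEnd - cpStart);
        return s.substring(from, to);
    }

    public static void main(String[] args) {
        String clef = new String(Character.toChars(0x1D11E));
        String s = "a" + clef + "b";
        System.out.println(s.offsetByCodePoints(0, 2));   // 3
        // A negative count moves backward from the given index.
        System.out.println(s.offsetByCodePoints(4, -2));  // 1
        System.out.println(codePointSubstring(s, 1, 2).equals(clef)); // true
    }
}
```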