Character sets and their encoding mechanisms are represented by specific classes within the java.nio.charset package:
Charset
A named mapping (such as US-ASCII or UTF-8) between sequences of 16-bit Unicode code units and sequences of bytes. This contains general information on the sequence encoding, simple mechanisms for encoding and decoding, and methods to create CharsetEncoder and CharsetDecoder objects for richer abilities.
CharsetEncoder
An object that can transform a sequence of 16-bit Unicode code units into a sequence of bytes in a specific character set. The encoder object also has methods to describe the encoding.
CharsetDecoder
An object that can transform a sequence of bytes in a specific character set into a sequence of 16-bit Unicode code units. The decoder object also has methods to describe the decoding.
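As a rough illustration of how these three classes fit together, the following sketch encodes a string with the simple Charset.encode method and decodes it again with an explicitly configured CharsetDecoder. The class name CharsetRoundTrip and its helper methods are hypothetical names chosen for this example, and wrapping the checked CharacterCodingException in an IllegalArgumentException is just one possible design choice.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;

// Hypothetical example class; not part of java.nio.charset.
public class CharsetRoundTrip {

    // Uses the simple encoding mechanism offered directly by Charset.
    public static byte[] encodeSimple(String text, String charsetName) {
        Charset cs = Charset.forName(charsetName);
        ByteBuffer buf = cs.encode(text);
        byte[] out = new byte[buf.remaining()];
        buf.get(out);
        return out;
    }

    // Uses a CharsetDecoder configured to report bad input
    // instead of silently substituting a replacement character.
    public static String decodeStrict(byte[] data, String charsetName) {
        CharsetDecoder decoder = Charset.forName(charsetName).newDecoder()
            .onMalformedInput(CodingErrorAction.REPORT)
            .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            return decoder.decode(ByteBuffer.wrap(data)).toString();
        } catch (CharacterCodingException e) {
            // Assumption for this sketch: surface the failure unchecked.
            throw new IllegalArgumentException("Invalid " + charsetName + " data", e);
        }
    }

    public static void main(String[] args) {
        // "h\u00e9llo" contains one character outside US-ASCII.
        byte[] bytes = encodeSimple("h\u00e9llo", "UTF-8");
        System.out.println(bytes.length + " bytes");
        System.out.println(decodeStrict(bytes, "UTF-8").equals("h\u00e9llo"));

        // The encoder can describe the encoding it performs.
        CharsetEncoder ascii = Charset.forName("US-ASCII").newEncoder();
        System.out.println("max bytes per char: " + ascii.maxBytesPerChar());
        System.out.println("can encode e-acute: " + ascii.canEncode('\u00e9'));
    }
}
```

The Charset methods are convenient when replacing bad input is acceptable; creating an encoder or decoder yourself is worthwhile mainly when you need to control error handling or to query the encoding, as canEncode does here.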
You can obtain a Charset via its own static forName method, though usually you will just specify the character set name to some other method (such as the String constructor or an I/O operation) rather than working with the Charset object directly. To test whether a given character set is supported, call the forName method; if it throws an UnsupportedCharsetException, the character set is not supported.
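The test described above might be sketched as follows. The class CharsetSupport and its isAvailable method are hypothetical names for this example. Treating a syntactically illegal name as "not supported" is an assumption of the sketch: forName reports such names with IllegalCharsetNameException, a separate exception from UnsupportedCharsetException.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.UnsupportedCharsetException;

// Hypothetical example class; not part of java.nio.charset.
public class CharsetSupport {

    // Returns true if forName can supply a Charset for the given name.
    public static boolean isAvailable(String name) {
        try {
            Charset.forName(name);
            return true;
        } catch (UnsupportedCharsetException e) {
            // Legal name, but no such character set in this JVM.
            return false;
        } catch (IllegalCharsetNameException e) {
            // Assumption: an ill-formed name is also treated as unavailable.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isAvailable("UTF-8"));
        System.out.println(isAvailable("X-NO-SUCH-CHARSET"));
        // Charset also provides a static isSupported method that
        // answers the same question for legal names without a try block.
        System.out.println(Charset.isSupported("UTF-8"));
    }
}
```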
You can find a list of available character sets from the static availableCharsets method, which returns a SortedMap that maps the name of each known character set to its Charset instance. For example, to print out the names of all the known character sets you can use: