1 byte to 3 bytes. If a byte begins with a 0 bit, then the lower 7 bits represent
one of the 128 ASCII characters. If the byte begins with the bits 110, then it is
the first of a 2-byte pair that represents a Unicode value from 128 to 2047. If a
byte begins with 1110, then it is the first of a 3-byte set that can hold any of the
other Unicode values.
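The byte patterns described above can be checked with a short sketch. It assumes the standard UTF-8 charset from java.nio.charset.StandardCharsets and characters that are not surrogates; the class and method names (Utf8LengthDemo, utf8Length, leadByteBits) are made up for this illustration.

```java
import java.nio.charset.StandardCharsets;

class Utf8LengthDemo {
    // Number of bytes UTF-8 uses for one 16-bit char (assumed not to be a surrogate).
    static int utf8Length(char c) {
        return String.valueOf(c).getBytes(StandardCharsets.UTF_8).length;
    }

    // First encoded byte written out as eight binary digits.
    static String leadByteBits(char c) {
        byte[] bytes = String.valueOf(c).getBytes(StandardCharsets.UTF_8);
        String bits = Integer.toBinaryString(bytes[0] & 0xFF);
        // Left-pad with zeros so the leading bits line up.
        return "00000000".substring(bits.length()) + bits;
    }

    public static void main(String[] args) {
        // An ASCII letter, a Latin-1 letter, and the euro sign.
        char[] samples = {'A', '\u00F6', '\u20AC'};
        for (char c : samples) {
            System.out.println("U+" + Integer.toHexString(c).toUpperCase()
                    + ": " + utf8Length(c) + " byte(s), lead byte " + leadByteBits(c));
        }
    }
}
```

The three samples fall into the 1-byte, 2-byte, and 3-byte ranges respectively, so their lead bytes should start with 0, 110, and 1110.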
Java typically runs on platforms that use 1-byte, extended-ASCII-encoded
characters. Therefore, text I/O with the local platform, or with other platforms
over the network, must convert between the encodings. As we mentioned in the
previous section, the original 1-byte streams were not convenient for this, so the
Reader/Writer classes for 2-byte I/O were introduced.
The default encoding is typically ISO-Latin-1, but your program can find the
local encoding with a static method in System:
String local_encoding = System.getProperty("file.encoding");
The encoding can be explicitly specified in some cases via the constructor, such
as in the following file output:
FileOutputStream out_file = new FileOutputStream("Turkish.txt");
OutputStreamWriter file_writer = new OutputStreamWriter(out_file, "8859_3");
A similar overloaded constructor is available for InputStreamReader. See
Harold [2] for more information about character encoding in Java.
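As a rough sketch of how the two constructors work together, the following example writes text through an OutputStreamWriter and reads it back through an InputStreamReader using the same named encoding. To stay self-contained it uses in-memory byte streams rather than a file, and the class and method names (EncodingRoundTrip, encode, decode) are hypothetical. It also assumes that ISO-8859-3 may not be installed in every runtime, so it falls back to ISO-8859-1.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;

class EncodingRoundTrip {
    // Encodes text to bytes with the named charset via OutputStreamWriter.
    static byte[] encode(String text, String charsetName) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes, charsetName)) {
            writer.write(text);
        } catch (IOException e) {
            // Also covers UnsupportedEncodingException for an unknown name.
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Decodes bytes back to text with the named charset via InputStreamReader.
    static String decode(byte[] data, String charsetName) {
        StringBuilder text = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(data), charsetName)) {
            int c;
            while ((c = reader.read()) != -1) {
                text.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        String sample = "\u00F6\u00E8";
        // Fall back to Latin-1 if this runtime does not provide ISO-8859-3.
        String name = Charset.isSupported("ISO-8859-3") ? "ISO-8859-3" : "ISO-8859-1";
        byte[] data = encode(sample, name);
        System.out.println(name + ": " + data.length + " bytes, round trip ok = "
                + sample.equals(decode(data, name)));
    }
}
```

Reading with a different encoding than the one used for writing would generally produce different characters, which is why both ends name the encoding explicitly here.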
If a character is not available on your keyboard, it can be specified in a Java
program by its Unicode value. This value is represented with four hexadecimal
digits preceded by the \u escape sequence. For example, the “ö” character
is given by \u00F6 and “è” by \u00E8.
We note finally that even the 65,536 entries of the version of Unicode used by
Java are not enough to encompass all of the language characters and symbol sets
in the world. Therefore, Java will gradually make the transition to Unicode 4.0,
which uses up to 32 bits per character. This is a challenge for many reasons,
including the fact that the char primitive is only 16 bits wide. Java 5.0 has some
tools for dealing with 32-bit supplementary characters, but we don't have space
here to discuss them. We refer the reader to the article by Lindenberg and Okutsu
for further information on 32-bit character support in Java [6].
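To make the escape notation and the 16-bit limit of char concrete, here is a minimal sketch. The class and method names (UnicodeEscapeDemo, toEscape, charUnits) are invented for illustration, and the supplementary code point U+1D11E (a musical G clef) is simply one convenient example. It relies only on methods of Character and String that have been available since Java 5.0.

```java
class UnicodeEscapeDemo {
    // Formats a 16-bit char in the four-hex-digit escape form used in Java source.
    static String toEscape(char c) {
        return String.format("\\u%04X", (int) c);
    }

    // Number of 16-bit char units that one code point occupies in a String.
    static int charUnits(int codePoint) {
        return Character.charCount(codePoint);
    }

    public static void main(String[] args) {
        // Characters written by their Unicode values instead of typed directly.
        char oUmlaut = '\u00F6';
        char eGrave = '\u00E8';
        System.out.println(toEscape(oUmlaut) + " " + toEscape(eGrave));

        // A supplementary character does not fit in one char.
        int clef = 0x1D11E;
        String s = new String(Character.toChars(clef));
        System.out.println("char units: " + s.length()
                + ", code points: " + s.codePointCount(0, s.length()));
    }
}
```

The second part shows why a 16-bit char is not enough on its own: the single supplementary character is stored as a pair of char values.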
9.8 Object I/O
So far we have seen that we can do I/O with primitive data types and with text,
which involves String objects. By means of the ObjectInputStream and
ObjectOutputStream classes you can also do I/O with other types of objects. The
writeObject(Object) method in ObjectOutputStream grabs the data
from the fields of an existing object and sends that data through the stream.
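A minimal sketch of this idea follows. It assumes a hypothetical Point class that implements java.io.Serializable (which ObjectOutputStream requires of the objects it writes), and it uses in-memory byte streams in place of a file or socket. The matching readObject() method of ObjectInputStream rebuilds the object on the receiving side.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class ObjectIoDemo {
    // Hypothetical example class; it must be Serializable to be written.
    static class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // Writes the object to a byte array and reads a copy back from it.
    static Point roundTrip(Point original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original); // sends the field data through the stream
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Point) in.readObject(); // rebuilds a new object from that data
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("serialization round trip failed", e);
        }
    }

    public static void main(String[] args) {
        Point copy = roundTrip(new Point(3, 4));
        System.out.println(copy.x + ", " + copy.y);
    }
}
```

The object that comes back is a separate instance carrying the same field values, not a reference to the original.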