Input and Output - Java

Solution

Convert the text to or from internal Unicode by specifying a converter when you construct an

InputStreamReader or PrintWriter.

Discussion

Classes InputStreamReader and OutputStreamWriter are the bridge from byte-oriented

Streams to character-based Readers and Writers. These classes read or write bytes and translate them to

or from characters according to a specified character encoding. The UTF-16 character set

used inside Java (char and String types) is a 16-bit character set. But most character

sets—such as ASCII, Swedish, Spanish, Greek, Turkish, and many others—use only a small

subset of that. In fact, many European language character sets fit nicely into 8-bit characters.

Even the larger character sets (script-based and pictographic languages) don't all use the

same bit values for each particular character. The encoding, then, is a mapping between Java

characters and an external storage format for characters drawn from a particular national or

linguistic character set.
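To make the idea of an encoding as a mapping concrete, here is a minimal sketch that prints the bytes produced for one character under three encodings. The class name EncodingSizes and the helper hex are made up for this illustration; String.getBytes(Charset) and the StandardCharsets constants are standard JDK API.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original recipe.
class EncodingSizes {

    /** Returns the bytes of text in the given charset as space-separated hex. */
    static String hex(String text, Charset cs) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(cs)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "\u00E9"; // LATIN SMALL LETTER E WITH ACUTE
        System.out.println("ISO-8859-1: " + hex(text, StandardCharsets.ISO_8859_1)); // E9
        System.out.println("UTF-8:      " + hex(text, StandardCharsets.UTF_8));      // C3 A9
        System.out.println("UTF-16BE:   " + hex(text, StandardCharsets.UTF_16BE));   // 00 E9
    }
}
```

The same Java character comes out as one byte in an 8-bit European encoding and as two bytes, with different values, in UTF-8 and UTF-16.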

To simplify matters, the InputStreamReader and OutputStreamWriter constructors are the

only places where you can specify the name of an encoding to be used in this translation. If

you do not specify an encoding, the platform's (or user's) default encoding is used.
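As a rough illustration of why the encoding name matters, the sketch below decodes the same two bytes through an InputStreamReader twice, each time with a different encoding name. The class DefaultVsExplicit and its decode helper are hypothetical names chosen for this example, and the in-memory ByteArrayInputStream merely stands in for a file.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

// Hypothetical demo class, not part of the original recipe.
class DefaultVsExplicit {

    /** Decodes bytes through an InputStreamReader using the named encoding. */
    static String decode(byte[] bytes, String encodingName) {
        try (Reader in = new InputStreamReader(new ByteArrayInputStream(bytes), encodingName)) {
            StringBuilder sb = new StringBuilder();
            int c;
            while ((c = in.read()) != -1) {
                sb.append((char) c);
            }
            return sb.toString();
        } catch (IOException e) {
            // Also covers UnsupportedEncodingException for an unknown name.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = {(byte) 0xC3, (byte) 0xA9}; // U+00E9 encoded as UTF-8
        // What you get when no encoding is named; varies by platform and JDK version.
        System.out.println("Platform default: " + Charset.defaultCharset());
        System.out.println("As UTF-8:      " + decode(bytes, "UTF-8"));      // one character
        System.out.println("As ISO-8859-1: " + decode(bytes, "ISO-8859-1")); // two characters
    }
}
```

Decoded with the right name, the bytes yield a single character; decoded as ISO-8859-1, each byte becomes its own character, which is the familiar garbled-accent effect.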

PrintWriters, BufferedReaders, and the like all use whatever encoding the

InputStreamReader or OutputStreamWriter class uses. Because these bridge classes only

accept Stream arguments in their constructors, the implication is that if you want to specify a

nondefault converter to read or write a file on disk, you must start by constructing not a

FileReader or FileWriter, but a FileInputStream or FileOutputStream!

// UseConverters.java
BufferedReader fromKanji = new BufferedReader(
    new InputStreamReader(new FileInputStream("kanji.txt"), "EUC_JP"));
PrintWriter toSwedish = new PrintWriter(
    new OutputStreamWriter(new FileOutputStream("sverige.txt"), "Cp278"));
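For a version that can be run as is, the following sketch applies the same stream-wrapping pattern to a temporary file and reads the text back. It is an illustration under stated assumptions: the class ConverterRoundTrip and the method roundTrip are invented names, and the encodings ISO-8859-1 and US-ASCII are used in place of EUC_JP and Cp278 only because every Java runtime is required to support them.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class, not part of the original recipe.
class ConverterRoundTrip {

    /** Writes one line of text to a temp file in the named encoding, then reads it back. */
    static String roundTrip(String text, String encodingName) {
        try {
            Path file = Files.createTempFile("converter-demo", ".txt");
            try {
                // Byte stream first, then the bridge class that names the encoding.
                try (OutputStream fos = new FileOutputStream(file.toFile());
                     PrintWriter out = new PrintWriter(
                             new OutputStreamWriter(fos, encodingName))) {
                    out.print(text);
                }
                try (InputStream fis = new FileInputStream(file.toFile());
                     BufferedReader in = new BufferedReader(
                             new InputStreamReader(fis, encodingName))) {
                    return in.readLine();
                }
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String word = "Sm\u00F6rg\u00E5sbord";
        System.out.println(roundTrip(word, "ISO-8859-1")); // survives unchanged
        System.out.println(roundTrip(word, "US-ASCII"));   // prints Sm?rg?sbord
    }
}
```

The second call shows what happens when the target encoding lacks a character: OutputStreamWriter substitutes the encoding's replacement byte, so the accented letters come back as question marks.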

Not that it would necessarily make sense to read a single file from Kanji and output it in a

Swedish encoding; for one thing, most fonts would not have all the characters of both character

sets, and, at any rate, the Swedish encoding certainly has far fewer characters in it than

the Kanji encoding. Besides, if that were all you wanted, you could use a JDK tool with the

ill-fitting name native2ascii (see its documentation for details). A list of the supported encod-
