Character data
I/O usually relates to text, and text consists of characters. Since data
is stored in computer memory as binary values, characters are represented
by a conventional numeric encoding. In Chapter 3, you saw that in the
ASCII encoding the letter A is mapped to the number 65, the letter B to
the number 66, the space to the number 32, and the digit 1 to the number
49 (see Figure 3-3).
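The mapping between characters and numbers can be checked directly in
Java, since a char converts to and from an int. The following is a minimal
sketch; the class name CharCodes and the helper methods codeOf and charOf
are hypothetical names chosen for illustration.

```java
public class CharCodes {
    // Returns the numeric code that stands for a character.
    public static int codeOf(char c) {
        return (int) c;
    }

    // Returns the character that a numeric code stands for.
    public static char charOf(int code) {
        return (char) code;
    }

    public static void main(String[] args) {
        System.out.println(codeOf('A'));  // 65
        System.out.println(codeOf('B'));  // 66
        System.out.println(codeOf(' '));  // 32
        System.out.println(codeOf('1'));  // 49
        System.out.println(charOf(66));   // B
    }
}
```

Note that the digit character '1' has the code 49, not the numeric value 1.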
Code must keep track of whether a stored value represents a binary
number, a portion of a binary number, or an alphanumeric character. Some
I/O devices are designed to assume that data always represents some
specific character encoding. For example, when we send the value 66 to
the console device, it knows to look up a bitmap for the letter B and
displays it on the screen.
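The difference between sending a value as a raw character code and
sending it as a number can be sketched as follows. This example assumes a
console that interprets bytes in an ASCII-compatible encoding; the class
name ConsoleByte and its two helper methods are hypothetical.

```java
import java.io.PrintStream;

public class ConsoleByte {
    // Sends one raw byte; the receiving device decides how to display it.
    public static void sendRaw(PrintStream out, int value) {
        out.write(value);
        out.flush();
    }

    // Sends the decimal digits of the value as text.
    public static void sendAsNumber(PrintStream out, int value) {
        out.print(value);
        out.flush();
    }

    public static void main(String[] args) {
        sendRaw(System.out, 66);       // an ASCII-compatible console shows: B
        System.out.println();
        sendAsNumber(System.out, 66);  // the console shows the two digits: 66
        System.out.println();
    }
}
```

In the second call the program itself converts the number into the
characters '6' and '6' before anything reaches the device.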
For many years computer technology assumed that character data
consisted of the ten decimal digits, the upper- and lower-case letters of
the English alphabet, and a few dozen additional symbols such as
punctuation marks. Some systems later added a few other characters that
were necessary in the western European languages and in mathematical
expressions. However, these character sets do not allow representing
characters in Arabic, Japanese, Chinese, Russian, Greek, and many other
languages.
Java was conceived as a universal language. It supports dozens of
character sets, including ASCII, ISO Latin-1, and Unicode.
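One way to see the difference between these character sets is to ask each
one whether it can represent a given character. The sketch below uses the
standard java.nio.charset classes and takes UTF-8 as one example of a
Unicode encoding, which is an assumption of this example rather than
something the text specifies. The class name CharsetCheck and the method
canRepresent are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetCheck {
    // True if the character set has a code for the given character.
    public static boolean canRepresent(Charset charset, char c) {
        return charset.newEncoder().canEncode(c);
    }

    public static void main(String[] args) {
        char latinA = 'A';
        char eAcute = '\u00e9';  // e with an acute accent (western European)
        char omega = '\u03a9';   // Greek capital omega

        Charset[] sets = {
            StandardCharsets.US_ASCII,
            StandardCharsets.ISO_8859_1,
            StandardCharsets.UTF_8
        };
        for (Charset cs : sets) {
            System.out.println(cs.name()
                + ": A=" + canRepresent(cs, latinA)
                + ", e-acute=" + canRepresent(cs, eAcute)
                + ", omega=" + canRepresent(cs, omega));
        }

        // The number of installed character sets varies by Java runtime.
        System.out.println("Character sets available: "
            + Charset.availableCharsets().size());
    }
}
```

ASCII accepts only the letter A, ISO Latin-1 also accepts the accented
letter, and only the Unicode encoding accepts all three.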
The simplest and most limited Java character set is defined by the
American Standard Code for Information Interchange, or ASCII, discussed
in Chapters 3 and 4. This set contains 128 characters in the range 0 to
127. Some of these are control codes; for example, the value 10 is
interpreted as a linefeed, the value 13 as a carriage return, and the
value 9 as a tabulation code. The digits 0 to 9 are represented by the
values 48 to 57. The upper-case letters A through Z of the English
alphabet are encoded in the values 65 to 90. The lower-case letters are
the values 97 to 122. The value 32 represents a space. The remaining
values are used for symbols, such as
!"#$%&'()*+,-./:;?{|} and ~.
On the Web
The program AsciiSet.java, in the Chapter 19 folder at
www.crcpress.com, displays the characters in the ASCII set.
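The ASCII ranges described above can be sketched in a few lines of Java.
This is not the book's AsciiSet.java, which is not reproduced here; the
class name AsciiRanges and the method classify are hypothetical. The
sketch treats the values 0 to 31 and 127 as control codes.

```java
public class AsciiRanges {
    // Names the ASCII category that a numeric code falls into.
    public static String classify(int code) {
        if (code < 0 || code > 127) return "not ASCII";
        if (code == 32) return "space";
        if (code < 32 || code == 127) return "control";
        if (code >= 48 && code <= 57) return "digit";
        if (code >= 65 && code <= 90) return "upper-case letter";
        if (code >= 97 && code <= 122) return "lower-case letter";
        return "symbol";
    }

    public static void main(String[] args) {
        // Print the printable characters (codes 32 to 126), 16 per row.
        for (int code = 32; code <= 126; code++) {
            System.out.print((char) code);
            if ((code - 32) % 16 == 15) {
                System.out.println();
            }
        }
        System.out.println();

        System.out.println(classify(10));   // control
        System.out.println(classify(49));   // digit
        System.out.println(classify(66));   // upper-case letter
        System.out.println(classify(126));  // symbol
    }
}
```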
A second character set supported by Java is defined by the
International Organization for Standardization Latin-1 standard,
commonly referred to as ISO