Check Point
2.25 How would you write the following arithmetic expression?
$$\frac{-b + \sqrt{b^2 - 4ac}}{2a}$$
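One possible way to write this expression in Java is sketched below. The class name QuadraticRoot and the method plusRoot are hypothetical names chosen for illustration. Java has no exponent operator, so the square is written as b * b, and the parentheses around the numerator and the denominator are required to keep the division from binding too early.

```java
public class QuadraticRoot {
    // Hypothetical helper: evaluates (-b + sqrt(b^2 - 4ac)) / (2a).
    static double plusRoot(double a, double b, double c) {
        return (-b + Math.sqrt(b * b - 4 * a * c)) / (2 * a);
    }

    public static void main(String[] args) {
        // Sample values: a = 1, b = -3, c = 2
        System.out.println(plusRoot(1, -3, 2)); // prints 2.0
    }
}
```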
Key Point
A character data type represents a single character.
In addition to processing numeric values, you can process characters in Java. The character data type, char, is used to represent a single character. A character literal is enclosed in single quotation marks. Consider the following code:
    char letter = 'A';
    char numChar = '4';
The first statement assigns character A to the char variable letter. The second statement assigns digit character 4 to the char variable numChar.
Caution
A string literal must be enclosed in double quotation marks (" "). A character literal is a single character enclosed in single quotation marks (' '). Therefore, "A" is a string, but 'A' is a character.
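The difference between the two kinds of literal can be seen in a short sketch. The class name CharVersusString and the helper sameCharacter are hypothetical, introduced only for this illustration.

```java
public class CharVersusString {
    // Hypothetical helper: true if the string holds exactly the one character c.
    static boolean sameCharacter(char c, String s) {
        return s.length() == 1 && s.charAt(0) == c;
    }

    public static void main(String[] args) {
        char letter = 'A';   // a character literal, in single quotation marks
        String text = "A";   // a string literal, in double quotation marks
        // char wrong = "A"; // would not compile: a String is not a char

        System.out.println(sameCharacter(letter, text)); // prints true
        System.out.println(text.length());               // prints 1
    }
}
```

Even though text holds a single character, it is still a String, so the character has to be extracted with charAt before it can be compared with a char.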
2.17.1 Unicode and ASCII code
Computers use binary numbers internally. A character is stored in a computer as a sequence of 0s and 1s. Mapping a character to its binary representation is called encoding. There are different ways to encode a character. How characters are encoded is defined by an encoding scheme.
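To make the idea of a character as a sequence of 0s and 1s concrete, the following sketch prints the bit pattern of a Java char, assuming Java's 16-bit char representation. The class name CharToBinary and the helper toBits are hypothetical.

```java
public class CharToBinary {
    // Hypothetical helper: the 16-bit pattern of a char as a string of 0s and 1s.
    static String toBits(char ch) {
        String bits = Integer.toBinaryString(ch);
        // Left-pad with zeros so that all 16 bits are shown.
        return String.format("%16s", bits).replace(' ', '0');
    }

    public static void main(String[] args) {
        System.out.println(toBits('A')); // prints 0000000001000001
        System.out.println(toBits('4')); // prints 0000000000110100
    }
}
```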
Java supports Unicode, an encoding scheme established by the Unicode Consortium to support the interchange, processing, and display of written texts in the world's diverse languages. Unicode was originally designed as a 16-bit character encoding. The primitive data type char was intended to take advantage of this design by providing a simple data type that could hold any character. However, it turned out that the 65,536 characters possible in a 16-bit encoding are not sufficient to represent all the characters in the world. The Unicode standard therefore has been extended to allow up to 1,112,064 characters. Those characters that go beyond the original 16-bit limit are called supplementary characters. Java supports the supplementary characters. The processing and representing of supplementary characters are beyond the scope of this topic. For simplicity, this topic considers only the original 16-bit Unicode characters. These characters can be stored in a char type variable.
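Although supplementary characters are not covered further here, a brief sketch shows where the 16-bit limit falls. It relies on the standard Character.charCount method; the class name SupplementaryDemo, the helper charsNeeded, and the sample code point 0x1F600 are choices made for this illustration.

```java
public class SupplementaryDemo {
    // Hypothetical helper: how many char values are needed for one code point.
    static int charsNeeded(int codePoint) {
        return Character.charCount(codePoint);
    }

    public static void main(String[] args) {
        // The largest value a single char can hold.
        System.out.println((int) Character.MAX_VALUE); // prints 65535

        // A character within the original 16-bit range fits in one char.
        System.out.println(charsNeeded(0x6B22));       // prints 1

        // A supplementary character needs two char values.
        System.out.println(charsNeeded(0x1F600));      // prints 2
    }
}
```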
A 16-bit Unicode character takes two bytes. It is written as \u followed by four hexadecimal digits, running from \u0000 to \uFFFF. Hexadecimal numbers are introduced in Appendix F, Number Systems. For example, the English word welcome is translated into Chinese using two characters, 欢迎. The Unicodes of these two characters are \u6B22\u8FCE.
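The escape notation can be used directly in character literals, as the following sketch suggests. The class name UnicodeEscape and the helper toEscape are hypothetical; the helper simply formats a char's numeric value as four hexadecimal digits.

```java
public class UnicodeEscape {
    // Hypothetical helper: formats a char as a backslash, the letter u,
    // and four hexadecimal digits.
    static String toEscape(char ch) {
        return String.format("\\u%04X", (int) ch);
    }

    public static void main(String[] args) {
        char first = '\u6B22';   // first character of the Chinese word for welcome
        char second = '\u8FCE';  // second character

        System.out.println((int) first);                         // prints 27426
        System.out.println(toEscape(first) + toEscape(second));  // prints \u6B22\u8FCE
    }
}
```

Casting a char to int exposes the number behind the escape: hexadecimal 6B22 is decimal 27426.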
Listing 2.9 gives a program that displays two Chinese characters and three Greek letters.
LISTING 2.9 DisplayUnicode.java
import javax.swing.JOptionPane;

public class DisplayUnicode {
  public static void main(String[] args) {
    JOptionPane.showMessageDialog(null,
      "\u6B22\u8FCE \u03b1 \u03b2 \u03b3",
      "\u6B22\u8FCE Welcome",