Example 1: Exponentiation to the base 2

Fraction bit weights run from 2^-1 (leftmost bit) to 2^-16 (rightmost bit).

Unnormalized:  0  1010100 . 0000000000011011
  Sign: +
  Excess 64 exponent is 84 - 64 = 20
  Fraction is 1 × 2^-12 + 1 × 2^-13 + 1 × 2^-15 + 1 × 2^-16
  Value = 2^20 (1 × 2^-12 + 1 × 2^-13 + 1 × 2^-15 + 1 × 2^-16) = 432

To normalize, shift the fraction left 11 bits and subtract 11 from the exponent.

Normalized:    0  1001001 . 1101100000000000
  Sign: +
  Excess 64 exponent is 73 - 64 = 9
  Fraction is 1 × 2^-1 + 1 × 2^-2 + 1 × 2^-4 + 1 × 2^-5
  Value = 2^9 (1 × 2^-1 + 1 × 2^-2 + 1 × 2^-4 + 1 × 2^-5) = 432

Example 2: Exponentiation to the base 16

Fraction digit weights run from 16^-1 (leftmost hexadecimal digit) to 16^-4 (rightmost).

Unnormalized:  0  1000101 . 0000 0000 0001 1011
  Sign: +
  Excess 64 exponent is 69 - 64 = 5
  Fraction is 1 × 16^-3 + B × 16^-4
  Value = 16^5 (1 × 16^-3 + B × 16^-4) = 432

To normalize, shift the fraction left 2 hexadecimal digits, and subtract 2 from the exponent.

Normalized:    0  1000011 . 0001 1011 0000 0000
  Sign: +
  Excess 64 exponent is 67 - 64 = 3
  Fraction is 1 × 16^-1 + B × 16^-2
  Value = 16^3 (1 × 16^-1 + B × 16^-2) = 432

Figure B-3. Examples of normalized floating-point numbers.
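
The two examples in Figure B-3 can be checked with a short Python sketch. It assumes the layout the figure implies: a sign bit, an excess 64 exponent, and a 16-bit fraction. The names `value`, `normalize`, `FRACTION_BITS`, and `EXCESS` are made up for this illustration, and exponent underflow during normalization is not handled.

```python
from fractions import Fraction

FRACTION_BITS = 16   # fraction width assumed from Figure B-3
EXCESS = 64          # exponent bias used in the figure


def value(sign, exponent, fraction, radix):
    """Return the exact value encoded by a (sign, exponent, fraction) triple."""
    magnitude = (Fraction(fraction, 2 ** FRACTION_BITS)
                 * Fraction(radix) ** (exponent - EXCESS))
    return -magnitude if sign else magnitude


def normalize(sign, exponent, fraction, radix):
    """Shift the fraction left one radix digit at a time until its
    leftmost digit is nonzero, decrementing the exponent for each shift."""
    digit_bits = radix.bit_length() - 1       # 1 bit for radix 2, 4 for radix 16
    top_shift = FRACTION_BITS - digit_bits    # bit offset of the leftmost digit
    if fraction == 0:
        return sign, exponent, fraction       # zero cannot be normalized
    while (fraction >> top_shift) == 0:
        fraction = (fraction << digit_bits) & (2 ** FRACTION_BITS - 1)
        exponent -= 1
    return sign, exponent, fraction


# Example 1 (radix 2): 11 one-bit shifts, exponent 84 -> 73
print(normalize(0, 84, 0b0000000000011011, 2) == (0, 73, 0b1101100000000000))
# Example 2 (radix 16): 2 hexadecimal-digit shifts, exponent 69 -> 67
print(normalize(0, 69, 0x001B, 16) == (0, 67, 0x1B00))
# The value is 432 before and after normalization
print(value(0, 84, 0b0000000000011011, 2), value(0, 67, 0x1B00, 16))
```

Using `Fraction` keeps the arithmetic exact, so the check that normalization leaves the value unchanged does not itself depend on floating-point rounding.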
To rectify this situation, in the late 1970s IEEE set up a committee to
standardize floating-point arithmetic. The goal was not only to permit
floating-point data to be exchanged among different computers but also to
provide hardware designers with a model known to be correct. The resulting work
led to IEEE Standard 754 (IEEE, 1985). Most CPUs these days (including the Intel
and JVM ones studied in this topic) have floating-point instructions that
conform to the IEEE floating-point standard. Unlike many standards, which tend
to be wishy-washy compromises that please no one, this one is not bad, in large
part because it was primarily the work of one person, Berkeley math professor
William Kahan. The standard will be described in the remainder of this section.
The standard defines three formats: single precision (32 bits), double precision
(64 bits), and extended precision (80 bits). The extended-precision format is
intended to reduce roundoff errors. It is used primarily inside floating-point
arithmetic units, so we will not discuss it further. Both the single- and
double-precision formats use radix 2 for fractions and excess notation for
exponents. The formats are shown in Fig. B-4.
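
As a rough illustration of the single-precision layout, the following Python sketch splits a number into its fields. It assumes the IEEE 754 single-precision field widths of 1 sign bit, 8 exponent bits stored in excess 127 form, and 23 fraction bits; the helper name `single_fields` is made up for this example.

```python
import struct


def single_fields(x):
    """Round x to IEEE 754 single precision and return (sign, exponent, fraction).

    The exponent is returned as stored, that is, in excess 127 form.
    """
    # Pack as a big-endian 32-bit float, then reinterpret as an unsigned integer.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                 # 1 bit: 0 is positive, 1 is negative
    exponent = (bits >> 23) & 0xFF    # 8 bits, excess 127
    fraction = bits & 0x7FFFFF        # 23 bits, radix 2
    return sign, exponent, fraction


sign, exponent, fraction = single_fields(432.0)
print(sign, exponent - 127, format(fraction, "023b"))
# prints: 0 8 10110000000000000000000
```

For 432 the true exponent is 8 and the stored fraction bits are 1011 followed by zeros, since 432 = 1.1011 (binary) × 2^8 and normalized IEEE numbers do not store the leading 1.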
Both formats start with a sign bit for the number as a whole, 0 being positive
and 1 being negative. Next comes the exponent, using excess 127 for single
 