B.1 PRINCIPLES OF FLOATING POINT
One way of separating the range from the precision is to express numbers in
the familiar scientific notation

$$n = f \times 10^{e}$$

where $f$ is called the fraction, or mantissa, and $e$ is a positive or negative integer
called the exponent. The computer version of this notation is called floating
point. Some examples of numbers expressed in this form are

$$3.14 = 0.314 \times 10^{1} = 3.14 \times 10^{0}$$
$$0.000001 = 0.1 \times 10^{-5} = 1.0 \times 10^{-6}$$
$$1941 = 0.1941 \times 10^{4} = 1.941 \times 10^{3}$$

The range is effectively determined by the number of digits in the exponent and the
precision is determined by the number of digits in the fraction. Because there is
more than one way to represent a given number, one form is usually chosen as the
standard. In order to investigate the properties of this method of representing
numbers, consider a representation, R, with a signed three-digit fraction in the range
$0.1 \le |f| < 1$ or zero and a signed two-digit exponent. These numbers range in
magnitude from $+0.100 \times 10^{-99}$ to $+0.999 \times 10^{+99}$, a span of nearly 199 orders of
magnitude, yet only five digits and two signs are needed to store a number.
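As an illustration of the representation R described above, the following Python sketch normalizes a decimal string into a sign, a three-digit fraction, and a two-digit exponent. The function name `to_r`, the tuple layout, and the choice to truncate (rather than round) extra fraction digits are assumptions made for this example; the text does not specify them.

```python
from decimal import Decimal

def to_r(text):
    """Convert a decimal string to (sign, fraction_digits, exponent).

    The value represented is sign * 0.ddd * 10**exponent, where ddd is
    fraction_digits. Hypothetical helper, not part of any library.
    """
    d = Decimal(text)
    if d == 0:
        return (1, 0, 0)  # zero is allowed as a special case (sign choice is an assumption)
    sign = -1 if d < 0 else 1
    d = abs(d)
    # adjusted() gives k for the form d.ddd x 10^k; the form 0.ddd needs k + 1.
    exponent = d.adjusted() + 1
    if not -99 <= exponent <= 99:
        raise ValueError("exponent does not fit in two digits")
    # Shift three fraction digits left of the decimal point and truncate the rest
    # (truncation is an assumption; rounding is another possible policy).
    fraction_digits = int(d.scaleb(3 - exponent))
    return (sign, fraction_digits, exponent)

print(to_r("3.14"))      # (1, 314, 1)   i.e. 0.314 x 10^1
print(to_r("0.000001"))  # (1, 100, -5)  i.e. 0.100 x 10^-5
print(to_r("1941"))      # (1, 194, 4)   i.e. 0.194 x 10^4
```

Note that 1941 comes back as 0.194 × 10^4 = 1940: with only three fraction digits, the fourth significant digit is lost, which is the sense in which the fraction length determines the precision.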
Floating-point numbers can be used to model the real-number system of mathematics,
although there are some differences. Figure B-1 gives an exaggerated
schematic of the real number line. The real line is divided up into seven regions:
1. Large negative numbers less than $-0.999 \times 10^{99}$.
2. Negative numbers between $-0.999 \times 10^{99}$ and $-0.100 \times 10^{-99}$.
3. Small negative numbers with magnitudes less than $0.100 \times 10^{-99}$.
4. Zero.
5. Small positive numbers with magnitudes less than $0.100 \times 10^{-99}$.
6. Positive numbers between $0.100 \times 10^{-99}$ and $0.999 \times 10^{99}$.
7. Large positive numbers greater than $0.999 \times 10^{99}$.
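The seven regions listed above can be sketched as a small classifier. This is a minimal Python example, assuming the two-digit-exponent, three-digit-fraction limits given in the text; the names `region`, `MAX_MAG`, and `MIN_MAG` are hypothetical and introduced only for illustration.

```python
from decimal import Decimal

MAX_MAG = Decimal("0.999E99")    # largest representable magnitude
MIN_MAG = Decimal("0.100E-99")   # smallest representable nonzero magnitude

def region(text):
    """Return the region number (1 to 7) of the real line containing the value."""
    x = Decimal(text)
    if x == 0:
        return 4
    mag = abs(x)
    if mag > MAX_MAG:              # too large in magnitude
        return 7 if x > 0 else 1
    if mag < MIN_MAG:              # nonzero but too small in magnitude
        return 5 if x > 0 else 3
    return 6 if x > 0 else 2       # within the representable span

print(region("3.14"))     # 6
print(region("-1941"))    # 2
print(region("1E120"))    # 7
print(region("-1E-120"))  # 3
```

A value such as 10^120 lands in region 7 and 10^-120 in region 5, so neither can be stored with a two-digit exponent.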
One major difference between the set of numbers representable with three fraction
and two exponent digits and the real numbers is that the former cannot be used
to express any numbers in regions 1, 3, 5, or 7. If the result of an arithmetic
operation yields a number in region 1 or 7 (for example, $10^{60} \times 10^{60} = 10^{120}$), an
overflow error will occur and the answer will be incorrect. This is due
to the finite nature of the representation for numbers and is thus unavoidable.
Similarly, a result in region 3 or 5 cannot be expressed either. This situation is called
 
 