Digital Signal Processing Reference
In-Depth Information
It can be shown for a negative fraction x in 1's complement notation that the
truncation error is always non-negative and spread over the range
$$ 0 \le e_{qt} \le \left(2^{-b} - 2^{-k}\right). \tag{4.8} $$
The same truncation error is found in the sign-magnitude notation, as the
magnitude of the truncated number Q[x] is smaller than that of the original
negative number x. The truncation error of the 1's complement and sign-magnitude
notations is shown in Fig. 4.3b.
However, negative numbers represented in 2's complement notation have a
different truncation error, which is always non-positive. It can be shown that
the error range in this case is
$$ -\left(2^{-b} - 2^{-k}\right) \le e_{qt} \le 0, \tag{4.9} $$
as shown in Fig. 4.3c.
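The sign of the truncation error for negative numbers can be checked numerically. The following Python sketch is only an illustration under stated assumptions: x is taken to be a fraction with k fractional bits that is truncated to b < k fractional bits, the error is taken as Q[x] - x, and the function names are hypothetical. It relies on the observation that dropping low-order bits in sign-magnitude or 1's complement form shrinks the magnitude, while in 2's complement form it rounds toward minus infinity.

```python
from fractions import Fraction
import math

def truncate_sign_magnitude(x, b):
    """Truncate |x| to b fractional bits and restore the sign.

    Assumption: this also models 1's complement, where truncating the
    complemented bit pattern shrinks the magnitude in the same way.
    """
    scale = 2 ** b
    magnitude = Fraction(math.floor(abs(x) * scale), scale)
    return magnitude if x >= 0 else -magnitude

def truncate_twos_complement(x, b):
    """Dropping bits of a 2's complement pattern rounds toward minus infinity."""
    scale = 2 ** b
    return Fraction(math.floor(x * scale), scale)

def negative_error_range(truncate, k, b):
    """Min and max of Q[x] - x over all negative fractions with k fractional bits."""
    step = Fraction(1, 2 ** k)
    errors = [truncate(-n * step, b) - (-n * step) for n in range(1, 2 ** k)]
    return min(errors), max(errors)

K, B = 8, 4  # assumed word lengths before and after truncation
bound = Fraction(1, 2 ** B) - Fraction(1, 2 ** K)  # 2^-b - 2^-k

sm_range = negative_error_range(truncate_sign_magnitude, K, B)
tc_range = negative_error_range(truncate_twos_complement, K, B)

print("sign-magnitude / 1's complement:", sm_range)
print("2's complement:                 ", tc_range)
print("bound 2^-b - 2^-k:              ", bound)
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the extreme errors can be compared with the bound without floating-point rounding getting in the way.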
4.3.2 Quantization of Floating-Point Numbers
In fixed-point variables, the increment between adjacent numbers is always the
same. In floating-point format, on the other hand, the increment between adjacent
numbers varies considerably over the allowable number range. Therefore, it is
much more informative to consider what is known as the relative quantization
error $e_f$. Considering the floating-point format in Fig. 4.2, it is apparent
that only the mantissa M is quantized, so the quantized value is given by
$Q(x) = 2^E Q(M)$. The relative quantization error, $e_f$, is defined as
$$ e_f = \frac{Q(M) - M}{M}. \tag{4.10} $$
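To see why the relative error is the more informative measure, the following Python sketch rounds the mantissa of a number to b fractional bits and compares the result for two inputs that differ only in their exponent. It is an illustration, not the book's procedure: it assumes the normalization 0.5 <= |M| < 1 returned by `math.frexp`, which may differ from the format in Fig. 4.2, and `quantize_float` and `relative_error` are hypothetical names.

```python
import math

def quantize_float(x, b):
    """Round the mantissa of x to b fractional bits; the exponent is kept."""
    mantissa, exponent = math.frexp(x)  # x = mantissa * 2**exponent
    q_mantissa = round(mantissa * 2 ** b) / 2 ** b
    return math.ldexp(q_mantissa, exponent)

def relative_error(x, b):
    """Relative quantization error (Q(x) - x) / x."""
    return (quantize_float(x, b) - x) / x

B = 8  # assumed number of mantissa bits kept
x_small = 0.3141592653589793
x_large = x_small * 2 ** 20  # same mantissa, larger exponent

abs_small = abs(quantize_float(x_small, B) - x_small)
abs_large = abs(quantize_float(x_large, B) - x_large)

print("absolute errors:", abs_small, abs_large)
print("relative errors:", relative_error(x_small, B), relative_error(x_large, B))
```

The absolute error grows with the exponent, whereas the relative error depends only on the mantissa, which is the reason the definition above is stated in terms of M alone.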
It can be shown that the relative rounding error $e_{fr}$ of a floating-point number
(regardless of whether one uses 1's complement, 2's complement, or sign-magnitude
format) has the range
$$ -\Delta < e_{fr} \le \Delta, \tag{4.11} $$
for all positive and negative numbers. On the other hand, the relative truncation
error using the 1's complement and sign-magnitude conventions is given by
$$ -2\Delta < e_{fr} \le 0, \tag{4.12} $$
for all positive and negative numbers. Finally, the relative truncation error for 2's
complement convention is
$$ \begin{cases} -2\Delta < e_{fr} \le 0, & \text{for } x > 0 \\ 0 \le e_{fr} < 2\Delta, & \text{for } x < 0 \end{cases} \tag{4.13} $$
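The dependence of the relative truncation error on the sign convention can be explored with a short Python sketch. This is a minimal illustration under several assumptions that are not stated in the text: the mantissa is normalized so that 0.5 <= |M| < 1, it has k fractional bits before truncation and b afterwards, and the quantity 2^(1-b) is used as a stand-in for the bound on the magnitude of the error. The function names are hypothetical.

```python
from fractions import Fraction
import math

def relative_truncation_error(mantissa, b, twos_complement):
    """Return (Q(M) - M) / M after truncating M to b fractional bits."""
    scale = 2 ** b
    if twos_complement:
        # Dropping bits of a 2's complement pattern rounds toward minus infinity.
        quantized = Fraction(math.floor(mantissa * scale), scale)
    else:
        # Sign-magnitude and 1's complement: the magnitude is truncated.
        quantized = Fraction(math.floor(abs(mantissa) * scale), scale)
        if mantissa < 0:
            quantized = -quantized
    return (quantized - mantissa) / mantissa

K, B = 10, 4  # assumed mantissa lengths before and after truncation
# Assumed normalization: every magnitude satisfies 0.5 <= |M| < 1.
magnitudes = [Fraction(n, 2 ** K) for n in range(2 ** (K - 1), 2 ** K)]

def error_range(sign, twos_complement):
    """Min and max relative error over all mantissas of the given sign."""
    errors = [relative_truncation_error(sign * m, B, twos_complement)
              for m in magnitudes]
    return min(errors), max(errors)

cases = {
    "sign-magnitude / 1's complement, x > 0": error_range(+1, False),
    "sign-magnitude / 1's complement, x < 0": error_range(-1, False),
    "2's complement, x > 0": error_range(+1, True),
    "2's complement, x < 0": error_range(-1, True),
}
for label, (low, high) in cases.items():
    print(f"{label:40s} {float(low):+.4f} .. {float(high):+.4f}")
```

Only the 2's complement case with x < 0 produces positive relative errors; in the other three cases the error is zero or negative, since the quantized mantissa either shrinks in magnitude or, for positive values, moves toward zero.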
For more details, see Porat [3].