Arithmetic - Signal Processing Systems - page 586


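[Figure: probability density $p(e)$ of the quantization error $e$ in three panels. Rounding: height $1/Q$ for $-Q/2 \le e \le Q/2$. Truncation: height $1/Q$ for $-Q \le e \le 0$. Magnitude truncation: height $1/(2Q)$ for $-Q \le e \le Q$, with $e \le 0$ for $X > 0$ and $e \ge 0$ for $X < 0$.]

Fig. 6 Error distributions for fixed-point arithmetic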

1.6.2 Truncation

Quantizing a binary number, $X$, with infinite word length to a number, $X_Q$, with finite word length yields an error

$$e = X_Q - X. \tag{6}$$

Truncation of the binary number is performed by removing the bits with index $i > W_f$. The resulting error density distribution is shown in the center of Fig. 6. The variance is $\sigma^2 = Q^2/12$ and the mean value is $-Q/2$, where $Q = 2^{-W_f}$ refers to the weight of the last bit position.
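The truncation statistics can be checked numerically. The following is a minimal sketch, not taken from the source: it assumes inputs with a finite number of fractional bits stored as Python integers, and the names `W`, `W_F`, `DROP` and `truncate` are hypothetical choices made for illustration.

```python
from fractions import Fraction
from statistics import mean, pvariance

# Hypothetical word lengths: inputs carry W fractional bits, outputs keep W_F.
W, W_F = 12, 4
DROP = W - W_F               # number of discarded bits
Q = Fraction(1, 2**W_F)      # weight of the last retained bit


def truncate(x: int, drop: int) -> int:
    """Discard the `drop` least significant bits of a two's-complement integer.

    Python's >> on int is an arithmetic shift (floor division by 2**drop),
    so the result is never larger than x. Shifting back keeps the scale of x.
    """
    return (x >> drop) << drop


# The integer x stands for the value x / 2**W; sweep every value in [-1, 1).
scale = 2**W
errors = [Fraction(truncate(x, DROP) - x, scale) for x in range(-scale, scale)]

print("mean / Q       =", float(mean(errors) / Q))
print("variance / Q^2 =", float(pvariance(errors) / Q**2))
```

Because the input here has only `DROP` extra bits, the exact results are a mean of $-\frac{Q}{2}(1 - 2^{-\text{DROP}})$ and a variance of $\frac{Q^2}{12}(1 - 2^{-2\,\text{DROP}})$. Both approach the continuous values $-Q/2$ and $Q^2/12$ as the number of discarded bits grows.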

1.6.3 Rounding

Rounding is, in practice, performed by adding $2^{-(W_f+1)}$ to the non-quantized number before truncation. Hence, the quantized number is the nearest approximation to the original number. However, if the word length of $X$ is $W_f + 1$, the quantized number should, in principle, be rounded upwards if the last bit is 1 and downwards if it is 0, in order to make the mean error zero. This special case is often neglected in practice. The resulting error density distribution is shown to the left in Fig. 6. The variance is $\sigma^2 = Q^2/12$ and the mean value is zero.
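Rounding by adding half of the last retained bit and then truncating can be sketched as follows. This is an illustrative sketch under stated assumptions, not the source's implementation: inputs are Python integers with a finite number of fractional bits, and `round_by_truncation`, `error_stats` and the word lengths are hypothetical names and values.

```python
from fractions import Fraction
from statistics import mean, pvariance


def round_by_truncation(x: int, drop: int) -> int:
    """Add half of the last retained bit, then truncate (requires drop >= 1)."""
    half = 1 << (drop - 1)          # 2**-(W_F + 1) on the integer grid
    return ((x + half) >> drop) << drop


def error_stats(w: int, w_f: int):
    """Return (mean, variance) of e = X_Q - X over all inputs in [-1, 1)."""
    scale, drop = 2**w, w - w_f
    errors = [Fraction(round_by_truncation(x, drop) - x, scale)
              for x in range(-scale, scale)]
    return mean(errors), pvariance(errors)


W_F = 4                                     # hypothetical output word length
Q = Fraction(1, 2**W_F)
m_long, v_long = error_stats(12, W_F)       # eight extra bits
m_one, v_one = error_stats(W_F + 1, W_F)    # special case: one extra bit

print("8 extra bits: mean/Q =", float(m_long / Q), " var/Q^2 =", float(v_long / Q**2))
print("1 extra bit:  mean/Q =", float(m_one / Q), " var/Q^2 =", float(v_one / Q**2))
```

In this sketch ties are always rounded upwards, which leaves a small positive bias of half an input LSB. With many extra bits the bias is negligible, but with a single extra bit it grows to $Q/4$, which is the special case described above.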

1.6.4 Magnitude Truncation

Magnitude truncation quantizes the number so that

$$|X_Q| \le |X|. \tag{7}$$

Hence, $e \le 0$ if $X$ is positive and $e \ge 0$ if $X$ is negative. This operation can be performed by adding $2^{-(W_f+1)}$ before truncation if $X$ is negative and 0 otherwise. That is, in two's complement representation, the sign bit is added to the last position. The resulting error density distribution is shown to the right in Fig. 6. The error analysis of magnitude truncation becomes very complicated since the error and sign of the signal are correlated [31].
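The magnitude truncation property and the sign-bit shortcut can be illustrated with a short sketch. It rests on assumptions not stated in the source: numbers are Python integers in two's-complement style, `magnitude_truncate` is a hypothetical general version that truncates the absolute value, and `sign_bit_trick` reads the shortcut as applying to an input with exactly one extra bit.

```python
def magnitude_truncate(x: int, drop: int) -> int:
    """Quantize toward zero: truncate the magnitude, then restore the sign."""
    q = (abs(x) >> drop) << drop
    return q if x >= 0 else -q


def sign_bit_trick(x: int) -> int:
    """One extra bit only: add the sign bit at the last position, then truncate."""
    sign_bit = 1 if x < 0 else 0
    return ((x + sign_bit) >> 1) << 1


DROP = 8  # hypothetical number of discarded bits
for x in range(-4096, 4096):
    x_q = magnitude_truncate(x, DROP)
    e = x_q - x
    assert abs(x_q) <= abs(x)                   # the magnitude never grows
    assert (e <= 0) if x > 0 else (e >= 0)      # error sign follows the sign of x

# With a single discarded bit, the sign-bit shortcut gives the same result.
assert all(sign_bit_trick(x) == magnitude_truncate(x, 1) for x in range(-64, 64))
print("all checks passed")
```

The second assertion in the loop makes the correlation between the error and the sign of the signal explicit: the error is never positive for positive inputs and never negative for negative inputs.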
