General-Purpose DSP Processors - Signal Processing Systems

Digital Signal Processing Reference


3.1.1 Multiplier and ALU

As traditional DSP processors are used for numerically intensive tasks, a fast multiplier is an essential unit in a DSP processor. Although multipliers are also included in general-purpose microprocessors, the multipliers in fixed-point DSP processors differ in one major respect. In general-purpose fixed-point computations, the integer data type is used and multipliers operate such that integer operands result in an integer product (b-bit operands produce a b-bit product, i.e., only the least significant bits of the product are saved). In DSP processors, however, the fractional data type is exploited, which implies that the least significant bits of the product are not sufficient; the product is therefore obtained at full precision, and multiplication of b-bit operands results in a 2b-bit product (law of conservation of bits). This indicates that the integer multipliers present in standard microprocessors and microcontrollers are not well suited to signal processing. In some DSP processors, multipliers may produce narrower results to gain speed or save silicon area. The multiplier may also be pipelined, which implies latency, i.e., the product may not be available to the next instruction.
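The contrast between the integer-style and fractional-style product can be illustrated with a small sketch. It assumes a 16-bit Q15 fractional format (b = 16), which the text does not name, and all helper names are hypothetical.

```python
def to_q15(x: float) -> int:
    """Quantize a value in [-1, 1) to a 16-bit Q15 integer (assumed format)."""
    return max(-32768, min(32767, round(x * 32768)))


def full_product(a: int, b: int) -> int:
    """Fractional-style multiply: two 16-bit operands give a 32-bit (Q30) product."""
    return a * b


def low_word(p: int, bits: int = 16) -> int:
    """Integer-style result: keep only the low `bits` bits, read as two's complement."""
    v = p & ((1 << bits) - 1)
    return v - (1 << bits) if v & (1 << (bits - 1)) else v


a = to_q15(0.5)    # 16384
b = to_q15(0.25)   # 8192
p = full_product(a, b)

print(p / 2**30)    # value of the full-precision product: 0.125
print(low_word(p))  # low 16 bits only: 0, the fractional result is lost
```

In this example the meaningful bits of the fractional product sit in the upper half of the 32-bit result, so keeping only the low word discards the answer entirely.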

Multiplication is also involved in one of the most characteristic operations in DSP, multiply-accumulate (MAC), and DSP performance is often characterized in MAC/s. This measure is often practical because DSP algorithms are, in general, data-independent and, therefore, deterministic in behavior. A DSP processor may contain an additional adder to be used in the MAC operation, and these resources form a MAC unit. A processor may also contain parallel units to further boost performance in DSP applications.
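As a rough illustration of why MAC/s is a convenient measure, the sketch below counts the MAC operations in one output sample of an FIR filter. The FIR filter, the tap count, and the sample rate are assumptions chosen for the example; they are not taken from the text.

```python
def fir_output(coeffs, samples):
    """Compute one FIR output sample and count the MAC operations used."""
    acc = 0
    macs = 0
    for c, x in zip(coeffs, samples):
        acc += c * x   # one multiply-accumulate
        macs += 1
    return acc, macs


def required_mac_rate(num_taps: int, sample_rate_hz: int) -> int:
    """MAC/s needed to run the filter in real time (one output per input sample)."""
    return num_taps * sample_rate_hz


y, macs = fir_output([1, 2, 3], [4, 5, 6])
print(y, macs)                       # 32 3
print(required_mac_rate(64, 48000))  # 3072000
```

The MAC count depends only on the number of taps, not on the sample values, which is the data-independent behavior the paragraph above refers to.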

As in general-purpose processors, DSP processors contain an arithmetic-logic unit (ALU), which performs the basic operations: addition, subtraction, increment, negate, and, or, not, etc. Often the addition in the MAC operation is carried out in the ALU. The ALU in a DSP processor operates in a similar fashion as in general-purpose processors, but the arithmetic operations are carried out with extended word width operands. This is because consecutive MAC operations tend to increase the word width of the result. In order to avoid overflow in accumulation, additional guard bits are used, i.e., additional bits are used when performing the accumulation and storing the accumulation results. In general, $\log_2(N)$ additional bits are required for carrying out $N$ additions without overflow. The more guard bits, the more headroom against overflow.
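The guard-bit rule can be checked with a short sketch. It assumes 32-bit products and rounds $\log_2(N)$ up to a whole number of bits; the function names are hypothetical.

```python
def guard_bits(n_additions: int) -> int:
    """Smallest g with 2**g >= n_additions, i.e. log2(N) rounded up."""
    return (n_additions - 1).bit_length()


def fits_signed(value: int, bits: int) -> bool:
    """True if `value` is representable as a two's-complement number of `bits` bits."""
    return -(1 << (bits - 1)) <= value <= (1 << (bits - 1)) - 1


N = 256
PRODUCT_BITS = 32                       # assumed product width
worst_term = (1 << (PRODUCT_BITS - 1)) - 1
worst_sum = N * worst_term              # all N terms at the positive maximum

g = guard_bits(N)
print(g)                                         # 8
print(fits_signed(worst_sum, PRODUCT_BITS))      # False: overflows without guard bits
print(fits_signed(worst_sum, PRODUCT_BITS + g))  # True: a 40-bit accumulator suffices
```

With these assumptions, 256 worst-case accumulations need 8 guard bits, giving a 40-bit accumulator.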

In early DSP processors, the multiplier was a separate unit, which stored its result in a specific “product” register as seen in Fig. 1. Such an arrangement implies that an additional instruction is needed to perform the accumulation and that there is a latency of one instruction in the MAC operation. Similar behavior can be seen in the Freescale DSP56300 illustrated in Fig. 2.
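The one-instruction latency of a separate product register can be modeled with a toy loop. This is only a behavioral sketch under the assumption that each step accumulates the previous product while a new one is computed; it does not model any particular processor, and the names are hypothetical.

```python
def mac_with_product_register(coeffs, samples):
    """Accumulate a dot product through a separate product register.

    Returns the result and the number of steps taken.
    """
    product_reg = 0   # holds the result of the previous multiplication
    acc = 0
    steps = 0
    for c, x in zip(coeffs, samples):
        acc += product_reg     # accumulate the product from the previous step
        product_reg = c * x    # the new product becomes visible one step later
        steps += 1
    acc += product_reg         # extra step to fold in the last product
    steps += 1
    return acc, steps


result, steps = mac_with_product_register([1, 2, 3], [4, 5, 6])
print(result, steps)  # 32 4
```

In this model, three products take four steps because the final product still sits in the register when the loop ends.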

In order to increase performance in DSP applications, a processor can contain several multipliers. For example, the TI TMS320C55x family of processors contains two MAC units [39]. Each MAC unit can perform multiplication with 17-bit operands and a 40-bit addition. TI TMS320C64x processors contain parallel functional units as shown in Fig. 3. Two of the eight functional units can perform several types of multiplications: 32 × 32 multiplication, 16 × 16 multiplication, 16 × 32 multiplication, quad 8 × 8
