Digital Signal Processing Reference
to continue to grow. Since they share some basic common features such as program memory and data
memory with associated address buses, arithmetic logic units (ALUs), program control units, MACs,
shift units, and address generators, here we focus on an overview of the TMS320C54x processor. The
typical TMS320C54x fixed-point DSP architecture appears in Figure 9.13 .
The fixed-point TMS320C54x families supporting 16-bit data have on-chip program memory and
data memory in various sizes and configurations. They include data RAM (random access memory)
and program ROM (read-only memory) used for program code, instructions, and data. Four data buses
and four address buses serve the data memory and program memory. The
program memory address bus and program memory data bus are responsible for fetching program
instructions. As shown in Figure 9.13 , the C and D data memory address buses and the C and D data
memory data buses deal with fetching data from the data memory while the E data memory address bus
and E data memory data bus are dedicated to moving data into data memory. In addition, the E memory
data bus can access the I/O devices.
Computational units consist of an ALU, a MAC, and a shift unit. For the TMS320C54x family, the
ALU can fetch data from the C, D, and program memory data buses and access the E memory data bus.
It has two independent 40-bit accumulators, which can perform 40-bit additions. The multiplier,
which can fetch data from the C and D memory data buses and write data via the E data memory data
bus, is capable of performing 17-bit × 17-bit multiplications. The 40-bit shifter has the same capability
of bus access as the MAC, allowing all possible shifts for scaling and fractional arithmetic such as
those we have discussed for the Q-format.
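The interplay of the multiplier, the 40-bit accumulator, and the shifter can be illustrated with a short C sketch. It is not TMS320C54x code and does not model the device's overflow modes: the function names (sat40, mac_q15, acc_to_q15, dot_q15), the use of Q15 operands, and the saturation and rounding choices are all assumptions made for illustration. Each Q15 × Q15 product is a Q30 value of at most 31 bits, so a 40-bit accumulator leaves guard bits that let many products be summed before the result is shifted back to 16 bits.

```c
#include <stdint.h>

/* Illustrative 40-bit accumulator limits (assumed, not a device model). */
#define ACC_MAX ((int64_t)0x7FFFFFFFFFLL)   /* largest 40-bit signed value  */
#define ACC_MIN (-ACC_MAX - 1)              /* smallest 40-bit signed value */

/* Clamp a 64-bit value to the 40-bit accumulator range. */
static int64_t sat40(int64_t v)
{
    if (v > ACC_MAX) return ACC_MAX;
    if (v < ACC_MIN) return ACC_MIN;
    return v;
}

/* Multiply-accumulate: acc += a * b, where a and b are Q15.
   The product is in Q30 and always fits in 32 bits. */
static int64_t mac_q15(int64_t acc, int16_t a, int16_t b)
{
    int32_t prod = (int32_t)a * (int32_t)b;
    return sat40(acc + prod);
}

/* Shifter step: scale the Q30 accumulator back to Q15 with rounding,
   then saturate to the 16-bit word size.
   Assumes >> on a negative value is an arithmetic shift, which common
   compilers provide but the C standard leaves implementation-defined. */
static int16_t acc_to_q15(int64_t acc)
{
    int64_t r = (acc + (1 << 14)) >> 15;
    if (r > INT16_MAX) r = INT16_MAX;
    if (r < INT16_MIN) r = INT16_MIN;
    return (int16_t)r;
}

/* Dot product of two Q15 vectors, as used in an FIR filter tap sum. */
int16_t dot_q15(const int16_t *x, const int16_t *h, int n)
{
    int64_t acc = 0;
    for (int i = 0; i < n; i++)
        acc = mac_q15(acc, x[i], h[i]);
    return acc_to_q15(acc);
}
```

For example, with x = {16384, 16384} (0.5, 0.5) and h = {16384, -8192} (0.5, -0.25), dot_q15 returns 4096, which is 0.125 in Q15.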
The program control unit fetches instructions via the program memory data bus. Again, in order to
speed up memory access, there are two address generators available: one responsible for program
addresses and one for data addresses.
An advanced Harvard architecture is employed, in which several instructions can operate at the same time
within a single instruction cycle. Processing performance reaches 40 MIPS (million instructions
per second). To further explore this subject, the reader is referred to Dahnoun (2000), Embree
(1995), Ifeachor and Jervis (2002), and Van der Vegte (2002), as well as the TI website ( www.ti.com ) .
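To put the 40 MIPS figure in perspective, the following C sketch converts a MIPS rating into an instruction cycle time and into the number of instructions available between two input samples. The function names and the 8 kHz sampling rate used in the example are assumptions for illustration, and the calculation assumes one instruction completes per cycle.

```c
/* Instruction cycle time in nanoseconds for a given MIPS rating,
   assuming one instruction per cycle. */
double cycle_time_ns(double mips)
{
    return 1000.0 / mips;
}

/* Number of instructions that can execute between two consecutive
   samples at sampling rate fs_hz (fractional part discarded). */
long instructions_per_sample(double mips, double fs_hz)
{
    return (long)(mips * 1.0e6 / fs_hz);
}
```

At 40 MIPS the cycle time is 25 ns, and at an assumed sampling rate of 8 kHz there are 5,000 instructions available per sample for real-time processing.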
9.4.6 Floating-Point Processors
Floating-point DS processors perform DSP operations using floating-point arithmetic, as we discussed
before. The advantages of using a floating-point processor include greatly reducing finite word length
effects such as overflow, round-off errors, truncation errors, and coefficient quantization errors. Hence,
in terms of coding, we do not need to scale input samples to avoid overflow, shift the accumulator
result to fit the DAC word size, scale the filter coefficients, or apply Q-format arithmetic. A
floating-point DS processor with high speed and calculation precision provides a friendly environment for
developing and implementing DSP algorithms.
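The coding convenience described above can be seen in a minimal C sketch of one FIR filter output computed in floating point. The function name fir_float and the convention that x_hist[0] holds the newest input sample are assumptions made for illustration. Note that no input scaling, coefficient scaling, or accumulator shifting appears in the loop; single-precision rounding still occurs, but it is usually far smaller than 16-bit fixed-point quantization.

```c
#include <stddef.h>

/* One FIR output sample: y = sum over k of h[k] * x_hist[k],
   where x_hist[0] is the newest input and x_hist[k] is the input
   delayed by k samples. No Q-format scaling or shifting is needed. */
float fir_float(const float *h, const float *x_hist, size_t taps)
{
    float acc = 0.0f;
    for (size_t k = 0; k < taps; k++)
        acc += h[k] * x_hist[k];
    return acc;
}
```

For example, with h = {0.5, 0.25, 0.25} and x_hist = {2.0, 4.0, 8.0}, the output is 4.0.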
Analog Devices provides floating-point DSP families such as ADSP210xx and TigerSHARC.
Texas Instruments offers a wide range of floating-point DSP families, in which the TMS320C3x is
the first generation, followed by the TMS320C4x and TMS320C67x families. Since the first
generation of a floating-point DS processor is less complicated than later generations but still has the
common basic features, we review the first-generation architecture first.
Figure 9.14 shows the typical architecture of Texas Instruments' TMS320C3x family of processors.
We discuss some key features briefly. Further detail can be found in the TMS320C3x User's Guide