Digital Signal Processing Reference
representation of the algorithm in any of these formats gives convenient visualization of algorithms
for architecture mapping.
For blocks that need to process data at a very high data rate, the fastest achievable clock frequency
is almost the same as the number of samples the block needs to process every second. For example,
a typical BPSK receiver needs four samples per symbol to decode bits; to process 36.5 Mbps of
data at one bit per symbol, the receiver must process 146 million samples every second. The block implementing the front
end of such a digital receiver is tasked to process even higher numbers of samples per symbol. In
these applications the designer aims to achieve a clock frequency that matches the sampling
frequency. This matching eases the digital design of the system, because the entire algorithm then
requires one-to-one mapping of algorithmic operations to HW operators. This class of architecture
is called 'fully dedicated architecture' (FDA). After mapping each operation to a HW
operator, the designer may need to place pipeline registers appropriately to make the clock frequency equal to the
sampling frequency.
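The arithmetic behind the BPSK example can be checked with a short sketch. The function name and
parameters below are illustrative assumptions, not part of the original text; only the figures
(36.5 Mbps, four samples per symbol) come from the example above.

```python
def required_sample_rate(bit_rate_bps, bits_per_symbol, samples_per_symbol):
    """Samples per second a block must process (hypothetical helper)."""
    symbol_rate = bit_rate_bps / bits_per_symbol
    return symbol_rate * samples_per_symbol

# BPSK carries one bit per symbol; the example uses four samples per symbol.
fs = required_sample_rate(36.5e6, bits_per_symbol=1, samples_per_symbol=4)

# In an FDA the clock is matched to the sampling frequency, so the clock
# period is simply the reciprocal of the sample rate.
clock_period_ns = 1e9 / fs

print(f"sample rate: {fs / 1e6:.1f} Msps")
print(f"FDA clock period: {clock_period_ns:.2f} ns")
```

Under these assumptions every register-to-register path in the design has a budget of a little
under 7 ns.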
The chapter describes this one-to-one mapping, along with techniques and transformations for adding one
or multiple stages of pipelining for better timing. The scope of the design is limited to synchronous
systems where all changes to the design are controlled by either a global clock or multiple clocks.
The focus is to design digital logic at register transfer level (RTL). The chapter highlights that the
design at RTL should be visualized as a mix of combinational clouds and registers, and the developer
then optimizes the combinational clouds by using faster computational units to make the HW run at
the desired clock while constraining it to fit within a budgeted silicon area. With feedforward
algorithms the designer has the option to add pipeline registers in slower paths, whereas in feedback
designs the registers can be added only after applying certain mathematical transformations.
Although this chapter mentions pipelining, a detailed treatment of pipelining and retiming is given
in Chapter 7.
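The effect of adding a pipeline register to a slow feedforward path can be illustrated with a small
model. This is only a sketch under stated assumptions: the operator delays are hypothetical values
in nanoseconds, the path is a single chain of operators, and register setup and clock-to-output
overheads are ignored.

```python
def critical_path_ns(op_delays_ns, register_after=()):
    """Longest register-to-register combinational delay in one feedforward chain.

    op_delays_ns   : delays of the operators in series (hypothetical values)
    register_after : indices of operators that are followed by a pipeline register
    """
    cuts = set(register_after)
    longest = 0.0
    running = 0.0
    for i, delay in enumerate(op_delays_ns):
        running += delay
        if i in cuts:
            # A register ends the current combinational cloud.
            longest = max(longest, running)
            running = 0.0
    return max(longest, running)

def max_clock_mhz(path_ns):
    """Fastest clock the path allows, ignoring register overheads."""
    return 1e3 / path_ns

delays = [3.0, 3.5, 2.5, 4.0]  # hypothetical operator delays in ns

unpipelined = critical_path_ns(delays)                    # one 13.0 ns cloud
pipelined = critical_path_ns(delays, register_after=[1])  # two 6.5 ns clouds

print(f"no pipeline register : {max_clock_mhz(unpipelined):.1f} MHz")
print(f"one pipeline register: {max_clock_mhz(pipelined):.1f} MHz")
```

With these assumed delays, a single register placed mid-chain halves the critical path, which is
enough to exceed a 146 MHz sampling frequency. The model applies to feedforward paths only; it says
nothing about feedback loops, where registers cannot simply be inserted.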
4.2 Discrete Real-time Systems
A discrete real-time system is constrained by the sampling rate of the input signal acquired from the
real world and the amount of processing the system needs to perform on the acquired data in real time
to produce output samples at a specified rate. In a digital communication receiver, the real-time input
signal may be modulated voice, data or video and the output is the respective demodulated signal.
The analog signal is converted to a discrete time signal using an analog-to-digital (A/D) converter.
In many designs this real-time discrete signal is processed in fixed-size chunks. The time it takes to
acquire a chunk of data and the time required to process this chunk pose a hard constraint on the
design. The design must be fast enough to complete its processing before the next block of data is
ready to be processed. The size of the block is also important in many applications as it
causes an inherent delay. A large block size increases the delay and memory requirements, whereas
a smaller block increases the block-related processing overhead. In many applications the minimum
block size is constrained by the selected algorithm.
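The block-size trade-off can be made concrete with a minimal timing budget. The cost model below
(a fixed per-block overhead plus a per-sample cost) and all numeric values are assumptions chosen
for illustration; they do not come from the text.

```python
def block_budget(block_size, sample_rate_hz, per_sample_s, per_block_overhead_s):
    """Return (acquire_s, process_s, meets_deadline) for one block.

    acquire_s : time to collect one block, which is also the inherent delay
    process_s : assumed cost model, fixed overhead plus per-sample work
    """
    acquire_s = block_size / sample_rate_hz
    process_s = per_block_overhead_s + block_size * per_sample_s
    return acquire_s, process_s, process_s <= acquire_s

# Hypothetical system: 8 kHz sampling, 50 us of work per sample,
# 1 ms of block-related overhead.
for n in (8, 160, 1024):
    acquire, process, ok = block_budget(n, 8000, 50e-6, 1e-3)
    print(f"N={n:5d}  acquire={acquire * 1e3:7.2f} ms  "
          f"process={process * 1e3:7.2f} ms  real-time={ok}")
```

In this sketch the smallest block misses its deadline because the fixed overhead dominates, while
the largest block meets it comfortably at the price of a longer inherent delay and more buffer
memory.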
In a communication transmitter, a real-time signal - usually voice or video - is digitized and then
processed by general-purpose processors (GPPs), ASICs or FPGAs, or any combination of these.
This processed discrete signal is converted back to an analog signal and transmitted on a wired or
wireless medium.
A signal processing system may be designed as single-rate or multiple-rate. In a single-rate
system, the number of samples per second at the input and output of the system is the same, and the
number of samples per second does not change when the samples move from one block to another for
processing. Communication systems are multi-rate systems: data is processed at different rates in
different blocks. For each block the number of samples per second is specified. Depending on