Digital Signal Processing Reference
gate arrays (FPGAs) that are rich in flip-flops. If the objective is to conserve the number of flip-flops
of a state register, a binary-coded state machine should be used.
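The trade-off in state-register size can be made concrete with a little arithmetic. The sketch below is illustrative only: it assumes the alternative to binary coding is one-hot coding (one flip-flop per state), and the function name state_register_bits is hypothetical.

```python
def state_register_bits(num_states: int) -> dict:
    """Flip-flops needed for the state register under two encodings.

    Assumption: the encoding being compared against binary is one-hot,
    which uses one flip-flop per state.
    """
    if num_states < 1:
        raise ValueError("num_states must be positive")
    # Binary coding needs ceil(log2(num_states)) bits, with a minimum of 1.
    binary = max(1, (num_states - 1).bit_length())
    one_hot = num_states
    return {"binary": binary, "one_hot": one_hot}


if __name__ == "__main__":
    for n in (4, 5, 16, 20):
        print(n, state_register_bits(n))
```

Under these assumptions a 20-state machine needs 5 flip-flops when binary coded and 20 when one-hot coded, which is why binary coding is preferred when flip-flops are scarce.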
The chapter gives special emphasis to RTL coding guidelines for state machine design and lists
RTL Verilog code for examples. The chapter then focuses on digital design for complex signal
processing applications that need a finite state machine to generate control signals for the datapath
and have algorithm-like functionality. The conventional bubble representation of a state machine is
described, but it is argued that this is not flexible enough for describing complex behavior in many
designs. Complex algorithms require gradual refinement and the bubble diagram representation is
not appropriate. The bubble diagram is also not algorithm-like, whereas in many instances the digital
design methodology requires a representation that is better suited for an algorithm-like structure.
The algorithmic state machine (ASM) notation is explained. This is a flowchart-like graphical
notation to describe the cycle-by-cycle behavior of an algorithm. To demonstrate the differences, the
chapter represents in ASM notation the same examples that are described using a bubble diagram.
The methodology is illustrated by an example of a first-in first-out (FIFO).
9.2 Examples of Time-shared Architecture Design
To demonstrate the need for a scheduler or a controller in time-shared architectures, this section
first describes some simple applications that require mapping on time-shared HW resources. These
applications are represented by simple dataflow graphs (DFGs) and their mappings on time-shared
HW require simple schedulers. For complex applications, finding optimal hardware and its
associated scheduler is an NP-complete problem, meaning that no known algorithm is guaranteed to
find an optimal solution in polynomial time. Smaller problems can be solved optimally using
integer programming (IP) techniques [1], but for larger problems near-optimal solutions are
generated using heuristics [2].
9.2.1 Bit-serial and Digit-serial Architectures
Bit-serial architecture works on a bit-by-bit basis [3, 4]. This is of special interest where the data is
input to the system on a bit-by-bit basis over a serial interface. The interface gives the designer
motivation to design a bit-serial architecture. Bit-by-bit processing of data, serially received on
a serial interface, minimizes area and in many cases also reduces the complexity of the design [5],
as in this case the arrangement of bits in the form of words is not required for processing.
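As a rough illustration of bit-by-bit processing, the following Python sketch models a bit-serial adder in software: one full adder plus a single carry flip-flop, consuming one bit of each operand per clock cycle, least significant bit first. It is a behavioral model under those assumptions, not the design from [5], and the helper names are hypothetical.

```python
def to_lsb_first(value: int, width: int) -> list:
    """Serialize a non-negative integer into `width` bits, LSB first."""
    return [(value >> i) & 1 for i in range(width)]


def from_lsb_first(bits: list) -> int:
    """Reassemble an integer from an LSB-first bit list."""
    return sum(bit << i for i, bit in enumerate(bits))


def bit_serial_add(a_bits: list, b_bits: list):
    """Add two equal-length LSB-first bit streams, one bit per 'cycle'.

    The loop body models one full adder; `carry` models the single
    flip-flop that holds the carry between cycles.
    """
    if len(a_bits) != len(b_bits):
        raise ValueError("bit streams must have equal length")
    carry = 0
    sum_bits = []
    for a, b in zip(a_bits, b_bits):
        sum_bits.append(a ^ b ^ carry)
        carry = (a & b) | (carry & (a ^ b))
    return sum_bits, carry


if __name__ == "__main__":
    s, c = bit_serial_add(to_lsb_first(11, 4), to_lsb_first(6, 4))
    print(from_lsb_first(s) + (c << 4))
```

The operands are never assembled into words inside the adder: each bit is consumed as it arrives, which is the property the paragraph above refers to.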
An extension to bit-serial is a digit-serial architecture where the architecture divides an N-bit
operand into P = N/M-bit digits that are serially fed, and the entire datapath is P-bit wide, where
P should be an integer [6-8]. The choice of P depends on the throughput requirement on the
architecture, and it could be 1 to N bits wide.
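The effect of the digit width on throughput can be sketched with a small software model. The example below assumes that M in P = N/M denotes the number of digits, so an N-bit addition takes N/P cycles on a P-bit datapath. The function names are hypothetical and the model is behavioral only.

```python
def split_into_digits(value: int, n_bits: int, p_bits: int) -> list:
    """Split an n_bits-wide operand into P-bit digits, least significant first."""
    if p_bits < 1 or n_bits % p_bits != 0:
        raise ValueError("n_bits must be a positive multiple of p_bits")
    mask = (1 << p_bits) - 1
    return [(value >> (i * p_bits)) & mask for i in range(n_bits // p_bits)]


def digit_serial_add(a: int, b: int, n_bits: int, p_bits: int):
    """Add two n_bits operands one P-bit digit per cycle.

    Returns (n_bits-wide result, carry out, cycles used).
    """
    mask = (1 << p_bits) - 1
    carry = 0
    result = 0
    cycles = 0
    digits_a = split_into_digits(a, n_bits, p_bits)
    digits_b = split_into_digits(b, n_bits, p_bits)
    for i, (da, db) in enumerate(zip(digits_a, digits_b)):
        total = da + db + carry          # P-bit adder with carry in
        result |= (total & mask) << (i * p_bits)
        carry = total >> p_bits          # carry register between cycles
        cycles += 1
    return result, carry, cycles


if __name__ == "__main__":
    for p in (1, 4, 16):
        print(p, digit_serial_add(0xBEEF, 0x1234, 16, p))
```

With P = 1 the model degenerates to the bit-serial case (16 cycles for a 16-bit operand), and with P = N it becomes a fully parallel adder (1 cycle), matching the statement that P can range from 1 to N bits.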
Note that, as a consequence of the increase in device densities, area is usually
not a very stringent constraint, so the designer should not unnecessarily get into the complications
of bit-serial designs. Only designs that naturally suit bit-serial processing should be mapped on these
architectures. A good example of a bit-serial design is given in [5].
Example: Figure 9.1 shows the design of a fully dedicated architecture (FDA), where we assume the
sampling clock equals the circuit clock. The design is pipelined to increase the throughput of the
architecture. A node-merging optimization technique is discussed in Chapter 5. The technique
suggests the use of carry-save adders (CSAs) and compression trees to minimize the use of
carry-propagate adders (CPAs).
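The idea behind using CSAs to avoid CPAs can be sketched in a few lines of Python. This is a generic illustration of 3:2 carry-save compression on non-negative integers, not the specific node-merging procedure of Chapter 5. The function names are hypothetical.

```python
def carry_save_add(x: int, y: int, z: int):
    """3:2 compressor: reduce three operands to a sum and a carry vector.

    Each bit position is handled independently, so no carry ripples
    across the word. Invariant: x + y + z == sum_vec + carry_vec.
    """
    sum_vec = x ^ y ^ z
    carry_vec = ((x & y) | (x & z) | (y & z)) << 1
    return sum_vec, carry_vec


def sum_with_compression(operands) -> int:
    """Sum non-negative integers using CSAs and one final CPA."""
    ops = list(operands)
    # Repeatedly compress three operands into two until only two remain.
    while len(ops) > 2:
        s, c = carry_save_add(ops[0], ops[1], ops[2])
        ops = ops[3:] + [s, c]
    # The only carry-propagate addition in the whole computation.
    return sum(ops)


if __name__ == "__main__":
    print(sum_with_compression([3, 5, 7, 9, 11]))
```

Summing five operands this way uses three carry-save stages and a single carry-propagate addition, instead of four carry-propagate additions in a naive chain.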
Now assume that for the DFG in Figure 9.1(a) the sampling clock frequency f_s is eight times
slower than the circuit clock frequency f_c. This ratio implies the HW can be designed such that it