Digital Signal Processing Reference
In-Depth Information
all paths from input to output intersect the line once. Figure 7.7(b) places pipeline registers along the
cut-set line. The following code implements this design:
// Module to implement a 4-bit 2-stage pipeline RCA
module pipeline_adder
(
  input            clk,
  input      [3:0] a, b,
  input            cin,
  output reg [3:0] sum_p,
  output reg       cout_p);
  // Pipeline registers
  reg [3:2] a_preg, b_preg;   // two MSBs of a and b, delayed by one cycle
  reg [1:0] s_preg;           // two LSBs of the sum from stage 1
  reg       c2_preg;          // carry out of stage 1
  // Internal signals (declared reg because they are assigned in an always block)
  reg [3:0] s;
  reg       c2;
  // Combinational cloud
  always @*
  begin
    // Combinational cloud 1: add the two LSBs of the current inputs
    {c2, s[1:0]} = a[1:0] + b[1:0] + cin;
    // Combinational cloud 2: add the registered MSBs and the registered carry
    {cout_p, s[3:2]} = a_preg + b_preg + c2_preg;
    // Put the output together
    sum_p = {s[3:2], s_preg};
  end
  // Sequential circuit: pipeline registers
  always @(posedge clk)
  begin
    s_preg  <= s[1:0];
    a_preg  <= a[3:2];
    b_preg  <= b[3:2];
    c2_preg <= c2;
  end
endmodule
The implementation adds two 4-bit operands a and b. The design has one set of pipeline registers
that divides the combinational cloud into two equal stages. In the first stage, the two LSBs of a and b
are added with cin. The 2-bit sum is stored in pipeline register s_preg, and the carry out of this
addition is saved in pipeline register c2_preg. The two MSBs of a and b are registered unchanged in
the 2-bit pipeline registers a_preg and b_preg. In the same cycle, the second stage of the
combinational cloud adds the already registered two MSBs of the previous inputs, held in a_preg and
b_preg, with the carry out from the previous cycle held in c2_preg. This 2-stage pipeline RCA has
two full adders in the critical path, and for a given set of inputs the corresponding sum and carry
out are available after one clock cycle; that is, after a latency of one cycle.
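The one-cycle latency described above can be checked in simulation. The following is a minimal self-checking testbench sketch, not part of the original text. It assumes it is compiled together with the pipeline_adder module listed earlier; the names tb_pipeline_adder, exp_q and errors, and the 10 ns clock period, are illustrative assumptions. Inputs are applied on the falling clock edge, and the outputs are compared shortly after the next rising edge against the sum of the inputs applied before that edge.

```verilog
// Self-checking testbench sketch for pipeline_adder (assumed to be compiled
// alongside this file). All testbench names here are hypothetical.
`timescale 1ns/1ps
module tb_pipeline_adder;
  reg        clk = 1'b0;
  reg  [3:0] a, b;
  reg        cin;
  wire [3:0] sum_p;
  wire       cout_p;

  reg  [4:0] exp_q;      // expected {cout, sum} for the inputs just applied
  integer    i, errors;

  pipeline_adder dut (
    .clk(clk), .a(a), .b(b), .cin(cin),
    .sum_p(sum_p), .cout_p(cout_p));

  // Free-running clock, 10 ns period (arbitrary choice)
  always #5 clk = ~clk;

  initial begin
    errors = 0;
    // Exhaustively apply all 2^9 combinations of {a, b, cin}
    for (i = 0; i < 512; i = i + 1) begin
      @(negedge clk);
      {a, b, cin} = i;          // low 9 bits of i drive the inputs
      exp_q = a + b + cin;      // 5-bit reference result, carry included
      @(posedge clk);           // pipeline registers capture stage-1 results
      #1;                       // let the stage-2 logic settle
      // One cycle after the inputs were applied, the outputs should match
      if ({cout_p, sum_p} !== exp_q) begin
        errors = errors + 1;
        $display("FAIL: a=%0d b=%0d cin=%0d got=%0d expected=%0d",
                 a, b, cin, {cout_p, sum_p}, exp_q);
      end
    end
    if (errors == 0)
      $display("PASS: all 512 input combinations matched after one cycle");
    else
      $display("FAIL: %0d mismatches", errors);
    $finish;
  end
endmodule
```

With a Verilog simulator such as Icarus Verilog, and assuming the two modules are saved under the hypothetical file names pipeline_adder.v and tb_pipeline_adder.v, the sketch could be run with `iverilog -o tb pipeline_adder.v tb_pipeline_adder.v` followed by `vvp tb`. Because a new input set is applied every cycle, the test also exercises the throughput of one addition per clock, not only the latency.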
Example: Let us extend the previous example. Consider that we need to reduce the critical path to
one full adder delay. This requires adding a register after the first, second and third FAs. We need to