Digital Signal Processing Reference
In-Depth Information
addresses, loops end-addresses and loops counters. The design of the machine is given in
Figure 10.20.
All the three LIFOs in the loopmachinework in lock step. The counter, when loaded with the loop
count of the current loop instruction, decrements its value after executing the end instruction of the
loop. The comparator sets the flag when the end of the loop is reached. Now based on the current
value of the counter the two flags are assigned values. The loop - end - flag is set if the value of the
counter is 0 and the machine is executing the last instruction of the loop. Similarly the loop - end -
instr - flag is set if the processor is executing the end instruction but the counter value is still not 0.
This signals the next-address logic to load the loop start-address as the next address to be executed.
The state machine branches back to execute the loop again.
10.7 Examples
10.7.1 Design for Motion Estimation
Motion estimation is the most computationally intensive component of video compression algo-
rithms and so is a good candidate for hardware mapping [4-6]. Although there are techniques for
searching the target block in a pre-stored reference frame, a full-search algorithm gives the best
result. This performs an exhaustive search for a target macro block in the entire reference frame to
find the best match. The architecture designed for motion estimation using a full motion search
algorithm can also be used for template matching in a machine vision application.
The algorithm works by dividing a video frame of size N h
M.
The motion estimation (ME) algorithm takes a target macro block in the current frame and searchers
for its closest match in the previous reference frame. A full-searchME algorithm searches the macro
block in the current framewith themacro block taking all candidate pixels of the previous frame. The
algorithm computes the sum of absolute differences (SAD) as the matching criterion by implement-
ing the expression:
N v into macro blocks of size N
N 1
N 1
SAD ð i ; j Þ¼
0 S ð k þ i ; l þ j Þ R ð k ; l Þ
j
j
k ¼
0
l ¼
In this expression, R is the reference block in the current frame and S is the search area, which for
the full-search algorithm is the entire previous frame while ignoring the boarder pixels. The ME
algorithmcomputes theminimumvalue of SAD and stores the (i, j) as the point where the best match
is found. The algorithm requires four summations over i, j, k and l.
Figure 10.21 shows an example. This design consists of a data movement block, two 2-D register
files for storing the reference and the target blocks, and a controller that embeds a loop machine. The
controller synchronizes the working of all blocks. The 2-D register file for the target block is
extended in rows and columns making its size equal to (N
þ
1)
(N
þ
1)
1 registers, whereas the
size of the register file for the reference block is N N.
A full-search algorithm executes by rows and, after completing calculation for the current macro
block, the new target block is moved one pixel in a row. While working in a row, for each iteration of
the algorithm, N
1 previous columns are reused. The data movement block brings the new column
to the extended - register - column in the ref - blk - reg - file . The search algorithm works
in a daisy-chain and, after completing one row of operation, the data movement block starts in the
opposite direction and fetches a new row in the extended - register - row in the register file.
The daisy-chainworking of the algorithm for maximum reuse of data in a target block is illustrated in
Figure 10.22.
Search WWH ::




Custom Search