Image Processing Reference
In-Depth Information
A_4 = x_3 - x_4,  A_5 = x_2 - x_5,  A_6 = x_1 - x_6,  A_7 = x_0 - x_7
B_4 = -A_4 - A_5,  B_5 = A_5 + A_6,  B_6 = A_6 + A_7,  B_7 = A_7
C_4 = B_4,  C_5 = B_5,  C_6 = B_6,  C_7 = B_7
D_4 = (S_2 - S_6) C_4,  D_5 = S_4 C_5,  D_6 = (S_2 + S_6) C_6,  D_7 = C_7,  D_64 = S_6 (C_6 + C_4)
E_4 = -D_64 - D_4,  E_5 = D_5,  E_6 = D_6 - D_64,  E_7 = D_7
F_4 = E_4,  F_5 = E_5 + E_7,  F_6 = E_6,  F_7 = E_7 - E_5
S_1 X_1 = F_5 + F_6,  S_3 X_3 = F_7 - F_4,  S_5 X_5 = F_7 + F_4,  S_7 X_7 = F_5 - F_6
Fig. 5. The AAN algorithm limited to indices 4-7 only, drawn as a time-oriented structure.
Adders, subtractors, multipliers and shift registers are marked in blue, gray, black and
green, respectively. Red marks routines that require cascaded operations.
A direct implementation of the pure AAN algorithm requires 7 pipeline stages, which consume
additional shift-register resources to synchronize pass-through operations such as
X(t+1) = X(t). In a numerical calculation on a processor, the data simply wait for the next
execution cycle. The D_64 block contains a cascade of a sum and a multiplication. Implementing
this cascade in a single-clock FPGA logic block significantly reduces the speed. Additionally,
the lpm_add_sub mega-function from the Altera ® library of parameterized modules (LPM)
does not support the inversion of a sum, i.e. B_4 = -(A_4 + A_5) or E_4 = -(D_64 + D_4).
These operations would have to be performed as a cascade of an adder and a sign
inversion. Cascaded operations performed in the same clock cycle significantly slow down
the overall registered performance.
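The seven register stages of Fig. 5 can be modelled in software. The Python below is a minimal sketch for illustration only, not the FPGA design. It assumes that the multiplier constants are S_k = cos(k*pi/16), the usual AAN values, which this section does not state. The names stage_a ... stage_out and run_pipeline are hypothetical. Each function stands for one clocked column of the figure, and pass-through nodes such as C_4 = B_4 act as registers that only delay the data.

```python
import math

# Assumption: S_k = cos(k*pi/16), the usual AAN multiplier constants.
S2, S4, S6 = (math.cos(k * math.pi / 16) for k in (2, 4, 6))

# One function per register stage of Fig. 5. Each takes the values latched
# by the previous stage and returns the values latched on the next clock.
def stage_a(x):
    return {4: x[3] - x[4], 5: x[2] - x[5], 6: x[1] - x[6], 7: x[0] - x[7]}

def stage_b(a):
    # B_4 is an inverted sum: adder followed by a sign inversion
    return {4: -a[4] - a[5], 5: a[5] + a[6], 6: a[6] + a[7], 7: a[7]}

def stage_c(b):
    return dict(b)  # pure delay: C_k = B_k

def stage_d(c):
    return {4: (S2 - S6) * c[4], 5: S4 * c[5], 6: (S2 + S6) * c[6], 7: c[7],
            64: S6 * (c[6] + c[4])}  # sum and multiplication in one clock

def stage_e(d):
    return {4: -d[64] - d[4], 5: d[5], 6: d[6] - d[64], 7: d[7]}

def stage_f(e):
    return {4: e[4], 5: e[5] + e[7], 6: e[6], 7: e[7] - e[5]}

def stage_out(f):
    # Scaled outputs S_k X_k for k = 1, 3, 5, 7
    return {1: f[5] + f[6], 3: f[7] - f[4], 5: f[7] + f[4], 7: f[5] - f[6]}

STAGES = [stage_a, stage_b, stage_c, stage_d, stage_e, stage_f, stage_out]

def run_pipeline(samples):
    """Clock the pipeline once per input vector, then flush it.

    Returns a list of (clock, outputs) pairs, one per valid result.
    """
    regs = [None] * len(STAGES)  # regs[i] holds the output of STAGES[i]
    results = []
    feed = list(samples) + [None] * len(STAGES)
    for clock, x in enumerate(feed, start=1):
        # Update from the last stage to the first, so that every stage sees
        # the value its predecessor held before this clock edge.
        for i in range(len(STAGES) - 1, 0, -1):
            prev = regs[i - 1]
            regs[i] = STAGES[i](prev) if prev is not None else None
        regs[0] = STAGES[0](x) if x is not None else None
        if regs[-1] is not None:
            results.append((clock, regs[-1]))
    return results
```

With a single input vector, the first valid result appears on clock 7, and consecutive vectors then produce one result per clock.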
A_4 = x_3 - x_4,  A_5 = x_2 - x_5,  A_6 = x_1 - x_6,  A_7 = x_0 - x_7
B_4 = A_4 + A_5,  B_5 = A_5 + A_6,  B_6 = A_6 + A_7,  B_7 = A_7
C_4 = B_4,  C_5 = B_5,  C_6 = B_6,  C_7 = B_7,  C_64 = B_6 - B_4
D_4 = (S_2 - S_6) C_4,  D_5 = S_4 C_5,  D_6 = (S_2 + S_6) C_6,  D_7 = C_7,  D_64 = S_6 C_64
E_4 = D_4 - D_64,  E_5 = D_7 + D_5,  E_6 = D_6 - D_64,  E_7 = D_7 - D_5
S_1 X_1 = E_5 + E_6,  S_3 X_3 = E_7 - E_4,  S_5 X_5 = E_7 + E_4,  S_7 X_7 = E_5 - E_6
Fig. 6. Optimized AAN algorithm for indices 4-7. A redefinition and splitting of variables
allowed a reduction of the chain length.
A simple redefinition of nodes removes the difficulties mentioned above. The B_4 node, now
defined as the sum of the A_4 and A_5 nodes, requires only a plain lpm_add_sub mega-function.
The D_4 node, whose sign is now inverted, allows lpm_add_sub to be used in E_4 as a
subtraction. The D_64 node from Fig. 5 can be split into the subtraction C_64 and the
multiplication D_64 in the next clock cycle (Fig. 6).
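A quick numerical check shows that the redefinition leaves the outputs unchanged. The following Python is a minimal sketch that transcribes the node equations of Fig. 5 and Fig. 6 and compares them. The function names are hypothetical, the pass-through nodes C_k = B_k are folded away for brevity, and S_k = cos(k*pi/16) is an assumed choice of constants (the equality holds for any values of S_2, S_4 and S_6).

```python
import math

# Assumption: S_k = cos(k*pi/16); any other constants would also compare equal.
S2, S4, S6 = (math.cos(k * math.pi / 16) for k in (2, 4, 6))

def odd_part_fig5(x):
    """Node equations of Fig. 5 (direct form), indices 4-7."""
    a4, a5, a6, a7 = x[3] - x[4], x[2] - x[5], x[1] - x[6], x[0] - x[7]
    b4, b5, b6, b7 = -a4 - a5, a5 + a6, a6 + a7, a7   # B_4: inverted sum
    d64 = S6 * (b6 + b4)                               # sum + multiply cascade
    d4, d5, d6, d7 = (S2 - S6) * b4, S4 * b5, (S2 + S6) * b6, b7
    e4, e5, e6, e7 = -d64 - d4, d5, d6 - d64, d7       # E_4: inverted sum
    f4, f5, f6, f7 = e4, e5 + e7, e6, e7 - e5
    return {1: f5 + f6, 3: f7 - f4, 5: f7 + f4, 7: f5 - f6}

def odd_part_fig6(x):
    """Node equations of Fig. 6 (optimized form), indices 4-7."""
    a4, a5, a6, a7 = x[3] - x[4], x[2] - x[5], x[1] - x[6], x[0] - x[7]
    b4, b5, b6, b7 = a4 + a5, a5 + a6, a6 + a7, a7     # B_4: plain adder
    c64 = b6 - b4                                      # subtraction only
    d64 = S6 * c64                                     # multiplication only
    d4, d5, d6, d7 = (S2 - S6) * b4, S4 * b5, (S2 + S6) * b6, b7
    e4, e5, e6, e7 = d4 - d64, d7 + d5, d6 - d64, d7 - d5
    return {1: e5 + e6, 3: e7 - e4, 5: e7 + e4, 7: e5 - e6}

def max_difference(x):
    """Largest absolute difference between the two forms for input x."""
    r5, r6 = odd_part_fig5(x), odd_part_fig6(x)
    return max(abs(r5[k] - r6[k]) for k in (1, 3, 5, 7))
```

In this form every node of Fig. 6 is a single addition, subtraction or multiplication, and the outputs are taken directly from the E nodes, so the F column of Fig. 5 is no longer needed.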