Digital Signal Processing Reference
Table 3 Accelerated algorithms for image and video signal processing

Algorithms                       Kernel operations
SAD                              Result = Σ |Ai − Bi|
Interpolation                    Result = round((a1x1 ± a2x2 ± a3x3 ± a4x4 ± a5x5 ± a6x6) / 32)
De-blocking filter               Result = round((a1x1 + a2x2 + a3x3 + a4x4 + a5x5 + a6x6) / 8)
8 × 8 discrete cosine transform  Result = integer butterfly computing and data access
Color transform                  Result = a1x1 ± a2x2 ± a3x3

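The interpolation and de-blocking rows of Table 3 share one pattern: a small weighted sum followed by a rounded division by a power of two. The following is a minimal sketch of that pattern, not taken from the text. The helper name weighted_round, the tap values (modeled on the H.264 half-sample luma filter and on one output of the H.264 strong de-blocking filter), and the clipping to 8-bit pixels are all assumptions made for illustration.

```python
def weighted_round(coeffs, samples, shift):
    """Return round(sum(a_i * x_i) / 2**shift) using integer arithmetic.

    Adding half of the divisor before the right shift rounds to nearest
    (ties round up), which is the usual add-and-shift form in hardware.
    """
    if len(coeffs) != len(samples):
        raise ValueError("coeffs and samples must have the same length")
    acc = sum(a * x for a, x in zip(coeffs, samples))
    return (acc + (1 << (shift - 1))) >> shift


def clip_pixel(value):
    """Clamp a result to the 8-bit pixel range (assumed pixel depth)."""
    return max(0, min(255, value))


# Assumed taps: six coefficients summing to 32, so shift = 5 divides by 32.
HALF_PEL_TAPS = (1, -5, 20, 20, -5, 1)

# Assumed taps: five coefficients summing to 8, so shift = 3 divides by 8.
DEBLOCK_TAPS = (1, 2, 2, 2, 1)


def interpolate_half_pel(samples):
    """Interpolation row of Table 3: rounded weighted sum divided by 32."""
    return clip_pixel(weighted_round(HALF_PEL_TAPS, samples, 5))


def deblock_sample(samples):
    """De-blocking row of Table 3: rounded weighted sum divided by 8."""
    return clip_pixel(weighted_round(DEBLOCK_TAPS, samples, 3))
```

Because both kernels reduce to multiply-accumulate, add, and shift, a single accelerated instruction with programmable coefficients and shift amount could plausibly serve both rows.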
The classical MPEG-2 (Moving Picture Experts Group) video compression standard
reaches a compression ratio of about 50:1 on average. The more advanced
H.264/AVC video codec standard can raise this ratio to more than 100:1. The
computing cost of a video encoder depends strongly on the complexity of the
video stream and on the motion estimation algorithm. Including the cost of
memory accesses, an H.264 encoder may consume as much as 4,000 operations per
pixel, while the decoder consumes about 500 operations per pixel on average.
For example, encoding a QCIF video stream (176 × 144 pixels) at 30 frames per
second requires about 3 giga-operations per second, and the corresponding
decoder requires about 400 mega-operations per second. A general-purpose DSP
processor may not offer both the performance and the low power consumption
needed for such applications. An ASIP will be needed, especially for handheld
video encoding.
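The throughput figures quoted above follow from multiplying pixels per frame, frames per second, and operations per pixel. A short sketch of that arithmetic is given here as a check. The function and constant names are illustrative, and the per-pixel costs are the upper-bound and average values stated in the text.

```python
# Figures taken from the text: QCIF frame size, frame rate, per-pixel costs.
QCIF_WIDTH, QCIF_HEIGHT = 176, 144
FRAMES_PER_SECOND = 30
ENCODER_OPS_PER_PIXEL = 4000  # worst case quoted for an H.264 encoder
DECODER_OPS_PER_PIXEL = 500   # average quoted for an H.264 decoder


def ops_per_second(width, height, fps, ops_per_pixel):
    """Operations per second = pixels per frame * frames/s * ops per pixel."""
    return width * height * fps * ops_per_pixel


encoder_ops = ops_per_second(
    QCIF_WIDTH, QCIF_HEIGHT, FRAMES_PER_SECOND, ENCODER_OPS_PER_PIXEL
)
decoder_ops = ops_per_second(
    QCIF_WIDTH, QCIF_HEIGHT, FRAMES_PER_SECOND, DECODER_OPS_PER_PIXEL
)

print(f"encoder: {encoder_ops / 1e9:.2f} Gops/s")  # encoder: 3.04 Gops/s
print(f"decoder: {decoder_ops / 1e6:.0f} Mops/s")  # decoder: 380 Mops/s
```

The results, about 3.04 giga-operations and 380 mega-operations per second, match the rounded values of 3 giga and 400 mega given in the text.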
Obviously, function acceleration is needed for both encoding and decoding
of images and video frames. The heaviest computing task to accelerate is motion
estimation. Many motion estimation algorithms have been proposed, most of them
based on SAD (sum of absolute differences). Some accelerated algorithms are
listed in Table 3.
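To illustrate why SAD dominates the cost of motion estimation, the following is a minimal sketch of an exhaustive (full-search) block matcher. The text does not prescribe a particular search strategy, so full search is used here as an assumption. The function names, the list-of-rows frame layout, and the parameters are illustrative only.

```python
def block_sad(cur, ref, cx, cy, rx, ry, size):
    """SAD between the size x size block of `cur` at (cx, cy) and the
    size x size block of `ref` at (rx, ry). Frames are lists of rows."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(cur[cy + dy][cx + dx] - ref[ry + dy][rx + dx])
    return total


def full_search(cur, ref, cx, cy, size, search_range):
    """Try every displacement within +/- search_range and return the
    tuple (best_sad, mv_x, mv_y) for the block of `cur` at (cx, cy).
    Candidates that fall outside the reference frame are skipped."""
    height, width = len(ref), len(ref[0])
    best = None
    for mv_y in range(-search_range, search_range + 1):
        for mv_x in range(-search_range, search_range + 1):
            rx, ry = cx + mv_x, cy + mv_y
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = block_sad(cur, ref, cx, cy, rx, ry, size)
            if best is None or cost < best[0]:
                best = (cost, mv_x, mv_y)
    return best
```

Each candidate displacement costs size × size absolute differences, and a search range of ±R yields up to (2R + 1)² candidates per block, which is why a dedicated SAD instruction pays off in the encoder.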
If the data access and computing of the algorithms listed in Table 3 can be
implemented with accelerated instructions, more than 80% of image/video signal
processing can be accelerated.
7 Programming Toolchain and Benchmarking
As soon as an assembly instruction set is proposed, its toolchain should be
provided for benchmarking the designed instruction set. An assembly programming
toolchain includes the C compiler, the assembler, the linker, the ISS
(instruction set simulator), and the debugger. A simplified firmware design
flow is given in Fig. 17a, and the toolchain that supports programming is given
in Fig. 17b. Fig. 17a shows the seven main design steps for translating
behavioral C code into qualified executable binary code; the six tools of the
programmer's toolchain in Fig. 17b are used for code translation and
simulation. Each tool in Fig. 17b is marked with a number, and the numbers
annotated on the design steps in Fig. 17a indicate which tools are used in
each step.
 
 