Fig. 1.9 Highly SIMD-type processor core (figure labels: instruction RAM, processor controller, array of 2-bit processing elements (PE), data registers, 256w)
Fig. 1.10 Full HD H.264 video CODEC accelerator (figure labels: image processing unit; stream processing unit; two sets, #0 and #1, of symbol codec, TRF (PIPE), FME (PIPE), DEB (PIPE), and CME; DMAC; stream processor; CABAC accelerator; L-MEM; shift-register-based bus)
PIPE: Programmable image processing element
TRF: Transformer, FME: Fine motion estimator/compensator, DEB: De-blocking filter
CME: Coarse motion estimator, L-MEM: Line memory
CABAC: Context-based Adaptive Binary Arithmetic Coding
accelerator is highly optimized for the target applications. The full HD H.264 video
CODEC accelerator described in Sect. 3.4 is a good example [5]. The accelerator
(Fig. 1.10), which is fabricated in 65-nm CMOS technology and operates at
162 MHz, consists of dedicated processing elements, hardware logic, and processors,
each designed to suit the CODEC stage it executes. The accelerator
decodes full HD (high-definition) H.264 video at 172 mW. Decoding the same video on a
high-end CPU core would require a clock frequency of at least 2-3 GHz at 100% CPU
load. This means the CODEC core achieves 200-300 times higher
performance per watt than a high-end CPU core.
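The comparison above can be checked with a few lines of arithmetic. The sketch below is illustrative only and rests on two assumptions not stated in the text: that the quoted range is read as 200 to 300 times, and that the accelerator and the CPU deliver the same throughput (both decode the same full HD stream in real time), so the performance-per-watt ratio reduces to a ratio of power. The function names are hypothetical; only the 172 mW figure comes from the text.

```python
# Sketch: what CPU power does the quoted performance-per-watt ratio imply?
# Assumption: equal decode throughput on both devices, so
#   (perf / P_accel) / (perf / P_cpu) = P_cpu / P_accel.

ACCEL_POWER_W = 0.172  # 172 mW, the accelerator power given in the text


def implied_cpu_power_w(ratio, accel_power_w=ACCEL_POWER_W):
    """Return the CPU power (W) implied by a given perf/W ratio."""
    return ratio * accel_power_w


def perf_per_watt_ratio(cpu_power_w, accel_power_w=ACCEL_POWER_W):
    """Return the perf/W advantage of the accelerator over a CPU
    drawing cpu_power_w for the same work."""
    return cpu_power_w / accel_power_w


if __name__ == "__main__":
    # Assumed reading of the quoted range: 200x to 300x.
    for ratio in (200, 300):
        print(f"{ratio}x -> {implied_cpu_power_w(ratio):.1f} W")
```

Under these assumptions, the ratio corresponds to a CPU core drawing roughly 34 W to 52 W for the same decoding work.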
In our heterogeneous multicore approach, both general-purpose CPU cores and the
special-purpose processor cores described above are used effectively. When a
program is executed, it is divided into small parts, and each part is executed on the most
suitable processor core. This should achieve a very power-efficient and cost-effective