Digital Signal Processing Reference
SAD w/o CA has also been investigated on the Tilera TilePro 64 and on many-core
CPUs [75]. SAD (w/o CA) for a VLIW processor (Texas Instruments TMS320C6414T,
1.0 GHz) has been shown in [13].
Application-specific processors (ASIPs) have been investigated in two cases.
For semi-global matching (SGM), an instruction set extension for the Tensilica
LX2 DSP template has been proposed [6], reporting 20 fps for 640 × 480 × 64
images with a reduced number of paths when run at 373 MHz, which is possible
with the targeted TSMC 90 nm process. Similarly for SGM, architecture
optimizations for a VLIW processor template, the MOAI, have been investigated
in [69], reaching 30 fps when running at 400 MHz.
Apart from the original CPU implementation of SGM, which takes 1.3 s for
450 × 375 × 64 images [36], a variant with depth-adaptive sub-sampling has been
proposed that runs at 14 fps for 320 × 160 images [24].
The Cell Broadband Engine has been utilized for belief propagation and dynamic
programming, both taking a few seconds to process an image pair [58]. An SAD
(w/o CA) implementation on the Cell achieves 30 fps for VGA images with 48
disparity levels.
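The figures quoted above use different image sizes, disparity ranges, and frame rates, so they are hard to compare directly. One common way to put them on a single scale is to count disparity evaluations per second. The sketch below does this for three of the results. It rests on several assumptions that are not stated in the text: the third factor in "640 × 480 × 64" is read as the number of disparity levels, VGA is taken as 640 × 480, and the metric name "MDE/s" and the `Result` class are introduced here purely for illustration. Results for which the text gives no disparity range ([69], [24]) are left out.

```python
from dataclasses import dataclass


@dataclass
class Result:
    """One reported implementation (hypothetical helper type)."""
    name: str
    width: int
    height: int
    disparities: int   # assumed meaning of the third dimension
    fps: float         # frames per second

    def mde_per_s(self) -> float:
        """Million disparity evaluations per second."""
        return self.width * self.height * self.disparities * self.fps / 1e6


results = [
    Result("SGM, Tensilica LX2 ASIP [6]", 640, 480, 64, 20.0),
    Result("SGM, original CPU [36]", 450, 375, 64, 1 / 1.3),  # 1.3 s per frame
    Result("SAD (w/o CA), Cell", 640, 480, 48, 30.0),         # VGA assumed 640 x 480
]

for r in results:
    print(f"{r.name:30s} {r.mde_per_s():8.1f} MDE/s")

# Clock cycles available per disparity evaluation on the ASIP at 373 MHz.
asip = results[0]
cycles_per_eval = 373e6 / (asip.mde_per_s() * 1e6)
print(f"ASIP cycles per disparity evaluation: {cycles_per_eval:.2f}")
```

Such a normalization ignores differences in algorithm quality (for example, the reduced number of SGM paths on the ASIP), so it is only a rough throughput comparison.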
3.4 Comparison Studies
In addition to the algorithmic studies mentioned earlier, studies that also take
computational performance into account have been conducted. An evaluation of
cost aggregation for local methods, focusing on algorithmic performance and
run-time on a CPU, can be found in [86]. Selected algorithms (various SAD
variants, belief propagation, and dynamic programming) have been compared on a
CPU in [59]. An evaluation of local algorithms on the GPU has been conducted in
[29, 57].
Implementations of belief propagation on a GPU and in VLSI have been compared
in [55], and symmetric dynamic programming on GPU and FPGA has been compared in
[47]. A census-based approach (w/o CA) has been compared on a DSP (TI C6416), a
GPU (GeForce 9800 GT), and a CPU (Intel Core2Quad) in [42]. In [75], SAD (w/o
CA) has been studied on a GPU, two multi-core CPUs, and the MIMD Tilera
architecture. Further, many of the references in the previous sections compare
a GPU or FPGA implementation to a regular CPU implementation; these are too
numerous to list here.
3.5 Current Trends
When targeting real-world applications, a persistent challenge is to improve
algorithmic performance while reducing computational requirements. This has
already been addressed in many of the above references. A recent research direction
is to integrate the computation of various information retrieval image processing