Architectures for Stereo Vision - Signal Processing Systems

Digital Signal Processing Reference

[ 48 ] presented a multi-resolution symmetric dynamic programming variant on a

GTX 295 reaching 14 fps for 2048

256 images. A total variation algorithm

with GPU implementation has been presented requiring between 15 and 60 s per

image [ 73 ] .

Variants of local methods examining the different techniques of adaptive weights

or adaptive support regions have received much attention. Recent local approaches

are census-based with basic box filter cost aggregation [ 92 ] and a local truncated

Laplacian kernel approximation with adaptive cost aggregation [ 44 ] . Locally adaptive

support regions have been used and accelerated with bitwise voting in [ 50 ] .

Further work on local variants with adaptive cost aggregation methods includes

[ 45 , 63 ] and [ 40 ] . Instead of adaptive support regions on the input images, [ 61 ]

uses edge-preserving filtering on the matching costs. A comparison of six local

methods in terms of algorithmic and computational performance on GPUs has been

conducted [ 29 ] . A plane sweep algorithm with local depth connectivity in order to

retain depth discontinuities has been examined in [ 16 ] .
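
To make the simplest of the local pipelines mentioned above concrete, the following is a minimal sketch of census-based matching with box filter cost aggregation and winner-takes-all selection. It is not the implementation of [ 92 ] or of any other cited work. The function names, the 3 × 3 census window, the clamped border handling, and the penalty for out-of-image candidates are all assumptions made for illustration.

```python
import random

def census_transform(img, radius=1):
    """Encode each pixel as a bit string: 1 where a neighbour is darker than the centre."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            bits = 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    if dy == 0 and dx == 0:
                        continue
                    # Border handling by clamping is an assumption of this sketch.
                    ny = min(max(y + dy, 0), h - 1)
                    nx = min(max(x + dx, 0), w - 1)
                    bits = (bits << 1) | int(img[ny][nx] < img[y][x])
            out[y][x] = bits
    return out

def hamming(a, b):
    """Number of differing bits between two census codes."""
    return bin(a ^ b).count("1")

def cost_volume(census_left, census_right, num_disp, invalid_cost):
    """cost[d][y][x]: Hamming distance between left pixel x and right pixel x - d."""
    h, w = len(census_left), len(census_left[0])
    return [[[hamming(census_left[y][x], census_right[y][x - d]) if x - d >= 0
              else invalid_cost
              for x in range(w)] for y in range(h)] for d in range(num_disp)]

def box_filter(cost, radius=1):
    """Cost aggregation: sum the costs over a square window (clamped at borders)."""
    h, w = len(cost), len(cost[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    out[y][x] += cost[yy][xx]
    return out

def census_box_disparity(left, right, num_disp, radius=1):
    """Census transform, Hamming cost, box aggregation, winner-takes-all."""
    cl = census_transform(left, radius)
    cr = census_transform(right, radius)
    n_bits = (2 * radius + 1) ** 2 - 1  # worst-case cost, used for invalid candidates
    volume = cost_volume(cl, cr, num_disp, invalid_cost=n_bits)
    agg = [box_filter(cost_slice, radius) for cost_slice in volume]
    h, w = len(left), len(left[0])
    return [[min(range(num_disp), key=lambda d: agg[d][y][x]) for x in range(w)]
            for y in range(h)]

# Synthetic test pair: the left image is the right image shifted by 2 pixels.
rng = random.Random(0)
h, w, shift = 6, 16, 2
right = [[rng.randrange(256) for _ in range(w)] for _ in range(h)]
left = [[right[y][x - shift] if x >= shift else 0 for x in range(w)] for y in range(h)]
disp = census_box_disparity(left, right, num_disp=5)
print(disp[h // 2])  # one row of the disparity map
```

Every pixel and every disparity candidate is processed independently here, which is why pipelines of this kind map naturally onto GPU threads. The adaptive-weight and adaptive-support-region variants cited above would replace the fixed square window in `box_filter`.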

For SGM, various implementations have been presented on a GeForce 8800 Ultra

[ 19 ] (0

×

2048

×

128), a Quadro FX5600 [ 27 ] , a GTX 280 without

[ 31 ] and with increased depth accuracy [ 67 ] , and on a Tesla C2050 [ 4 ] , which is

the highest performing implementation with 63 fps for 640

.

0057 fps at 640

×

480

×

128 images. This

allows a very interesting retrospective on the evolution of GPUs. In particular, some of

the new features of Nvidia's compute capability 2.0 graphics cards allow radically

different parallelization schemes, which were exploited in [ 4 ] . We will have a detailed

look at this implementation in Sect. 3.6 . Furthermore, a combination of adaptive

support regions with a reduced version of SGM is proposed in [ 62 ] reaching 10 fps

for 450 × 375 × 64 images.

3.2 Dedicated Architectures (FPGA and VLSI)

For dedicated architectures targeting FPGAs or ASICs, local methods are often

favored because of potentially very small designs. This goes as far as to omit the cost

aggregation altogether despite the drawbacks in accuracy and robustness. Neverthe-

less, new cost aggregation concepts have also been investigated and incorporated in

hardware. In the following, implementations without cost aggregation are indicated

with “w/o CA”.
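
As an illustration of what “w/o CA” means in practice, the sketch below computes a window-based SAD matching cost and passes it directly to winner-takes-all selection, with no separate aggregation stage in between. This is one reading of the term, not the design of any cited architecture. The function names, the window radius, and the clamped border handling are assumptions of this example.

```python
import random

def sad(left, right, y, x, d, radius):
    """Sum of absolute differences between the window around (y, x) in the left
    image and the window around (y, x - d) in the right image."""
    h, w = len(left), len(left[0])
    total = 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Clamping at the image border is an assumption of this sketch.
            yy = min(max(y + dy, 0), h - 1)
            xl = min(max(x + dx, 0), w - 1)
            xr = min(max(x + dx - d, 0), w - 1)
            total += abs(left[yy][xl] - right[yy][xr])
    return total

def disparity_without_ca(left, right, num_disp, radius=1):
    """Pick, per pixel, the disparity with the lowest SAD cost.
    The costs are not filtered or aggregated before the selection."""
    h, w = len(left), len(left[0])
    disp = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Only consider candidates whose match x - d lies inside the image.
            candidates = range(min(num_disp, x + 1))
            disp[y][x] = min(candidates,
                             key=lambda d: sad(left, right, y, x, d, radius))
    return disp

# Synthetic test pair: the left image is the right image shifted by 2 pixels.
rng = random.Random(0)
h, w, shift = 6, 16, 2
right = [[rng.randrange(256) for _ in range(w)] for _ in range(h)]
left = [[right[y][x - shift] if x >= shift else 0 for x in range(w)] for y in range(h)]
disp = disparity_without_ca(left, right, num_disp=5)
print(disp[h // 2])  # one row of the disparity map
```

The only state kept per pixel is the running minimum and its disparity index, which hints at why such designs can be made very small in hardware. The price is that a single noisy window can flip the result, which is the accuracy and robustness drawback mentioned above.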

Some examples of early architectures using SAD-based matching w/o CA are

[ 2 , 54 , 64 ] . An SAD-based stereo vision system with three cameras has been

presented in [ 98 ] . Depending on the emphasis of the referenced work, the results

vary in throughput and resolution up to 640

64 and 31 fps. The so-called Tyzx

ASIC for color-image census-based stereo-matching (w/o CA) achieves 200 fps for

512 × 480 images and 52 disparity levels [ 93 ] . It forms the basis of an extended

stereo vision system in [ 94 ] .
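
Because the implementations surveyed here differ in image size, disparity range, and frame rate at the same time, it can help to reduce each result to a single number. One common choice, which is not used in the text above and is introduced here only as an illustrative assumption, is the number of disparity evaluations per second: width × height × disparity levels × frames per second. The function name below is hypothetical.

```python
def disparity_evaluations_per_second(width, height, disparities, fps):
    """Pixels per frame times disparity candidates per pixel times frames per second."""
    return width * height * disparities * fps

# Figures quoted above for the Tyzx ASIC [93]: 512 x 480 images, 52 disparity levels, 200 fps.
tyzx = disparity_evaluations_per_second(512, 480, 52, 200)
print(f"{tyzx / 1e9:.2f} G disparity evaluations/s")  # prints "2.56 G disparity evaluations/s"
```

The metric ignores accuracy and the cost of any aggregation or optimization step, so it only compares raw matching throughput.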

Local methods with and without cost aggregation remain popular in recent

implementations as well. This includes [ 46 ] , where a census transform (w/o CA) is employed
