Stereo Vision Algorithms Suited to Constrained FPGA Cameras - Advances in Embedded Computer Vision


Fig. 5.3 Disparity Space Image (DSI), the 3D structure containing the pointwise matching cost C(x, y, d) for each point and for each disparity value within the disparity range D

scanlines in order to enforce smoothing constraints on the disparity maps or by means of other strategies.

Some algorithms, as will be discussed in the remainder, need to store the whole DSI depicted in Fig. 5.3 in a memory structure. Unfortunately, even with standard image resolutions and disparity ranges, this is a significant amount of data that typically exceeds the memory available in most current FPGAs, so an external memory becomes mandatory in these cases. For instance, considering images at 752 × 480, a disparity range of 64 and 16 bits for each matching cost C(x, y, d), the DSI amounts to about 44 MB. Although storing this amount of data is not critical when external memory devices such as DDR or SRAM are deployed, memory bandwidth is a more critical constraint. In fact, FPGAs, despite their lower clock frequency compared to other parallel computing devices such as GPUs, can be competitive with such devices by exploiting their potentially massive parallelism through tailored configurations of their internal logic. Nevertheless, to achieve this while providing a throughput of (at least) one disparity per clock cycle, keeping pace with the pixels provided by the imaging sensors, there is strong memory pressure when intermediate results (typically the D values concerning the point under examination or D values for intermediate results, and sometimes even k × D values) must be read within a single clock cycle. This case is summarized in Fig. 5.4.
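The storage figure quoted above can be checked with a few lines of arithmetic. The following is only a minimal sketch: the function name `dsi_size_bytes` and its parameters are hypothetical, introduced here for illustration, and it assumes one fixed-width cost per pixel and per disparity with no padding or packing overhead.

```python
def dsi_size_bytes(width, height, disparity_range, bits_per_cost):
    """Bytes needed to store one cost C(x, y, d) per pixel and per disparity.

    Assumption: costs are stored back to back, with no padding.
    """
    total_bits = width * height * disparity_range * bits_per_cost
    return (total_bits + 7) // 8  # round up to whole bytes


# Configuration quoted in the text: 752 x 480 images, D = 64, 16-bit costs.
size = dsi_size_bytes(752, 480, 64, 16)
print(size)           # 46202880 bytes
print(size / 2**20)   # 44.0625, i.e. about 44 MB (binary megabytes)
```

Under these assumptions the result matches the roughly 44 MB stated above when megabytes are counted as 2^20 bytes.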

For instance, considering our previous configuration, with D = 64 and each matching cost C(x, y, d) stored in 2 bytes, and a pixel clock frequency of 30 MHz (appropriate for imaging sensors similar to those deployed in our camera), the required memory bandwidth turns out to be higher than 3.5 GB/s. In most cases, at each clock cycle this amount of data must be read, processed/updated, and then written back to memory, thus doubling the overall required memory bandwidth. Of course, with higher-resolution imaging sensors, typically clocked at higher frequencies, moving data back and forth between FPGA and memory makes the memory bandwidth bottleneck even more severe.
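The bandwidth estimate follows from multiplying the data touched per clock cycle by the pixel clock. The sketch below is illustrative only: `dsi_bandwidth_bytes_per_s` and its arguments are hypothetical names, and it assumes exactly D costs are accessed per clock cycle, with an optional factor of two when the same data is also written back.

```python
def dsi_bandwidth_bytes_per_s(disparity_range, bytes_per_cost,
                              pixel_clock_hz, read_and_write=False):
    """Bytes per second moved when D costs are accessed every clock cycle.

    Assumption: one pixel, hence D costs, is handled per clock cycle.
    """
    bytes_per_cycle = disparity_range * bytes_per_cost
    if read_and_write:
        bytes_per_cycle *= 2  # data is read, updated, and written back
    return bytes_per_cycle * pixel_clock_hz


# Configuration quoted in the text: D = 64, 2-byte costs, 30 MHz pixel clock.
read_only = dsi_bandwidth_bytes_per_s(64, 2, 30_000_000)
read_write = dsi_bandwidth_bytes_per_s(64, 2, 30_000_000, read_and_write=True)
print(read_only)    # 3840000000 bytes/s
print(read_write)   # 7680000000 bytes/s
```

With these inputs the read-only figure is 3.84 × 10^9 bytes/s (about 3.58 when expressed in units of 2^30 bytes), consistent with the "higher than 3.5 GB/s" quoted above, and the read-plus-write case is twice that.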

