Architectures for Stereo Vision - Signal Processing Systems - page 496

Fig. 16 Architectural extension (b) of the 2D-systolic array (a) for introducing disparity level parallelism, where d_m specifies the number of disparity levels processed in parallel

In order to introduce disparity level parallelism in addition to the row level parallelism, the C-PEs and L-PEs are extended to process several consecutive disparity levels in parallel. These groups of parallel disparity levels are processed serially. This leads to an approximately linear increase in throughput. Further, it is area efficient for two reasons. First, additional logic is only required for parts of the processing units. Second, the absolute size of the local buffers does not change, only their depth-to-width ratio. This is a major advantage of disparity level parallelism. The architectural extension for disparity level parallelism is shown in Fig. 16.
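
The grouping described above can be illustrated with a short Python sketch. It is not the hardware implementation; it only models how a disparity range might be split into consecutive groups of d_m levels that are then processed one after another. The function names and the assumption of one serial step per group are hypothetical and chosen for illustration.

```python
import math


def disparity_groups(num_disparities, d_m):
    """Split levels 0..num_disparities-1 into consecutive groups of up to d_m levels.

    Each group stands for the levels handled in parallel; the groups
    themselves are visited serially (assumed model, not taken from the source).
    """
    return [range(start, min(start + d_m, num_disparities))
            for start in range(0, num_disparities, d_m)]


def serial_steps(num_disparities, d_m):
    """Number of serial steps per pixel, assuming one step per group."""
    return math.ceil(num_disparities / d_m)


if __name__ == "__main__":
    for d_m in (1, 2, 4, 8):
        steps = serial_steps(128, d_m)
        print(f"d_m={d_m}: {steps} serial steps per pixel")
```

Under this simplified model, doubling d_m halves the number of serial steps as long as d_m divides the disparity range, which is consistent with the approximately linear throughput increase stated above.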

Boundary treatment for pixels with missing stereo overlap (i.e. x < d_max) significantly reduces the number of entries of the cost spaces C(p, d), L_r(p, d), S(p, d), and, consequently, leads to a computing time reduction. For VGA images and a disparity range of 128 px the reduction is 9.9% (without disparity level parallelism).
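
The size of this reduction can be checked with a small calculation. The sketch below counts cost-space entries per image row with and without boundary treatment. It assumes disparity levels 0 to 127 and that a pixel in column x has a valid stereo partner only for d <= x; this convention is an assumption, chosen because it reproduces the figure quoted above. The function names are hypothetical.

```python
def cost_space_entries(width, num_disparities, boundary_treatment):
    """Entries per image row in a cost space such as C(p, d)."""
    if not boundary_treatment:
        return width * num_disparities
    # Assumed convention: column x only has valid disparities d = 0..x,
    # capped at the full disparity range.
    return sum(min(x + 1, num_disparities) for x in range(width))


def reduction(width, num_disparities):
    """Fraction of entries saved by boundary treatment.

    The image height cancels out, so one row is sufficient.
    """
    full = cost_space_entries(width, num_disparities, False)
    reduced = cost_space_entries(width, num_disparities, True)
    return 1.0 - reduced / full


if __name__ == "__main__":
    # VGA width (640 px) and a disparity range of 128 px
    print(f"{reduction(640, 128):.1%}")  # prints 9.9%
```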

An external interim memory is required for storing the path costs of the three non-horizontal paths of the last row of an image slice and providing them to the first row of the subsequent image slice. Because the data transfer is extremely regular, obeys the FIFO principle, and requires only low transfer rates, external SSRAM or SDRAM memories can be used. Alternatively, on-chip memory can be considered, since the absolute memory requirements are quite low.
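
To give a rough feeling for the memory requirement, the following sketch estimates the size of the interim memory for one image row. It is an upper-bound estimate under stated assumptions: one path cost is stored per pixel, disparity level and non-horizontal path, boundary treatment is ignored, and the bit width per path cost (bits_per_cost) is a hypothetical parameter that is not given in the text.

```python
def interim_memory_entries(width, num_disparities, non_horizontal_paths=3):
    """Path-cost entries handed from the last row of one slice to the next."""
    return non_horizontal_paths * width * num_disparities


def interim_memory_bytes(width, num_disparities, bits_per_cost):
    """Memory size in bytes, rounded up; bits_per_cost is an assumed value."""
    total_bits = interim_memory_entries(width, num_disparities) * bits_per_cost
    return (total_bits + 7) // 8


if __name__ == "__main__":
    # VGA width, 128 disparity levels, assumed 8 bits per path cost
    size = interim_memory_bytes(640, 128, bits_per_cost=8)
    print(f"{size} bytes ({size / 1024:.0f} KiB)")
```

With these assumed values the estimate is 240 KiB, which illustrates why both external and on-chip memories are plausible options.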

3.7.3 Performance

Performance of the complete system and scalability of the SGM unit are analyzed using the minimum clock frequency required to fulfill a fixed throughput constraint. This metric, i.e. the clock frequency normalized for a fixed throughput, allows direct and accurate comparison, and reflects the importance of performance while being independent of varying operating clock frequencies [70]. This also models a typical design constraint of real-world applications, where the required throughput is usually specified by external circumstances (e.g. by the cameras, required depth resolution, etc.). In this case, throughput-normalized metrics for clock frequency,
