Dense Correspondence and Its Applications - Computer Vision for Visual Effects


Figure 5.15. The census transform of example 3 × 3 blocks. The bit string is computed clockwise starting from the upper left corner, recording a 1 if the pixel is brighter than the center pixel and 0 if it is darker. The distance between blocks is computed as the Hamming distance between the corresponding bit strings (in this case, 2). In the figure, $CT_1 = 00011100$ and $CT_2 = 00001101$, so $C_{CT} = 2$.

Zabih and Woodfill [567] proposed an alternate block-matching scheme based on the census transform, which turns a block into a binary bit string that records whether each pixel in the block is brighter (1) or darker (0) than the pixel at the center. Figure 5.15 illustrates the idea with a 3 × 3 block. If $CT_i(x, y)$ denotes the census transformation of an $N \times N$ block around $(x, y)$ in image $I_i$ into a length $N^2 - 1$ bit string, then

$$C_{CT}(x_0, y_0, d) = \sum_{(x, y) \in W} \mathrm{Hamming}\bigl(CT_2(x - d, y),\, CT_1(x, y)\bigr) \tag{5.48}$$

That is, we sum the Hamming distances between corresponding bit strings over the window to arrive at the final cost. Hirschmüller and Scharstein [200] investigated a large set of proposed stereo matching costs and concluded that methods based on the census transform performed extremely well and were robust to photometric differences between images.
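As a minimal sketch of the census transform and the Hamming distance described above, the following Python computes the bit string of a 3 × 3 block and compares two blocks. The function names `census_3x3` and `hamming` are hypothetical, and the two blocks are illustrative arrangements chosen to reproduce the bit strings quoted for Figure 5.15; they are not necessarily the figure's actual pixel layout. Treating a pixel equal to the center as "not brighter" (0) is also an assumption.

```python
def census_3x3(block):
    """Census bit string of a 3x3 block, read clockwise from the upper-left corner."""
    center = block[1][1]
    # Clockwise order around the center, starting at the upper-left pixel.
    order = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    # Assumption: ties with the center are recorded as 0 (not brighter).
    return "".join("1" if block[r][c] > center else "0" for r, c in order)


def hamming(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("bit strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


# Hypothetical blocks (illustrative only; the center pixel is 110 in both).
block1 = [[100, 100, 100],
          [100, 110, 120],
          [105, 115, 150]]
block2 = [[100, 100, 100],
          [130, 110, 105],
          [100, 115, 170]]

ct1 = census_3x3(block1)   # "00011100"
ct2 = census_3x3(block2)   # "00001101"
distance = hamming(ct1, ct2)  # 2
```

Because only the ordering of each pixel relative to the center is kept, adding a constant to every intensity or scaling all of them by a positive gain leaves the bit string unchanged, which is consistent with the robustness to photometric differences noted above.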

Block-matching stereo methods in which each pixel independently determines its disparity (known as winner-take-all approaches) are clearly suboptimal compared to methods that simultaneously determine all the disparities according to some global criterion. Using overlapping blocks implicitly encourages some degree of coherence between disparities at neighboring pixels (or smoothness, in the terminology of optical flow). However, disparity maps determined in this way typically exhibit artifacts both within and across scanlines and don't produce high-quality dense correspondence. These artifacts arise in flat, constant-intensity image regions, where local methods have no way to determine the correct match due to the aperture problem, as well as near object boundaries, where blocks overlap regions with significantly different disparities. In addition, multiple matches may occur between different pixels in $I_1$ and the same pixel in $I_2$.
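The winner-take-all strategy can be sketched on a single pair of scanlines as follows. This is an illustrative sketch under stated assumptions, not an implementation from the text: the function name `wta_disparity` is hypothetical, the block cost is a simple sum of absolute differences over a 1D window, indices are clamped at the row ends, and the pixel at $x$ in the first row is assumed to match the pixel at $x - d$ in the second.

```python
def wta_disparity(row1, row2, max_disp, half=1):
    """Pick, independently for each pixel, the disparity with the lowest block cost.

    Assumes the pixel at x in row1 matches the pixel at x - d in row2.
    The cost is a sum of absolute differences over a window of 2*half + 1 pixels.
    """
    n = len(row1)

    def clamp(i):
        # Replicate the border pixel for indices that fall outside the row.
        return min(max(i, 0), n - 1)

    disparities = []
    for x in range(n):
        best_d, best_cost = 0, float("inf")
        for d in range(max_disp + 1):
            cost = sum(abs(row1[clamp(x + k)] - row2[clamp(x + k - d)])
                       for k in range(-half, half + 1))
            # Strict comparison: ties keep the smallest disparity seen so far.
            if cost < best_cost:
                best_d, best_cost = d, cost
        disparities.append(best_d)
    return disparities


# A textured row shifted by two pixels is recovered in the interior.
row2 = [10, 20, 30, 40, 50, 60, 70, 80]
row1 = [10, 10, 10, 20, 30, 40, 50, 60]
textured = wta_disparity(row1, row2, max_disp=3)

# A flat row gives identical costs for every disparity, so the result
# is decided purely by the tie-breaking rule.
flat = wta_disparity([5] * 6, [5] * 6, max_disp=3)
```

In the flat case every candidate disparity has zero cost, so the answer carries no information about the true shift; this is the ambiguity in constant-intensity regions described above.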

A better idea is an algorithm that enforces global optimality of the estimated disparity along each pair of scanlines. One of the earliest approaches, proposed by Ohta and Kanade [352], used dynamic programming to find the globally optimal correspondence between a scanline pair, using detected edges in each scanline as a guide to build a piecewise-linear disparity map. Figure 5.16 illustrates the idea. The disparities $d(x, y)$ for an entire row (i.e., fixed $y$) are selected to minimize

$$\cdots\,(x, y, d) \tag{5.49}$$
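To illustrate how dynamic programming can choose all the disparities of one row jointly, here is a generic sketch. It is not Ohta and Kanade's edge-based formulation and does not claim to reproduce Equation (5.49): the objective, a per-pixel matching cost plus a penalty on disparity changes between neighboring pixels, is an assumption made for this example, as are the name `scanline_dp` and the `smooth` weight.

```python
def scanline_dp(cost, smooth=1.0):
    """Minimize sum_x cost[x][d_x] + smooth * sum_x |d_x - d_{x-1}| over one row.

    cost[x][d] is any per-pixel matching cost. The smoothness term and its
    weight are assumptions made for this sketch.
    """
    n, m = len(cost), len(cost[0])
    total = [list(cost[0])]   # total[x][d]: best cost of a path ending at (x, d)
    back = [[0] * m]          # back[x][d]: best disparity at x - 1 for that path
    for x in range(1, n):
        row_total, row_back = [], []
        for d in range(m):
            prev = min(range(m),
                       key=lambda p: total[x - 1][p] + smooth * abs(d - p))
            row_total.append(cost[x][d] + total[x - 1][prev]
                             + smooth * abs(d - prev))
            row_back.append(prev)
        total.append(row_total)
        back.append(row_back)
    # Trace back from the cheapest final state.
    d = min(range(m), key=lambda p: total[-1][p])
    path = [d]
    for x in range(n - 1, 0, -1):
        d = back[x][d]
        path.append(d)
    return path[::-1]


# Hypothetical costs for 4 pixels and 3 candidate disparities; the second
# pixel slightly prefers disparity 0 while its neighbors prefer disparity 1.
cost = [[4, 0, 4],
        [0, 1, 4],
        [4, 0, 4],
        [4, 0, 4]]
independent = scanline_dp(cost, smooth=0.0)  # per-pixel minima
joint = scanline_dp(cost, smooth=1.0)        # row optimized as a whole
```

With `smooth=0.0` the result reduces to the per-pixel winner-take-all choice; with a positive weight the isolated outlier at the second pixel is overruled by its neighbors because the two disparity jumps cost more than the small gain in matching cost.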
