Stereo Correspondence in Information Retrieval - High-Quality Visual Experience

seek similar videos from a definite database, information retrieval systems have been established with promising performance in search accuracy and efficiency, e.g. [1]. Many of these established systems attempt to search for videos that have been annotated with metadata a priori (e.g. [2]). Nevertheless, there is still a significant amount of footage that has been recorded but never used [3]. Such footage has normally not been properly annotated, and hence retrieval can only be carried out according to the video contents rather than the annotated information.

Among generic video contents, the ability to retrieve 3-D models from databases or the Internet has become increasingly important. Content-based 3-D model retrieval remains an active research area and has found many applications in computer animation, medical imaging, and security. To effectively extract a 3-D object, shape-based 3-D modelling (e.g. [4]) and similarity or dissimilarity (or distance) computation (e.g. [5]) are two of the main research areas. In this chapter, we review the algorithms that have been recently developed for the reconstruction of 3-D shapes from 2-D video sequences. This work is motivated by the fact that the estimation of 3-D shapes critically affects the retrieval quality of 3-D models. We believe that the approaches summarised here will help facilitate the application of 3-D model retrieval in databases or on the Internet. However, this potential application is beyond the scope of the current report and is not discussed further.

One of the commonly used strategies to recover 3-D shapes is multiple view reconstruction. For example, Bartoli and Sturm [6] used Plücker coordinates to represent 3-D lines in the context of maximum likelihood estimation, and then proposed an orthonormal representation to tackle the bundle adjustment problem. Zhou et al. [7] conducted co-planarity checks using cross-ratio invariants and periodic analysis of the triangular regions. Klaus et al. [8] presented a segment-based method to extract regions of homogeneous colour, followed by local window-based matching, plane fitting and disparity assignment. Similar approaches have been introduced in [9], [10]. Sun et al. [11] reported a stereo matching algorithm using Bayesian belief propagation. The stereo problem was solved by taking into account three Markov random fields: a smooth field for depth/disparity, a line process for depth discontinuity and a binary process for occlusion. An iterative RANSAC plane fitting strategy reported in [12] takes a maximum likelihood estimation approach. This technique enables one to obtain the best plane fit to the generated 3-D points automatically rather than using empirical criteria, which are determined according to a limited number of image samples.
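
To make the plane fitting step more concrete, the following is a minimal sketch of basic RANSAC plane fitting on a set of 3-D points. It is not the maximum likelihood variant of [12]: it uses a fixed inlier threshold, which is exactly the kind of empirical criterion that [12] aims to avoid. The function names, the threshold, the iteration count and the synthetic data are all assumptions made for illustration.

```python
import random


def plane_from_points(p, q, r):
    """Plane n.x + d = 0 through three points; returns None if they are collinear."""
    u = [q[i] - p[i] for i in range(3)]
    v = [r[i] - p[i] for i in range(3)]
    # The cross product of two in-plane vectors gives the plane normal.
    n = [u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0]]
    length = sum(c * c for c in n) ** 0.5
    if length < 1e-12:
        return None
    n = [c / length for c in n]
    d = -sum(n[i] * p[i] for i in range(3))
    return n, d


def point_plane_distance(point, normal, d):
    """Unsigned distance from a point to the plane (normal must be unit length)."""
    return abs(sum(normal[i] * point[i] for i in range(3)) + d)


def ransac_plane(points, threshold=0.05, iterations=200, seed=0):
    """Return (normal, d, inlier_indices) for the candidate plane with most inliers.

    threshold and iterations are illustrative values, not taken from [12].
    """
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iterations):
        # Hypothesise a plane from a random minimal sample of three points.
        model = plane_from_points(*rng.sample(points, 3))
        if model is None:
            continue  # skip degenerate (collinear) samples
        normal, d = model
        inliers = [i for i, pt in enumerate(points)
                   if point_plane_distance(pt, normal, d) < threshold]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    if best_model is None:
        raise ValueError("no non-degenerate sample found")
    return best_model[0], best_model[1], best_inliers


def make_synthetic_points(seed=1):
    """Hypothetical test data: 200 noisy points near z = 0.5 plus 50 outliers."""
    rng = random.Random(seed)
    on_plane = [(rng.uniform(-1, 1), rng.uniform(-1, 1), 0.5 + rng.gauss(0, 0.005))
                for _ in range(200)]
    outliers = [(rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(-1, 2))
                for _ in range(50)]
    return on_plane + outliers


if __name__ == "__main__":
    normal, d, inliers = ransac_plane(make_synthetic_points())
    print("normal:", normal, "offset:", d, "inliers:", len(inliers))
```

In practice the winning plane is often refined by a least-squares fit to its inliers; that step is omitted here to keep the sketch short.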

Regarding non-linear surface reconstruction from motion, Laurentini described the visual hull as the largest volume consistent with the contours that have been observed from several viewpoints [13]. This approach ignores small details but captures the approximate shape of the scene. Roy and Cox [14] introduced a method using graph flow theory to generalise the purely 1-D dynamic programming technique to the 2-D problem raised by disparity maps. Kolmogorov and Zabih [15] extended a general theory based on graph cuts to disparity maps in the multi-view context. Narayanan et al. [16] reconstructed several depth maps that are aggregated into
