the following sections. The work of Cummins et al. [2] takes a more sophisticated
approach to recognition, building a visual vocabulary offline, and approximating
the joint probability distribution of visual words with a Chow-Liu tree. Each view's
appearance model is updated upon recognition.
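
As a rough illustration of the Chow-Liu approximation mentioned above, the following sketch estimates pairwise mutual information between binary "visual word present" indicators and keeps the maximum-weight spanning tree over them. It is not the implementation of Cummins et al.; the function names and the toy observations are assumptions made for this example.

```python
import math
from itertools import combinations


def mutual_information(data, i, j):
    """Empirical mutual information (in nats) between binary columns i and j."""
    n = len(data)
    mi = 0.0
    for a in (0, 1):
        for b in (0, 1):
            p_ab = sum(1 for row in data if row[i] == a and row[j] == b) / n
            p_a = sum(1 for row in data if row[i] == a) / n
            p_b = sum(1 for row in data if row[j] == b) / n
            if p_ab > 0:  # zero-probability cells contribute nothing
                mi += p_ab * math.log(p_ab / (p_a * p_b))
    return mi


def chow_liu_edges(data):
    """Edges of the maximum mutual-information spanning tree (Kruskal)."""
    n_vars = len(data[0])
    scored = sorted(
        ((mutual_information(data, i, j), i, j)
         for i, j in combinations(range(n_vars), 2)),
        reverse=True,
    )
    parent = list(range(n_vars))  # union-find forest for cycle detection

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    edges = []
    for _, i, j in scored:
        ri, rj = find(i), find(j)
        if ri != rj:  # adding this edge does not create a cycle
            parent[ri] = rj
            edges.append((i, j))
    return edges


# Toy data (hypothetical): each row is one image, each column one visual word.
# Words 0 and 1 always co-occur; word 2 is only weakly related to them.
observations = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
    [0, 0, 0],
    [1, 1, 1],
    [0, 0, 1],
]
tree = chow_liu_edges(observations)
```

The resulting tree has one edge fewer than there are words, and the joint distribution is then approximated as a product of pairwise conditionals along those edges.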
Our view recognition front end bears many similarities to the view-based maps
of Konolige et al. [12]. That system constructs views from stereo images and performs
two-step recognition using first a vocabulary tree and then a geometric matching
stage. Views (called skeleton frames) are constructed from the output of visual
odometry, which requires a frame rate sufficient for tracking. We require only
monocular imagery, constructing structured appearance models from two matched views
of the same scene. While Konolige et al. use randomized tree signatures for feature
matching, we use a simple variant on SIFT features and local and global feature
databases.
2.2.2 Graph-Based SLAM
Storing observations and poses in a constraint graph is now a well-explored technique
for localization and mapping. The graph formulation provides a straightforward and
flexible representation of the underlying Gaussian Markov random field (GMRF)
problem that SLAM attempts to solve. The general framework is described in [18],
including a description of a graph relaxation procedure identical to batch bundle
adjustment in photogrammetry [19]. Relaxation algorithms for SLAM graphs have
received much attention, especially with online operation in mind. Olson et al. [16]
suggest a stochastic gradient descent method, and Grisetti et al. [7] review that and
related methods for incremental graph optimization.
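
To make the notion of graph relaxation concrete, the sketch below minimizes the weighted squared error of relative constraints in a one-dimensional pose graph using plain batch gradient descent. This is a deliberately simplified stand-in, not the stochastic method of Olson et al. or any of the incremental optimizers cited above; the function name, the step size, and the example constraints are assumptions.

```python
def relax_pose_graph(num_poses, constraints, iterations=2000, step=0.1):
    """Minimize sum of w * ((x[j] - x[i]) - z)**2 over 1D poses x.

    constraints: list of (i, j, z, w), where z is the measured offset from
    pose i to pose j and w is its weight (inverse variance).
    Pose 0 is held at the origin to fix the otherwise free global offset.
    """
    x = [0.0] * num_poses
    for _ in range(iterations):
        grad = [0.0] * num_poses
        for i, j, z, w in constraints:
            residual = (x[j] - x[i]) - z
            grad[j] += 2.0 * w * residual
            grad[i] -= 2.0 * w * residual
        for k in range(1, num_poses):  # skip the anchored pose 0
            x[k] -= step * grad[k]
    return x


# Hypothetical example: three odometry steps of 1.0 each, plus one
# loop-closure-style constraint saying pose 3 is only 2.7 from pose 0.
constraints = [
    (0, 1, 1.0, 1.0),
    (1, 2, 1.0, 1.0),
    (2, 3, 1.0, 1.0),
    (0, 3, 2.7, 1.0),
]
poses = relax_pose_graph(4, constraints)
```

With equal weights, the disagreement of 0.3 between the odometry chain and the long-range constraint is spread evenly, so each relaxed step shortens to 0.925. Real SLAM graphs have nonlinear constraints over 2D or 3D poses, which is why the methods cited above are more involved.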
The system of Eade and Drummond [3] forms a graph where each node is a joint
distribution over a local map, and the relative nonlinear constraints between nodes
are derived from shared features. The graph is relaxed by imposing cycle constraints
using preconditioned gradient descent. The network constructed by PTAM is effectively
a graph of relative constraints between keyframes, though the optimization,
performed asynchronously to the primary tracking task, acts on individual structure
elements.
The view-based mapping of Konolige et al. [12] constructs a reduced graph of
poses by consolidating consecutive frames tracked by visual odometry into skeleton
frames. Then the constraint graph over skeleton frames is incrementally relaxed using
the Toro method [8].
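
The idea of consolidating consecutive frames into a reduced graph can be sketched as follows for one-dimensional odometry. The criterion used here (start a new skeleton frame once the accumulated motion reaches a threshold) is a hypothetical choice for illustration only, as are the function and parameter names; the criteria actually used by Konolige et al. are not described in this section.

```python
def consolidate(odometry, min_travel=1.0):
    """Merge consecutive 1D odometry increments into skeleton-frame edges.

    odometry[k-1] is the measured motion from frame k-1 to frame k.
    A new skeleton frame is created once the motion accumulated since the
    previous one reaches min_travel (an assumed criterion).
    Returns a list of (from_frame, to_frame, accumulated_motion) edges.
    """
    edges = []
    last_frame = 0
    accumulated = 0.0
    for k, delta in enumerate(odometry, start=1):
        accumulated += delta
        if abs(accumulated) >= min_travel:
            edges.append((last_frame, k, accumulated))
            last_frame = k
            accumulated = 0.0
    return edges


# Six tracked frames collapse into two skeleton-frame constraints;
# the final increment stays pending until enough motion accumulates.
skeleton_edges = consolidate([0.25, 0.25, 0.5, 0.5, 0.75, 0.25])
```

The reduced graph has far fewer nodes than the raw frame sequence, which is what makes incremental relaxation over skeleton frames tractable.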
While existing graph-based SLAM methods employ incremental graph optimizers
to allow online operation, the number of poses in the graph continues to grow with
time. One technique suggested for bounding this growth is to occasionally “kidnap”
the robot virtually, disconnecting its current pose in the graph from previous
poses and re-inserting it using only recent observations [8]. This assumes both that
the recent observations are sufficiently accurate to allow relocalization, and that the
effective uncertainty of these observations is zero. These assumptions are routinely