Consumer Robotics: A Platform for Embedding Computer Vision in Everyday Life - Advances in Embedded Computer Vision


SLAM graph complexity during operation, using variable elimination and constraint

pruning with heuristic schedules. These methods keep optimization and storage costs

commensurate with explored area rather than with time of exploration while causing

minimal loss in mapping and localization accuracy.

An instantiation of the approach is demonstrated on real datasets with planar

ground-truth reference. The system operates successfully even at frame rates below

2 Hz. Comparing the results with and without complexity reduction demonstrates

that the reduced graph yields similar localization accuracy at a small fraction of the

computational cost.

2.2 Related Work

2.2.1 View Recognition for SLAM

View recognition engines have proven attractive components for SLAM systems

because they permit robust and flexible loop closing. Instead of making

correspondences between individual features or measurements, visual or otherwise, view

recognition engines typically match constellations of features or entire images without

requiring feature tracking.

Williams et al. [20] rely on tracking for normal EKF SLAM operation, but use

view recognition to recover from failure. Several features are matched to the existing

map using appearance and structure constraints in order to reinitialize tracking.

The Parallel Tracking and Mapping (PTAM) [10] system also employs view

recognition for recovery from tracking failure. Instead of using feature-based methods for

identifying candidate views, the system performs image-to-image correlation using

heavily blurred, low-resolution versions of the reference and query images. A crude

pose estimate is deduced from the result of the inverse-compositional matching,

following which tracking resumes.
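
The candidate-selection step of this kind of recovery can be illustrated with a minimal sketch. It assumes grayscale images stored as lists of rows, uses a 3x3 box blur as a stand-in for heavier smoothing, and scores candidates with a zero-mean sum of squared differences. The function names (`downsample`, `box_blur`, `thumbnail`, `zssd`, `best_match`) and the `factor` parameter are hypothetical, not taken from PTAM, and the subsequent alignment and pose estimation are not shown.

```python
def downsample(img, factor):
    """Shrink a grayscale image (list of rows) by averaging factor x factor blocks."""
    h, w = len(img) // factor, len(img[0]) // factor
    return [[sum(img[y * factor + dy][x * factor + dx]
                 for dy in range(factor) for dx in range(factor)) / factor ** 2
             for x in range(w)]
            for y in range(h)]


def box_blur(img):
    """3x3 box blur with edge clamping (an assumed stand-in for a Gaussian blur)."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            vals = [img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                    for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            row.append(sum(vals) / 9.0)
        out.append(row)
    return out


def thumbnail(img, factor=4):
    """Blurred, low-resolution, zero-mean descriptor of an image, as a flat list."""
    small = box_blur(downsample(img, factor))
    flat = [v for row in small for v in row]
    mean = sum(flat) / len(flat)
    # Subtracting the mean makes the comparison insensitive to a brightness offset.
    return [v - mean for v in flat]


def zssd(a, b):
    """Sum of squared differences between two zero-mean descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_match(query, references, factor=4):
    """Index of the reference image whose thumbnail is closest to the query's."""
    q = thumbnail(query, factor)
    scores = [zssd(q, thumbnail(r, factor)) for r in references]
    return min(range(len(scores)), key=scores.__getitem__)
```

Because the comparison runs on tiny blurred thumbnails, it tolerates small viewpoint changes and costs little per candidate, which is what makes whole-image matching practical as a recovery step.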

Eade and Drummond [4] group subsets of features into local maps during

tracking-based SLAM. Correspondences are made between local maps to connect them or

to recover from tracking failures. The image-to-map matching first selects a subset

of local maps to consider using a bag-of-words ranking, and then performs local

matching to determine feature-to-feature correspondences. This two-step process is

common to many view recognition systems, often instantiated as a bag-of-words

prefilter followed by re-ranking using geometric constraints [17].
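
The two-step pattern can be sketched as follows. This is an illustrative toy rather than the method of any cited system: each view is assumed to be a list of `(word_id, x, y)` tuples, the prefilter ranks views by cosine similarity of visual-word histograms, and the geometric check is simplified to voting for a single image translation. The names `bow_similarity`, `geometric_score`, `recognize`, `bin_size`, and `shortlist_size` are hypothetical.

```python
from collections import Counter
from math import sqrt


def bow_similarity(words_a, words_b):
    """Cosine similarity between two visual-word histograms."""
    ha, hb = Counter(words_a), Counter(words_b)
    dot = sum(ha[w] * hb[w] for w in ha)
    norm = (sqrt(sum(c * c for c in ha.values()))
            * sqrt(sum(c * c for c in hb.values())))
    return dot / norm if norm else 0.0


def geometric_score(query, view, bin_size=8.0):
    """Count same-word matches that agree on one coarse image translation."""
    votes = Counter()
    for wq, xq, yq in query:
        for wv, xv, yv in view:
            if wq == wv:
                # Each putative match votes for a quantized (dx, dy) offset.
                key = (round((xv - xq) / bin_size), round((yv - yq) / bin_size))
                votes[key] += 1
    return max(votes.values(), default=0)


def recognize(query, database, shortlist_size=2):
    """Step 1: bag-of-words shortlist. Step 2: re-rank by geometric consistency."""
    q_words = [w for w, _, _ in query]
    ranked = sorted(
        database,
        key=lambda name: bow_similarity(q_words, [w for w, _, _ in database[name]]),
        reverse=True,
    )
    shortlist = ranked[:shortlist_size]
    return max(shortlist, key=lambda name: geometric_score(query, database[name]))
```

The prefilter ignores feature positions entirely, so two views containing the same words in different layouts score identically; only the second step separates them. Keeping the shortlist small confines the quadratic matching cost to a few candidates.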

The above approaches rely on tracking and use view recognition as an

out-of-band method for failure recovery. Our approach instead performs recognition at

every time step as the primary source of observations. The system of Karlsson et al.

[9] is similar, constructing landmarks out of constellations of SIFT [14] features

and employing nearest neighbors and a simple Hough transform as the recognition

algorithm. The system is further refined by Eade et al. [5] by replacing the

particle-filter back end with a graph SLAM back end that is described in further detail in
