needs to run at real-time frame rates, and with the “always on, always augmented” use case the power consumption of the mobile device becomes a major challenge, since the battery may drain within approximately one hour. Next, we look at some of the commonly used AR technologies.
Marker-based Tracking: To obtain the camera pose in real time, marker-based techniques are used. One example of marker-based tracking is the IKEA app from Metaio, which uses tracking and monocular simultaneous localization and mapping (SLAM) algorithms; here the marker is used to obtain the scale of the room. Markers are easily detected in the image due to their distinctive color and pattern. The high-contrast black-and-white square block pattern, together with four known marker points, allows accurate calculation of the camera pose. The drawback is that the marker must always remain visible in the camera's field of view, and detection is susceptible to illumination variation.
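As an illustration of how four known marker corners yield a camera pose, the sketch below uses OpenCV's solvePnP with a square marker of known side length and a previously calibrated camera. The marker size, function names, and the use of OpenCV itself are assumptions for illustration, not details taken from the text.

```python
import numpy as np
import cv2

# Hypothetical 3D corner coordinates of a square marker with 80 mm sides,
# expressed in the marker's own frame (Z = 0 plane). The ordering matches
# what cv2.SOLVEPNP_IPPE_SQUARE expects.
MARKER_SIZE = 0.08  # metres (assumed)
object_points = np.array([
    [-MARKER_SIZE / 2,  MARKER_SIZE / 2, 0.0],
    [ MARKER_SIZE / 2,  MARKER_SIZE / 2, 0.0],
    [ MARKER_SIZE / 2, -MARKER_SIZE / 2, 0.0],
    [-MARKER_SIZE / 2, -MARKER_SIZE / 2, 0.0],
], dtype=np.float32)

def estimate_marker_pose(image_corners, camera_matrix, dist_coeffs):
    """Recover the camera pose from the four detected marker corners.

    image_corners : (4, 2) array of pixel coordinates, same order as object_points.
    camera_matrix : 3x3 intrinsic matrix from a prior calibration.
    dist_coeffs   : lens distortion coefficients from the same calibration.
    """
    ok, rvec, tvec = cv2.solvePnP(
        object_points, image_corners.astype(np.float32),
        camera_matrix, dist_coeffs,
        flags=cv2.SOLVEPNP_IPPE_SQUARE,  # solver specialised for planar squares
    )
    if not ok:
        return None
    rotation, _ = cv2.Rodrigues(rvec)   # rotation vector -> 3x3 matrix
    return rotation, tvec               # camera pose relative to the marker
```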
Marker-less Tracking: The typical “marker-less” pipeline takes a video frame, extracts features such as corners, describes them in a descriptor vector, and matches them against a database of previously recorded reference object descriptors. Once objects are detected, they are tracked frame by frame. The key to a robust, accurate, and fast 3D feature-tracking pipeline is finding the right balance between the number of features, pyramid scaling, and recording the 'right' information in the descriptors. This task requires a lot of experience, real-life expertise, and validation; thus, not many really good 3D feature trackers are available on the market.
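A minimal sketch of the detect-describe-match stage just described is given below, using ORB features and brute-force Hamming matching via OpenCV. The choice of ORB, the feature budget, and the helper names are assumptions, since the text does not prescribe a particular detector or descriptor.

```python
import cv2

# Detector/descriptor and matcher for the detect-describe-match stage.
orb = cv2.ORB_create(nfeatures=2000)   # roughly 1,000-2,000 points per object
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

def build_reference_database(reference_image):
    """Detect and describe features of the reference object once, offline."""
    keypoints, descriptors = orb.detectAndCompute(reference_image, None)
    return keypoints, descriptors

def match_frame(frame, reference_descriptors):
    """Match features of the current video frame against the reference set."""
    keypoints, descriptors = orb.detectAndCompute(frame, None)
    if descriptors is None:
        return keypoints, []
    matches = matcher.match(reference_descriptors, descriptors)
    # Keep only the most consistent correspondences for later pose estimation.
    matches = sorted(matches, key=lambda m: m.distance)[:200]
    return keypoints, matches
```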
The amount of detected feature points depends on the size and complexity of the
object or environment to be detected and tracked. Typically, for a single 3D object,
the algorithm has to deal with 1,000-2,000 feature points, for small rooms about
5,000 features, and for outdoor scenarios 10,000-20,000 features. These reference
features have to be matched against all newly detected features in every frame at 30 fps, resulting in more than 200 GOPS for the detection or initialization phase, whereas the tracking phase is less demanding. SLAM, on the other hand, localizes the camera in the mapped environment and estimates the camera pose relative to that map; the better we can map the environment, the more precise the camera pose becomes, and vice versa. Common feature detectors extract corners, blobs, patches, and edges, but only a few, such as FAST [16], are suitable for embedded real-time processing. In some scenarios, dense tracking is needed to compute structure from motion, and Lucas-Kanade feature tracking is widely used. Feature matching is performed from frame to frame or from key frame to frame using template-based matching or feature descriptors. These algorithms require a large amount of computation and memory bandwidth, which has a direct impact on the power requirement.
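For the tracking phase, the sketch below combines FAST corner detection with pyramidal Lucas-Kanade optical flow, both mentioned above, again using OpenCV. The threshold, window size, and pyramid depth are assumed values, not figures from the text.

```python
import numpy as np
import cv2

# FAST detector and pyramidal Lucas-Kanade parameters (assumed values).
fast = cv2.FastFeatureDetector_create(threshold=25)
lk_params = dict(winSize=(21, 21), maxLevel=3,
                 criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 30, 0.01))

def detect_points(gray):
    """Detect FAST corners and convert them to the point format LK expects."""
    keypoints = fast.detect(gray, None)
    return np.array([kp.pt for kp in keypoints], dtype=np.float32).reshape(-1, 1, 2)

def track_points(prev_gray, curr_gray, prev_points):
    """Track points from the previous frame into the current frame."""
    curr_points, status, _err = cv2.calcOpticalFlowPyrLK(
        prev_gray, curr_gray, prev_points, None, **lk_params)
    good = status.reshape(-1).astype(bool)   # keep only successfully tracked points
    return prev_points[good], curr_points[good]
```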
Edge-based Tracking: Currently, tracking and mapping approaches based on distinctive feature points are the most common algorithms employed for AR purposes. Usually, 2D points in images are selected and represented using the standard computer vision detector-descriptor scheme. Positive features of point-based tracking approaches are the following:
high level of invariance to rotation and translation;