(n) into the 3D scene associated with frame (n + 1) is computed (see Figure 1). This rigid transform represents the 3D motion of the camera between frame (n) and frame (n + 1). Finally, moving objects are detected by computing the difference between the 3D coordinates of points represented in the same coordinate system. Before going into the details of the stages of the proposed approach, a brief description of the stereo vision system used is given.
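For illustration only, the following sketch shows this differencing step, assuming the rigid transform (R, t) between the two frames and per-point 3D correspondences are already available; the threshold tau is a hypothetical parameter, not a value from the paper.

```python
import numpy as np

def detect_moving_points(P_n, P_n1, R, t, tau=0.3):
    """Flag points as moving by mapping the 3D points of frame n into the
    coordinate system of frame n + 1 and differencing against the points
    measured there.

    P_n, P_n1 : (N, 3) arrays of corresponding 3D points, in meters
    R, t      : rigid transform (rotation, translation) from frame n to n + 1
    tau       : residual threshold in meters (hypothetical value)
    """
    # Represent the points of frame n in the reference system of frame n + 1.
    P_mapped = P_n @ R.T + t
    # Static scene points should coincide; large residuals indicate motion.
    residuals = np.linalg.norm(P_n1 - P_mapped, axis=1)
    return residuals > tau
```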
3.1 System Setup
A commercial stereo vision system (Bumblebee from Point Grey¹) is used to acquire the 3D information of the scene in front of the host vehicle. It consists of two Sony ICX084 Bayer-pattern CCDs with 6 mm focal length lenses. Bumblebee is a pre-calibrated system that does not require in-field calibration. The baseline of the stereo head is 12 cm, and it is connected to the computer by an IEEE-1394 interface. Right and left color images (Bayer pattern) were captured at a resolution of 640 × 480 pixels. After capturing each right-left pair of images, a dense cloud of 3D data points P_n is computed by 3D reconstruction software at each frame n. The right intensity image I_n is used during the feature point detection and tracking stage.
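The Bumblebee system performs this reconstruction with its own proprietary software; purely as a hedged stand-in, the sketch below shows an equivalent dense reconstruction on an already rectified pair using OpenCV, where the matcher parameters and the disparity-to-depth matrix Q are assumptions rather than the commercial system's actual settings.

```python
import cv2
import numpy as np

def reconstruct_3d(left_gray, right_gray, Q):
    """Compute a dense cloud of 3D points from a rectified stereo pair.

    left_gray, right_gray : rectified 8-bit grayscale images (e.g. 640x480)
    Q : 4x4 disparity-to-depth matrix from the stereo calibration (assumed given)
    """
    # Semi-global block matching; parameter values are illustrative only.
    matcher = cv2.StereoSGBM_create(
        minDisparity=0,
        numDisparities=64,   # must be divisible by 16
        blockSize=9,
    )
    # StereoSGBM returns fixed-point disparities scaled by 16.
    disparity = matcher.compute(left_gray, right_gray).astype(np.float32) / 16.0
    # Reproject each pixel to (x, y, z) camera coordinates.
    points_3d = cv2.reprojectImageTo3D(disparity, Q)
    valid = disparity > 0
    return points_3d, valid
```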
3.2 Feature Detection and Tracking
As previously mentioned, the proposed approach is intended for use in on-board vision systems for driver assistance applications. Hence, due to real-time constraints, the whole cloud of points cannot be used to find the rigid transformation that maps two consecutive frames to the same reference system. To tackle this problem, an efficient approach is proposed that relies only on a reduced set of points from the given image I_n. Feature points f_i(u, v) ∈ I_n that lie far away from the camera position (P_i(x, y, z) > δ) are discarded in order to increase registration accuracy² (δ = 15 m in the current implementation).
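A minimal sketch of this rejection test, assuming the features' 3D positions have already been recovered from the stereo data and that the Euclidean distance from the camera is used as the depth criterion:

```python
import numpy as np

DELTA = 15.0  # rejection distance in meters, as in the current implementation

def keep_near_features(features_uv, points_3d):
    """Discard feature points whose 3D position lies beyond DELTA,
    since stereo depth uncertainty grows quadratically with distance."""
    dist = np.linalg.norm(points_3d, axis=1)  # distance from the camera
    mask = dist < DELTA
    return features_uv[mask], points_3d[mask]
```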
The proposed approach does not depend on the technique used for detecting feature points; in fact, two different approaches have been tested: one based on Harris corner points [10] and another on SIFT features [16]. In the first case, once feature points have been selected, a tracking window W_T of 9 × 9 pixels is set. Feature points are tracked by minimizing the sum of squared differences between two consecutive frames using an iterative approach [17]. In the second case, SIFT features [16] are detected at the extrema of differences of Gaussians in a scale-space representation and described as histograms of gradient orientations. In this case, following [16], a function based on the distance between the corresponding histograms is used to match features across consecutive frames (the public SIFT implementation of [29] has been used).
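Both variants map naturally onto widely available building blocks. The hedged sketch below is not the authors' code, but shows the two cases with OpenCV: Harris-based corner selection tracked by a pyramidal Lucas-Kanade minimizer with a 9 × 9 window for the first, and SIFT detection with nearest-neighbor descriptor matching and a ratio test for the second (all numeric parameters are illustrative assumptions, not the values of [10], [16], or [29]).

```python
import cv2
import numpy as np

def track_harris(prev_gray, next_gray):
    """Case 1: Harris corners tracked with an iterative SSD minimizer."""
    corners = cv2.goodFeaturesToTrack(
        prev_gray, maxCorners=500, qualityLevel=0.01,
        minDistance=10, useHarrisDetector=True)
    # Pyramidal Lucas-Kanade with a 9x9 tracking window W_T.
    tracked, status, _ = cv2.calcOpticalFlowPyrLK(
        prev_gray, next_gray, corners, None, winSize=(9, 9))
    ok = status.ravel() == 1
    return corners[ok], tracked[ok]

def match_sift(prev_gray, next_gray):
    """Case 2: SIFT descriptors matched by histogram (L2) distance."""
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(prev_gray, None)
    kp2, des2 = sift.detectAndCompute(next_gray, None)
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    # Ratio test rejects ambiguous matches between consecutive frames.
    matches = matcher.knnMatch(des1, des2, k=2)
    return [m for m, n in matches if m.distance < 0.8 * n.distance]
```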
¹ www.ptgrey.com
² Stereo head data uncertainty grows quadratically with depth [19].