(n) into the 3D scene associated with frame (n + 1) is computed (see Figure 1). This rigid transform represents the 3D motion of the camera between frame (n) and frame (n + 1). Finally, moving objects are detected by computing the difference between the 3D coordinates of points represented in the same coordinate system. Before going into the details of the stages of the proposed approach, a brief description of the stereo vision system used is given.
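For illustration only, the following sketch shows this differencing step, assuming the rigid transform (R, t) between the two frames and per-point 3D correspondences are already available; the threshold tau is a hypothetical parameter, not a value from the paper.

```python
import numpy as np

def detect_moving_points(P_n, P_n1, R, t, tau=0.3):
    """Flag points as moving by mapping the 3D points of frame n into the
    coordinate system of frame n + 1 and differencing against the points
    measured there.

    P_n, P_n1 : (N, 3) arrays of corresponding 3D points, in meters
    R, t      : rigid transform (rotation, translation) from frame n to n + 1
    tau       : residual threshold in meters (hypothetical value)
    """
    # Represent the points of frame n in the reference system of frame n + 1.
    P_mapped = P_n @ R.T + t
    # Static scene points should coincide; large residuals indicate motion.
    residuals = np.linalg.norm(P_n1 - P_mapped, axis=1)
    return residuals > tau
```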
3.1 System Setup
A commercial stereo vision system (Bumblebee from Point Grey¹) is used to acquire the 3D information of the scene in front of the host vehicle. It consists of two Sony ICX084 Bayer-pattern CCDs with 6 mm focal length lenses. Bumblebee is a pre-calibrated system that does not require in-field calibration. The baseline of the stereo head is 12 cm, and it is connected to the computer by an IEEE-1394 interface. Right and left color images (Bayer pattern) were captured at a resolution of 640 × 480 pixels. After capturing each right-left pair of images, a dense cloud of 3D data points P_n is computed by 3D reconstruction software at each frame n. The right intensity image I_n is used during the feature point detection and tracking stage.
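The Bumblebee system performs this reconstruction with its own proprietary software; purely as a hedged stand-in, the sketch below shows an equivalent dense reconstruction on an already rectified pair using OpenCV, where the matcher parameters and the disparity-to-depth matrix Q are assumptions rather than the commercial system's actual settings.

```python
import cv2
import numpy as np

def reconstruct_3d(left_gray, right_gray, Q):
    """Compute a dense cloud of 3D points from a rectified stereo pair.

    left_gray, right_gray : rectified 8-bit grayscale images (e.g. 640x480)
    Q : 4x4 disparity-to-depth matrix from the stereo calibration (assumed given)
    """
    # Semi-global block matching; parameter values are illustrative only.
    matcher = cv2.StereoSGBM_create(
        minDisparity=0,
        numDisparities=64,   # must be divisible by 16
        blockSize=9,
    )
    # StereoSGBM returns fixed-point disparities scaled by 16.
    disparity = matcher.compute(left_gray, right_gray).astype(np.float32) / 16.0
    # Reproject each pixel to (x, y, z) camera coordinates.
    points_3d = cv2.reprojectImageTo3D(disparity, Q)
    valid = disparity > 0
    return points_3d, valid
```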
3.2 Feature Detection and Tracking
As previously mentioned, the proposed approach is intended for use in on-board vision systems for driver assistance applications. Hence, due to real-time constraints, the whole cloud of points cannot be used to find the rigid transformation that maps two consecutive frames to the same reference system. To tackle this problem, an efficient approach is proposed that relies only on a reduced set of points from the given image I_n. Feature points f_i(u, v) ∈ I_n that lie far away from the camera position (P_i(x, y, z) > δ) are discarded in order to increase registration accuracy² (δ = 15 m in the current implementation).
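A minimal sketch of this rejection test, assuming the features' 3D positions have already been recovered from the stereo data and that the Euclidean distance from the camera is used as the depth criterion:

```python
import numpy as np

DELTA = 15.0  # rejection distance in meters, as in the current implementation

def keep_near_features(features_uv, points_3d):
    """Discard feature points whose 3D position lies beyond DELTA,
    since stereo depth uncertainty grows quadratically with distance."""
    dist = np.linalg.norm(points_3d, axis=1)  # distance from the camera
    mask = dist < DELTA
    return features_uv[mask], points_3d[mask]
```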
The proposed approach does not depend on the technique used for detecting feature points; in fact, two different approaches have been tested: one based on Harris corner points [10] and another on SIFT features [16]. In the first case, once feature points have been selected, a tracking window W_T of 9 × 9 pixels is set. Feature points are tracked by minimizing the sum of squared differences between two consecutive frames using an iterative approach [17]. In the second case, SIFT features [16] are detected at the extrema of differences of Gaussians in a scale-space representation and described as histograms of gradient orientations. In this case, following [16], a function based on the distance between the corresponding histograms is used to match features across consecutive frames (the public SIFT implementation of [29] has been used).
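Both variants map naturally onto widely available building blocks. The hedged sketch below is not the authors' code, but shows the two cases with OpenCV: Harris-based corner selection tracked by a pyramidal Lucas-Kanade minimizer with a 9 × 9 window for the first, and SIFT detection with nearest-neighbor descriptor matching and a ratio test for the second (all numeric parameters are illustrative assumptions, not the values of [10], [16], or [29]).

```python
import cv2
import numpy as np

def track_harris(prev_gray, next_gray):
    """Case 1: Harris corners tracked with an iterative SSD minimizer."""
    corners = cv2.goodFeaturesToTrack(
        prev_gray, maxCorners=500, qualityLevel=0.01,
        minDistance=10, useHarrisDetector=True)
    # Pyramidal Lucas-Kanade with a 9x9 tracking window W_T.
    tracked, status, _ = cv2.calcOpticalFlowPyrLK(
        prev_gray, next_gray, corners, None, winSize=(9, 9))
    ok = status.ravel() == 1
    return corners[ok], tracked[ok]

def match_sift(prev_gray, next_gray):
    """Case 2: SIFT descriptors matched by histogram (L2) distance."""
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(prev_gray, None)
    kp2, des2 = sift.detectAndCompute(next_gray, None)
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    # Ratio test rejects ambiguous matches between consecutive frames.
    matches = matcher.knnMatch(des1, des2, k=2)
    return [m for m, n in matches if m.distance < 0.8 * n.distance]
```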
¹ www.ptgrey.com
² Stereo head data uncertainty grows quadratically with depth [19].