2.3.4 Object Detection and Tracking in Point Clouds
This section describes the approach to the detection and tracking of objects in three-dimensional point clouds suggested by Schmidt et al. (2007). The presentation is adopted from that work. The method relies on a motion-attributed point cloud obtained with the spacetime stereo approach described in Sect. 1.5.2.5. In a subsequent step, motion-attributed clusters are formed, which are then used for generating and tracking object hypotheses.
2.3.4.1 Motion-Attributed Point Cloud
A three-dimensional representation of the scene is generated with the correlation-based stereo vision algorithm by Franke and Joos (2000) and with the spacetime stereo algorithm described by Schmidt et al. (2007) (cf. Sect. 1.5.2.5). Both stereo techniques generate three-dimensional points based on edges in the image, especially object boundaries. Due to their local approach, they are independent of the object appearance. While correlation-based stereo has the advantage of higher spatial accuracy and is capable of generating more point correspondences, spacetime stereo provides a velocity value for each stereo point. However, it generates a smaller number of points and is spatially less accurate, since not all edges are necessarily well described by the model defined in (1.118). Taking into account these properties of the algorithms, the results are merged into a single motion-attributed three-dimensional point cloud.
dimensional point cloud. For each extracted three-dimensional point c k an average
velocity
( 1 ,...,J) in an el-
lipsoid neighbourhood defined by δ S (s j ,c k )< 1 around c k . To take into account
the spatial uncertainty in depth direction of the spacetime data, δ S (s j ,c k ) defines a
Mahalanobis distance whose correlation matrix Σ contains an entry Σ z =
v(c k ) is calculated, using all spacetime points s j , j
¯
1forthe
depth coordinate which can be derived from the recorded data, leading to
J
ρ
J
v(c k ) =
v(s j )
s j : δ S (s j ,c k )< 1 .
(2.36)
j
=
1
The factor $\rho$ denotes the relative scaling of the velocities with respect to the spatial coordinates. It is adapted empirically depending on the speed of the observed objects. This results in a four-dimensional point cloud, where each three-dimensional point is attributed with an additional one-dimensional velocity component parallel to the epipolar lines; see Fig. 2.20d.
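As a concrete illustration of this merging step, the following sketch computes the motion-attributed cloud according to (2.36). It assumes the correlation-stereo points, the spacetime points, and their epipolar velocities are available as NumPy arrays; the function name, the array layout, and the numerical value of the depth covariance entry are assumptions made for this example, not part of the original method.

```python
# Minimal sketch of the velocity-averaging step of Eq. (2.36).
# Array names, the sigma_z value, and the default rho are illustrative assumptions.
import numpy as np

def motion_attributed_cloud(points, st_points, st_velocities,
                            sigma_z=4.0, rho=1.0):
    """Attach an averaged velocity to each three-dimensional point.

    points        : (N, 3) correlation-stereo points c_k
    st_points     : (M, 3) spacetime-stereo points s_j
    st_velocities : (M,)   velocities v(s_j) parallel to the epipolar lines
    sigma_z       : covariance entry for the depth coordinate (assumed value)
    rho           : relative scaling of velocities w.r.t. spatial coordinates
    """
    # Diagonal covariance: unit entries for x and y, a larger entry for z,
    # so the unit Mahalanobis ball becomes an ellipsoid elongated in depth.
    inv_cov = np.diag([1.0, 1.0, 1.0 / sigma_z])

    velocities = np.zeros(len(points))
    for k, c in enumerate(points):
        d = st_points - c                          # offsets s_j - c_k
        maha_sq = np.einsum('ij,jl,il->i', d, inv_cov, d)
        inside = maha_sq < 1.0                     # delta_S(s_j, c_k) < 1
        if np.any(inside):
            # Eq. (2.36), interpreting J as the number of points inside
            # the ellipsoid neighbourhood: scaled mean of their velocities.
            velocities[k] = rho * st_velocities[inside].mean()
    # Four-dimensional cloud: (x, y, z, v)
    return np.hstack([points, velocities[:, None]])
```

The enlarged diagonal entry for the depth coordinate stretches the unit Mahalanobis ball into an ellipsoid elongated along the viewing direction, which is the intended effect of the Mahalanobis formulation described above.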
A reference image of the observed scene is used to reduce the amount of data
to be processed by masking out three-dimensional points that emerge from static
parts of the scene, as shown in Figs. 2.20a and b. Furthermore, only points within a
given interval above the ground plane are used, as we intend to localise objects and
humans and thus always assume a maximum height for objects above the ground.
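This data-reduction step can be sketched as follows, again only as an illustration: the background mask derived from the reference image, the per-point pixel coordinates, and the height limits are assumed inputs, and the height above the ground plane is taken here to be the z coordinate.

```python
# Illustrative sketch of the masking and height-filtering step.
# The mask source, pixel coordinates, and height limits are assumptions.
import numpy as np

def filter_cloud(cloud, static_mask, pixel_coords, z_min=0.1, z_max=2.0):
    """cloud        : (N, 4) motion-attributed points (x, y, z, v)
       static_mask  : (H, W) boolean image, True where the reference image
                      marks static parts of the scene
       pixel_coords : (N, 2) integer (row, col) image position of each point
       z_min, z_max : height interval above the ground plane in metres
    """
    rows, cols = pixel_coords[:, 0], pixel_coords[:, 1]
    dynamic = ~static_mask[rows, cols]             # drop points on static scene parts
    height = cloud[:, 2]                           # assume z is height above ground
    in_band = (height > z_min) & (height < z_max)  # plausible object heights
    return cloud[dynamic & in_band]
```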