2.3.4 Object Detection and Tracking in Point Clouds
This section describes the approach to the detection and tracking of objects in three-dimensional point clouds suggested by Schmidt et al. (2007). The presentation is adopted from that work. The method relies on a motion-attributed point cloud obtained with the spacetime stereo approach described in Sect. 1.5.2.5. In a subsequent step, motion-attributed clusters are formed, which are then used for generating and tracking object hypotheses.
2.3.4.1 Motion-Attributed Point Cloud
A three-dimensional representation of the scene is generated with the correlation-based stereo vision algorithm by Franke and Joos (2000) and with the spacetime stereo algorithm described by Schmidt et al. (2007) (cf. Sect. 1.5.2.5). Both stereo techniques generate three-dimensional points based on edges in the image, especially object boundaries. Due to their local approach, they are independent of the object appearance. While correlation-based stereo has the advantage of higher spatial accuracy and is capable of generating more point correspondences, spacetime stereo provides a velocity value for each stereo point. However, it generates a smaller number of points and is spatially less accurate, since not all edges are necessarily well described by the model defined in (1.118). Taking into account these properties of the algorithms, the results are merged into a single motion-attributed three-dimensional point cloud. For each extracted three-dimensional point c_k, an average velocity v̄(c_k) is calculated, using all spacetime points s_j, j ∈ {1, ..., J}, in an ellipsoid neighbourhood defined by δ_S(s_j, c_k) < 1 around c_k. To take into account the spatial uncertainty in depth direction of the spacetime data, δ_S(s_j, c_k) defines a Mahalanobis distance whose correlation matrix Σ contains an entry Σ_z for the depth coordinate which can be derived from the recorded data, leading to

v̄(c_k) = (ρ/J) ∑_{j=1}^{J} v(s_j)   ∀ s_j : δ_S(s_j, c_k) < 1.   (2.36)
The factor ρ denotes the relative scaling of the velocities with respect to the spatial coordinates. It is adapted empirically depending on the speed of the observed objects. This results in a four-dimensional point cloud, where each three-dimensional point is attributed with an additional one-dimensional velocity component parallel to the epipolar lines; see Fig. 2.20d.
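The neighbourhood averaging of (2.36) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function and variable names are invented, and the correlation matrix entries are placeholder values standing in for the uncertainties derived from the recorded data.

```python
import numpy as np

def average_velocity(c_k, spacetime_pts, spacetime_vel, sigma, rho=1.0):
    """Average the velocities of all spacetime points s_j inside the
    Mahalanobis ellipsoid delta_S(s_j, c_k) < 1 around c_k, cf. (2.36).

    sigma: 3x3 correlation matrix with an enlarged depth (z) entry to
    reflect the spatial uncertainty of the spacetime stereo data.
    rho: relative scaling of velocities w.r.t. spatial coordinates.
    """
    sigma_inv = np.linalg.inv(sigma)
    diff = spacetime_pts - c_k                            # (N, 3) offsets
    d2 = np.einsum('ni,ij,nj->n', diff, sigma_inv, diff)  # squared Mahalanobis distances
    inside = d2 < 1.0                                     # delta_S(s_j, c_k) < 1
    if not inside.any():
        return 0.0                                        # no neighbours, no velocity estimate
    # (rho / J) * sum over the J points inside the ellipsoid
    return rho * spacetime_vel[inside].mean()

# Illustrative use: an enlarged z-entry widens the ellipsoid in depth,
# so points that are farther away in z can still contribute.
sigma = np.diag([1.0, 1.0, 4.0])   # placeholder uncertainty values
```

Enlarging Σ_z stretches the neighbourhood along the viewing direction, which is exactly where the spacetime points are least accurate, so depth noise does not exclude otherwise valid neighbours from the average.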
A reference image of the observed scene is used to reduce the amount of data to be processed by masking out three-dimensional points that emerge from static parts of the scene, as shown in Figs. 2.20a and b. Furthermore, only points within a given interval above the ground plane are used, as we intend to localise objects and humans and thus always assume a maximum height for objects above the ground.
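The two filtering steps above can be sketched as follows. The function name, the `project` callback, and the height thresholds are illustrative assumptions, not taken from the original work.

```python
import numpy as np

def filter_points(points, mask_static, project, z_min=0.1, z_max=2.0):
    """Keep only 3D points that (a) do not project onto static parts of
    the reference image and (b) lie within an assumed height interval
    above the ground plane.

    mask_static: boolean reference-image mask of static scene parts.
    project: maps a 3D point to integer pixel coordinates (u, v).
    """
    keep = []
    for p in points:
        u, v = project(p)
        if mask_static[v, u]:              # point stems from the static background
            continue
        if not (z_min <= p[2] <= z_max):   # outside the assumed height interval
            continue
        keep.append(p)
    return np.asarray(keep)
```

The mask test discards background structure cheaply before clustering, while the height interval encodes the prior that relevant objects and humans never exceed a maximum height above the ground plane.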