Digital Signal Processing Reference
4.1.4 Compressive Sensing for Multi-View Tracking
Another direct application of CS to a data-rich tracking problem is presented by
[116]. Specifically, a method for using multiple sensors to perform multi-view
tracking employing a coding scheme based on compressive sensing is developed.
Assuming that the observed data contains no background component (this could be
realized, e.g., by preprocessing using any of the background subtraction techniques
previously discussed), the method uses known information regarding the sensor
geometry to facilitate a common data encoding scheme based on CS. After data from
each camera is received at a central processing station, it is fused via CS decoding
and the resulting image or three-dimensional grid can be used for tracking.
The first case considered is one where all objects of interest exist in a known
ground plane. It is assumed that the geometric transformation between it and each
sensor plane is known. That is, if there are $C$ cameras, then the homographies
$\{H_j\}_{j=1}^{C}$ are known. The relationship between coordinates $(u, v)$ in the
$j$th image and the corresponding ground plane coordinates $(x, y)$ is determined
by $H_j$ as

\[
\begin{pmatrix} u \\ v \\ 1 \end{pmatrix} \sim H_j \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}, \tag{4.10}
\]

where the coordinates are written in accordance with their homogeneous
representation, so the relation holds up to a nonzero scale factor. Since $H_j$ can
vary widely across the set of cameras due to varying viewpoint, an encoding scheme
designed to achieve a common data representation is presented.
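The mapping in (4.10) can be illustrated with a short NumPy sketch. It is only a minimal illustration, not the implementation of [116]: the function name `ground_to_image`, the matrix `H_example`, and the sample points are hypothetical, and the sketch assumes every point projects with a nonzero third homogeneous coordinate.

```python
import numpy as np

def ground_to_image(H, ground_xy):
    """Map (N, 2) ground-plane points to rounded (N, 2) image coordinates (u, v)."""
    ground_xy = np.asarray(ground_xy, dtype=float)
    ones = np.ones((ground_xy.shape[0], 1))
    homog = np.hstack([ground_xy, ones])   # rows are [x, y, 1]
    proj = homog @ H.T                     # rows are H @ [x, y, 1]^T
    uv = proj[:, :2] / proj[:, 2:3]        # divide out the homogeneous scale
    return np.rint(uv).astype(int)         # round to the nearest pixel

# Hypothetical homography: scale by 2, then translate by (10, 5).
H_example = np.array([[2.0, 0.0, 10.0],
                      [0.0, 2.0, 5.0],
                      [0.0, 0.0, 1.0]])

pixels = ground_to_image(H_example, [[0.0, 0.0], [1.0, 2.0], [3.5, 0.3]])
# Any nonzero multiple of H describes the same mapping.
pixels_scaled = ground_to_image(3.0 * H_example, [[0.0, 0.0], [1.0, 2.0], [3.5, 0.3]])
```

Because the coordinates are homogeneous, multiplying `H_example` by a nonzero constant leaves the resulting pixel coordinates unchanged, which is why the division by the third component is needed.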
First, the ground plane is sampled, yielding a discrete set of coordinates
$\{(x_i, y_i)\}_{i=1}^{N}$. An occupancy vector, $x$, is defined over these
coordinates, where $x(i) = 1$ if foreground is present at the corresponding
coordinates and $x(i) = 0$ otherwise. For each camera's observed foreground image in
the set $\{I_j\}_{j=1}^{C}$, an occupancy vector $y_j$ is formed as
$y_j(i) = I_j(u_i, v_i)$, where $(u_i, v_i)$ are the (rounded) image plane
coordinates corresponding to $(x_i, y_i)$ obtained via (4.10). Thus,
$y_j = x + e_j$, where $e_j$ represents any error due to the coordinate rounding and
other noise. Figure 4.3 illustrates the physical configuration of the system.
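As a rough sketch of how a per-camera occupancy vector might be sampled from a foreground image, consider the following. The function name `occupancy_from_image` and the toy data are hypothetical. Two conventions are assumptions rather than details from the source: the image array is indexed as `foreground[v, u]` (row, then column), and grid points that project outside the image are treated as unoccupied.

```python
import numpy as np

def occupancy_from_image(foreground, pixels):
    """Sample a binary foreground image at N pixel locations.

    foreground: 2D array, assumed indexed as foreground[v, u].
    pixels:     (N, 2) integer array of (u, v) coordinates, one per grid point.
    Returns a length-N vector with entries foreground[v_i, u_i].
    """
    pixels = np.asarray(pixels, dtype=int)
    height, width = foreground.shape
    u, v = pixels[:, 0], pixels[:, 1]
    # Assumption: grid points that fall outside the image count as empty.
    inside = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    y = np.zeros(len(pixels), dtype=float)
    y[inside] = foreground[v[inside], u[inside]]
    return y

# Toy 4x5 foreground image with two foreground pixels.
foreground_example = np.zeros((4, 5))
foreground_example[1, 2] = 1.0   # v = 1, u = 2
foreground_example[3, 0] = 1.0   # v = 3, u = 0

pixels_example = np.array([[2, 1], [0, 0], [7, 1], [0, 3]])  # (u, v) pairs
y_example = occupancy_from_image(foreground_example, pixels_example)
```

In this toy case the third grid point lands outside the image and is therefore reported as empty, which is one simple way such out-of-view points could be handled.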
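Noting that $x$ is often sparse, the camera data $\{y_j\}_{j=1}^{C}$ is encoded using
compressive sensing. First, $C$ measurement matrices $\{\Phi_j\}_{j=1}^{C}$ of equal
dimension are formed according to a construction that affords them the RIP of
appropriate order for $x$. Next, the camera data is projected into the
lower-dimensional space by computing $\tilde{y}_j = \Phi_j y_j$, $j = 1, \ldots, C$.
This lower-dimensional data is transmitted to a central station, where it is ordered
into the following structure:

\[
\begin{pmatrix} \tilde{y}_1 \\ \vdots \\ \tilde{y}_C \end{pmatrix}
=
\begin{pmatrix} \Phi_1 \\ \vdots \\ \Phi_C \end{pmatrix} x
+
\begin{pmatrix} \tilde{e}_1 \\ \vdots \\ \tilde{e}_C \end{pmatrix}, \tag{4.11}
\]

where the tilde marks quantities after projection, so that
$\tilde{e}_j = \Phi_j e_j$ follows from $y_j = x + e_j$.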
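The projection step and the stacked structure in (4.11) can be checked numerically with a small sketch. Everything concrete here is an assumption made for illustration: the sizes `N`, `M`, and `C`, the occupied grid indices, the noise level, and the choice of scaled Gaussian random matrices (a standard construction that satisfies the RIP with high probability, though the source does not say which construction is used). The stacked error is taken to be the projected per-camera error, which is what results from projecting the sum of the occupancy vector and the per-camera error.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: grid points, measurements per camera, number of cameras.
N, M, C = 200, 40, 3

# Sparse ground-plane occupancy vector x with three occupied grid points.
x = np.zeros(N)
x[[17, 90, 151]] = 1.0

# Per-camera data: occupancy plus a small error term (toy noise model).
errors = [0.01 * rng.standard_normal(N) for _ in range(C)]
ys = [x + e for e in errors]

# Assumption: scaled Gaussian measurement matrices, one per camera, equal shape.
Phis = [rng.standard_normal((M, N)) / np.sqrt(M) for _ in range(C)]

# Each camera projects its data into the lower-dimensional space.
compressed = [Phi @ y for Phi, y in zip(Phis, ys)]

# The central station stacks the received data and the matrices.
y_stacked = np.concatenate(compressed)   # shape (C * M,)
Phi_stacked = np.vstack(Phis)            # shape (C * M, N)
e_stacked = np.concatenate([Phi @ e for Phi, e in zip(Phis, errors)])

# Consistency check of the stacked model: data = Phi * x + projected error.
assert np.allclose(y_stacked, Phi_stacked @ x + e_stacked)
```

Each camera sends only `M` numbers instead of `N`, while the central station sees a single linear system in the common unknown `x` with `C * M` rows, to which a sparse recovery method can then be applied.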