Digital Signal Processing Reference
for each dictionary item. A Gaussian Process (GP) model is proposed for sparse representation to optimize the dictionary objective function. The sparse coding property allows a kernel with compact support in the GP to realize a very efficient dictionary learning process. Hence, a video of an activity can be described by a set of compact and discriminative action attributes.
Given the initial dictionary $B^o$, the objective is to compress it into a dictionary $B^*$ of size $k$ which encourages signals from the same class to have very similar sparse representations. Let $L$ denote the labels of $M$ discrete values, $L \in [1, M]$.
Given a set of dictionary atoms $B^*$, define

$$P(L \mid B^*) = \frac{1}{|B^*|} \sum_{b_i \in B^*} P(L \mid b_i).$$

For simplicity, denote $P(L \mid \{b^*\})$ as $P(L \mid b^*)$, and $P(L \mid \{B^*\})$ as $P(L \mid B^*)$. To enhance the discriminative power of the learned dictionary, the following objective function is considered:

$$\arg\max_{B^*} \; I(B^*; B^o \setminus B^*) + \lambda\, I(L_{B^*}; L_{B^o \setminus B^*}) \qquad (6.2)$$
\
where $\lambda \ge 0$ is the parameter that regulates the emphasis on appearance or label information, and $I$ denotes mutual information. One can approximate (6.2) as

$$\arg\max_{b^* \in B^o \setminus B^*} \; \left[ H(b^* \mid B^*) - H(b^* \mid B^o \setminus B^*) \right] + \lambda \left[ H(L_{b^*} \mid L_{B^*}) - H(L_{b^*} \mid L_{B^o \setminus B^*}) \right], \qquad (6.3)$$
where $H$ denotes entropy. Notice that the above formulation forces the classes associated with $b^*$ to be most different from the classes already covered by the selected atoms $B^*$; at the same time, the classes associated with $b^*$ are most representative among the classes covered by the remaining atoms. Thus the learned dictionary is not only compact, but also covers all classes, maintaining discriminability.
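The greedy selection in (6.3) can be sketched under a GP prior over atoms, where each conditional entropy of a Gaussian variable reduces (up to constants) to the log of its predictive variance. Below is a minimal sketch of the appearance term only, assuming an RBF kernel; the label ($\lambda$) term is omitted for brevity, and the function names, kernel choice, and jitter constants are our illustrative assumptions, not details from the original method:

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # RBF (squared-exponential) kernel between rows of X and rows of Y
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def gp_cond_var(K, i, S):
    # GP predictive variance of atom i conditioned on the atom set S
    if not S:
        return K[i, i]
    S = list(S)
    K_SS = K[np.ix_(S, S)] + 1e-8 * np.eye(len(S))  # jitter for stability
    k_iS = K[i, S]
    return K[i, i] - k_iS @ np.linalg.solve(K_SS, k_iS)

def greedy_compress(B, k, gamma=1.0):
    # B: (n_atoms, dim) rows of the initial dictionary B^o.
    # Greedily picks k atom indices maximizing a proxy for
    # H(b* | B*) - H(b* | B^o \ B*): uncertain given what is selected,
    # yet well predicted by the remaining atoms (i.e., representative).
    n = B.shape[0]
    K = rbf_kernel(B, B, gamma)
    selected, remaining = [], set(range(n))
    for _ in range(k):
        best, best_score = None, -np.inf
        for i in remaining:
            rest = list(remaining - {i})
            num = gp_cond_var(K, i, selected)
            den = gp_cond_var(K, i, rest)
            score = 0.5 * np.log(max(num, 1e-12) / max(den, 1e-12))
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
        remaining.discard(best)
    return selected
```

In a full implementation, the label term would add $\lambda\,[H(L_{b^*} \mid L_{B^*}) - H(L_{b^*} \mid L_{B^o \setminus B^*})]$ to each candidate's score, computed from the per-atom label distributions $P(L \mid b_i)$.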
In Figure 6.1, we present the recognition accuracy on the Keck gesture dataset with different dictionary sizes and over different global and local features [115]. A leave-one-person-out setup is used: all sequences performed by one person are left out for testing, and the average accuracy is reported. The initial dictionary size $|B^o|$ is chosen to be twice the dimension of the input signal, and a sparsity level of 10 is used in this set of experiments. As can be seen, the mutual information-based method, denoted MMI-2, outperforms the other methods.
Sparse representation over a dictionary with coherent atoms suffers from the multiple-representation problem. A compact dictionary consists of incoherent atoms and encourages similar signals, which are more likely to come from the same class, to be consistently described by a similar set of atoms with similar coefficients [115]. A discriminative dictionary encourages signals from different classes to be described either by different sets of atoms, or by the same set of atoms but with different coefficients [71, 82, 119]. Both aspects are critical for classification using sparse representation. The reconstructive requirement on a compact and discriminative dictionary enhances the robustness of the discriminant sparse representation [119].
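The coherence property discussed above is commonly quantified by the mutual coherence of a dictionary: the largest absolute inner product between distinct normalized atoms, with lower values indicating fewer multiple-representation ambiguities. A minimal sketch (the helper name is ours, not from the text; atoms are stored as columns):

```python
import numpy as np

def mutual_coherence(B):
    # B: (dim, n_atoms) dictionary with atoms as columns.
    # Returns max |<b_i, b_j>| over i != j after L2-normalizing each atom.
    Bn = B / np.linalg.norm(B, axis=0, keepdims=True)
    G = np.abs(Bn.T @ Bn)        # Gram matrix of normalized atoms
    np.fill_diagonal(G, 0.0)     # ignore self-similarity
    return G.max()
```

An orthonormal dictionary has mutual coherence 0, while a dictionary containing duplicate (fully coherent) atoms has mutual coherence 1.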