Digital Signal Processing Reference
for each dictionary item. A Gaussian Process (GP) model is proposed for sparse representation to optimize the dictionary objective function. The sparse coding property allows a kernel with compact support in the GP to realize a very efficient dictionary learning process. Hence, a video of an activity can be described by a set of compact and discriminative action attributes.
Given the initial dictionary $B^o$, the objective is to compress it into a dictionary $B^*$ of size $k$ which encourages signals from the same class to have very similar sparse representations. Let $L$ denote the labels of $M$ discrete values, $L \in [1, M]$.
Given a set of dictionary atoms $B^*$, define

$$P(L \mid B^*) = \frac{1}{|B^*|} \sum_{b_i \in B^*} P(L \mid b_i).$$

For simplicity, denote $P(L \mid \{b^*\})$ as $P(L \mid b^*)$, and $P(L \mid \{B^*\})$ as $P(L \mid B^*)$. To enhance the discriminative power of the learned dictionary, the following objective function is considered:

$$\arg\max_{B^*} \; I(B^*; B^o \setminus B^*) + \lambda\, I(L_{B^*}; L_{B^o \setminus B^*}) \qquad (6.2)$$
\
where $\lambda \ge 0$ is the parameter that regulates the emphasis on appearance or label information, and $I$ denotes mutual information. One can approximate (6.2) as

$$\arg\max_{b^* \in B^o \setminus B^*} \; \left[ H(b^* \mid B^*) - H(b^* \mid B^o \setminus B^*) \right] + \lambda \left[ H(L_{b^*} \mid L_{B^*}) - H(L_{b^*} \mid L_{B^o \setminus B^*}) \right], \qquad (6.3)$$
where $H$ denotes entropy. Notice that the above formulation forces the classes associated with $b^*$ to be most different from the classes already covered by the selected atoms $B^*$; at the same time, the classes associated with $b^*$ are most representative among the classes covered by the remaining atoms. Thus the learned dictionary is not only compact, but also covers all classes, maintaining discriminability.
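The greedy selection in (6.3) can be sketched under a GP prior over atoms, where each conditional entropy of a Gaussian variable reduces (up to constants) to the log of its predictive variance. Below is a minimal sketch of the appearance term only, assuming an RBF kernel; the label ($\lambda$) term is omitted for brevity, and the function names, kernel choice, and jitter constants are our illustrative assumptions, not details from the original method:

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # RBF (squared-exponential) kernel between rows of X and rows of Y
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def gp_cond_var(K, i, S):
    # GP predictive variance of atom i conditioned on the atom set S
    if not S:
        return K[i, i]
    S = list(S)
    K_SS = K[np.ix_(S, S)] + 1e-8 * np.eye(len(S))  # jitter for stability
    k_iS = K[i, S]
    return K[i, i] - k_iS @ np.linalg.solve(K_SS, k_iS)

def greedy_compress(B, k, gamma=1.0):
    # B: (n_atoms, dim) rows of the initial dictionary B^o.
    # Greedily picks k atom indices maximizing a proxy for
    # H(b* | B*) - H(b* | B^o \ B*): uncertain given what is selected,
    # yet well predicted by the remaining atoms (i.e., representative).
    n = B.shape[0]
    K = rbf_kernel(B, B, gamma)
    selected, remaining = [], set(range(n))
    for _ in range(k):
        best, best_score = None, -np.inf
        for i in remaining:
            rest = list(remaining - {i})
            num = gp_cond_var(K, i, selected)
            den = gp_cond_var(K, i, rest)
            score = 0.5 * np.log(max(num, 1e-12) / max(den, 1e-12))
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
        remaining.discard(best)
    return selected
```

In a full implementation, the label term would add $\lambda\,[H(L_{b^*} \mid L_{B^*}) - H(L_{b^*} \mid L_{B^o \setminus B^*})]$ to each candidate's score, computed from the per-atom label distributions $P(L \mid b_i)$.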
In Figure 6.1, we present the recognition accuracy on the Keck gesture dataset with different dictionary sizes and over different global and local features [115]. A leave-one-person-out setup is used: all sequences performed by one person are left out for testing, and the average accuracy is reported. The initial dictionary size $|B^o|$ is chosen to be twice the dimension of the input signal, and a sparsity level of 10 is used in this set of experiments. As can be seen, the mutual information-based method, denoted MMI-2, outperforms the other methods.
Sparse representation over a dictionary with coherent atoms suffers from the multiple-representation problem. A compact dictionary consists of incoherent atoms and encourages similar signals, which are more likely to come from the same class, to be consistently described by a similar set of atoms with similar coefficients [115]. A discriminative dictionary encourages signals from different classes to be described either by different sets of atoms, or by the same set of atoms but with different coefficients [71, 82, 119]. Both aspects are critical for classification using sparse representation. The reconstructive requirement on a compact and discriminative dictionary enhances the robustness of the discriminant sparse representation [119].
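The coherence property discussed above is commonly quantified by the mutual coherence of a dictionary: the largest absolute inner product between distinct normalized atoms, with lower values indicating fewer multiple-representation ambiguities. A minimal sketch (the helper name is ours, not from the text; atoms are stored as columns):

```python
import numpy as np

def mutual_coherence(B):
    # B: (dim, n_atoms) dictionary with atoms as columns.
    # Returns max |<b_i, b_j>| over i != j after L2-normalizing each atom.
    Bn = B / np.linalg.norm(B, axis=0, keepdims=True)
    G = np.abs(Bn.T @ Bn)        # Gram matrix of normalized atoms
    np.fill_diagonal(G, 0.0)     # ignore self-similarity
    return G.max()
```

An orthonormal dictionary has mutual coherence 0, while a dictionary containing duplicate (fully coherent) atoms has mutual coherence 1.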