Digital Signal Processing Reference
function to optimize, most ICA algorithms can be equivalently restated in a "natural gradient" form (Amari 1999; Amari and Cardoso 1997). In such a setting,
the demixing matrix $B$ is estimated iteratively: $B(t+1) = B(t) + \mu \nabla_B(B(t))$. The "natural gradient" $\nabla_B$ at $B$ is given by

$$\nabla_B(B) = \left[\mathbf{I} - \frac{1}{N}\,\mathcal{H}(\hat{S})\,\hat{S}^{T}\right] B, \qquad (9.8)$$
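As a concrete illustration, the iterative update around equation (9.8) can be sketched as follows. This is a minimal sketch rather than the chapter's implementation: the tanh score function, step size, iteration count, and near-identity initialization are assumptions of mine (tanh is a common generic choice when the sources are leptokurtic).

```python
import numpy as np

def natural_gradient_ica(Y, mu=0.02, n_iter=3000, seed=0):
    """Minimal natural-gradient ICA iteration in the spirit of equation (9.8).

    Y is the (N_s, N) matrix of observations (square mixing assumed).
    The score function is taken here to be tanh (an assumption; the text
    leaves the choice of the score function open).
    """
    rng = np.random.default_rng(seed)
    n_s, N = Y.shape
    # Initialize the demixing matrix close to the identity.
    B = np.eye(n_s) + 0.01 * rng.standard_normal((n_s, n_s))
    for _ in range(n_iter):
        S_hat = B @ Y              # current source estimates, S_hat = B Y
        H = np.tanh(S_hat)         # score function H(S_hat) (assumed tanh)
        # Natural gradient of eq. (9.8): [I - (1/N) H(S_hat) S_hat^T] B
        grad = (np.eye(n_s) - (H @ S_hat.T) / N) @ B
        B = B + mu * grad          # update: B(t+1) = B(t) + mu * grad
    return B
```

On a toy mixture of two Laplacian (leptokurtic) sources, the product of the estimated demixing matrix with the true mixing matrix is close to a scaled permutation, as expected from the scale and permutation indeterminacies of ICA.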
where $\mathcal{H}(\hat{S})$ in equation (9.8) is the so-called score function, which is closely related to the pdf of the sources (Cichocki and Amari 2002; Amari and Cardoso 1997). The matrix $\hat{S}$ is the estimate of $S$: $\hat{S} = BY$. Assuming that all the sources are generated from the same joint pdf $\mathrm{pdf}_S$, the entries of $\mathcal{H}(\hat{S})$ are the partial derivatives of the log-likelihood function:

$$\mathcal{H}(\hat{S})[i,l] = -\frac{\partial \log\left(\mathrm{pdf}_S(\hat{S})\right)}{\partial \hat{S}[i,l]}, \quad (i,l) \in \{1,\ldots,N_s\} \times \{1,\ldots,N\}. \qquad (9.9)$$
As expected, the way the demixing matrix (and thus the sources) is estimated
closely depends on the way the sources are modeled (from a statistical point
of view). For instance, separating platykurtic (negative-kurtosis) or leptokurtic (positive-kurtosis) sources will require completely different score functions. Even if ICA is shown by Amari and Cardoso (1997) to be quite robust to so-called mismodeling, the choice of the score function is crucial for the convergence (and rate of convergence) of ICA algorithms. Some ICA-based techniques (Koldovsky et al. 2006) focus on adapting the popular FastICA algorithm to adjust the score function to the distribution
of the sources. They particularly focus on modeling sources whose distributions
belong to specific parametric classes such as the generalized Gaussian distribution (GGD).
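To make concrete how the score function encodes the source model, consider a simplified, unit-scale generalized Gaussian density $p(s) \propto \exp(-|s|^{\beta})$ (the parameterization and the shape-parameter name $\beta$ are my choices for illustration, not the chapter's). Its score then has a simple closed form:

```python
import numpy as np

def ggd_score(s, beta):
    """Score -d/ds log p(s) for the unit-scale generalized Gaussian
    p(s) proportional to exp(-|s|**beta):
        beta = 2 -> Gaussian-like score 2*s,
        beta = 1 -> Laplacian score sign(s) (leptokurtic modeling),
        beta < 1 -> even sparser, more sharply peaked source models.
    """
    s = np.asarray(s, dtype=float)
    return beta * np.sign(s) * np.abs(s) ** (beta - 1.0)
```

Varying $\beta$ interpolates between sub- and super-Gaussian source models, which is why adapting it (as in the FastICA variants cited above) changes the separation behavior.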
Noisy ICA: Only a few works have investigated the problem of noisy ICA
(Davies 2004; Koldovsky and Tichavsky 2006). As pointed out by Davies (2004),
noise clearly degrades the ICA model: it is no longer fully identifiable. In the case of additive Gaussian noise, as stated in equation (9.2), using higher-order statistics yields an effective estimate of the mixing matrix $A = B^{-1}$ (higher-order cumulants are indeed blind to additive Gaussian noise; this property does not hold for non-Gaussian noise). But in the noisy ICA setting, applying the demixing matrix
to the data does not yield an effective estimate of the sources. Furthermore, most
ICA algorithms assume the mixing matrix A to be square. When there are more
observations than sources ($N_c > N_s$), a dimension reduction step is first applied.
When noise perturbs the data, this subspace projection step can dramatically
deteriorate the performance of the separation stage.
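The blindness of higher-order cumulants to additive Gaussian noise is easy to check numerically. The sketch below (sample size, source, and noise level are illustrative choices of mine) estimates the fourth-order cumulant $\kappa_4(x) = \mathrm{E}[x^4] - 3\,\mathrm{E}[x^2]^2$ of a Laplacian variable before and after adding Gaussian noise:

```python
import numpy as np

def cum4(x):
    """Empirical fourth-order cumulant kappa_4 = E[x^4] - 3 E[x^2]^2
    (for centered x). It vanishes for Gaussian data, which is the source
    of its 'blindness' to additive Gaussian noise."""
    x = x - x.mean()
    return np.mean(x ** 4) - 3.0 * np.mean(x ** 2) ** 2

rng = np.random.default_rng(0)
s = rng.laplace(size=200_000)                # Laplacian: kappa_4 = 12 for unit scale
noise = 0.5 * rng.standard_normal(200_000)   # Gaussian: kappa_4 = 0
# kappa_4 is additive over independent variables and the Gaussian term is zero,
# so cum4(s + noise) matches cum4(s) up to sampling error, while cum4(noise) ~ 0.
```

This is exactly why the mixing matrix remains estimable from higher-order statistics in Gaussian noise, even though the sources themselves are not recovered by simply applying the demixing matrix.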
In the following, we will introduce a new way of modeling the data so as to avoid
most of the aforementioned limitations of ICA.
9.2.5 Toward Sparsity
The seminal paper of Zibulevsky and Pearlmutter (2001) introduced sparsity as an
alternative to standard contrast functions in ICA. In their work, each source s i was