understand the structure of this $28^2 = 784$-dimensional space given by the samples $x(1), \dots, x(1000)$. To this end, we compute a dimension reduction onto its first few principal components.
We calculate the $784 \times 784$-dimensional covariance matrix and plot the eigenvalues in decreasing order (figure 3.6(b)). No clear cutoff can be determined from the eigenvalue distribution. However, by choosing only the first two eigenvalues ($0.25\%$ of all eigenvalues), we already capture $22.6\%$ of the total eigenvalue sum:

$$\frac{d_{11} + d_{22}}{\sum_{i=1}^{784} d_{ii}} \approx 0.226.$$
And indeed, the first two eigenvalues are already sufficient to distinguish between the general shapes of the digits 2 and 4, as can be seen in figure 3.6(c), where the 4s have a significantly lower second PC than the 2s.
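As an illustration, the following Python sketch carries out these steps with NumPy. The data file digits.npy and its layout (1000 flattened $28 \times 28$ images, one per row) are assumptions made for this example, not part of the text.

import numpy as np

# Hypothetical input: 1000 flattened 28x28 digit images, one per row.
X = np.load("digits.npy")                # assumed shape (1000, 784)

# Center the samples and form the 784 x 784 covariance matrix.
Xc = X - X.mean(axis=0)
C = np.cov(Xc, rowvar=False)

# eigh returns eigenvalues in ascending order; flip both eigenvalues
# and eigenvectors to get the decreasing order of figure 3.6(b).
d, E = np.linalg.eigh(C)
d, E = d[::-1], E[:, ::-1]

# Fraction of the total eigenvalue sum captured by the first two PCs;
# for the data set in the text this comes out near 0.226.
print((d[0] + d[1]) / d.sum())

# Scores on the first two principal components, as in figure 3.6(c).
Y = Xc @ E[:, :2]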
From the previous analysis, we can deduce that the first few PCs already capture important information about the data. This implies that we might be able to represent our data set using only the first few PCs, which yields a compression method. In figure 3.7, we show the
truncated PCA expansion

$$\hat{x} = \sum_{i=1}^{k} e_i y_i$$

when varying the truncation index $k$. The resulting error $E(|x - \hat{x}|^2) = \sum_{i=k+1}^{784} d_{ii}$ is precisely the sum of the remaining eigenvalues. We see that with only a few eigenvalues, we can already capture the basic digit shapes.
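The claim about the reconstruction error can be checked numerically. The sketch below reuses Xc, E, and d from the previous snippet; it is an illustration of the truncated expansion, not code from the text.

# Truncated PCA expansion: project onto the first k PCs and map back.
def reconstruct(Xc, E, k):
    Y = Xc @ E[:, :k]        # scores y_1, ..., y_k
    return Y @ E[:, :k].T    # x_hat = sum over i <= k of e_i y_i

k = 10
X_hat = reconstruct(Xc, E, k)

# Empirical mean squared error per sample; up to the 1/(n-1) versus
# 1/n normalization used by np.cov, it matches the sum of the
# eigenvalues discarded by the truncation.
err = np.mean(np.sum((Xc - X_hat) ** 2, axis=1))
print(err, d[k:].sum())

For $k = 784$ the expansion is exact and the error vanishes.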
EXERCISES
1. Calculate the first four centered moments of a uniform random variable on $[0, a]$.
2. Show that the variance of the sum $\sum_i X_i$ of uncorrelated random variables $X_i$ equals the sum of the variances $\operatorname{var} X_i$.
3. Show that the kurtosis of a Gaussian random variable vanishes, and prove that the odd moments of a symmetric density vanish as well.