cross-correlation function produces a value close to 1 for a given τ, then the signals are considered to be lag synchronized by a phase of τ. Hence the final feature used to calculate the lag synchronization is the largest normalized cross-correlation over all values of τ, as shown in (6.20). A C_max value of 1 indicates totally synchronized signals within some time lag τ, and unsynchronized signals produce a value very close to 0.
$$C_{\max} = \max_{\tau} \left\{ C\left(s_a, s_b\right)(\tau) \right\} \tag{6.20}$$
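As a rough illustration of this feature, the following sketch computes the largest normalized cross-correlation over all lags for two signals. The function name max_cross_correlation and the zero-mean, unit-variance normalization are assumptions for illustration, not necessarily the exact normalization used for C(s_a, s_b)(τ) in this chapter.

```python
import numpy as np

def max_cross_correlation(s_a, s_b):
    """Largest normalized cross-correlation over all lags tau, cf. (6.20)."""
    # Zero-mean, unit-variance scaling so the correlation is bounded by 1
    # in magnitude (assumed normalization for illustration).
    a = (s_a - np.mean(s_a)) / (np.std(s_a) * len(s_a))
    b = (s_b - np.mean(s_b)) / np.std(s_b)
    # mode="full" sweeps every possible lag tau.
    return np.max(np.correlate(a, b, mode="full"))

# A sine wave and a delayed copy of it are lag synchronized: C_max is near 1.
t = np.linspace(0.0, 1.0, 500)
s_a = np.sin(2 * np.pi * 5 * t)
s_b = np.sin(2 * np.pi * 5 * (t - 0.05))                 # same signal, delayed 0.05 s
print(max_cross_correlation(s_a, s_b))                   # close to 1
print(max_cross_correlation(s_a, np.random.randn(500)))  # much closer to 0
```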
6.6 Principal Component Analysis
Principal component analysis (PCA) attempts to solve the problem of excessive dimensionality by combining features to reduce the overall dimensionality. Using linear transformations, it projects a high-dimensional dataset onto a lower-dimensional space so that the information in the original dataset is preserved in an optimal manner under the least-squared-distance metric. An outline of the derivation of PCA is given here; the reader should refer to Duda et al. [40] for a more detailed mathematical derivation.
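Before the derivation, the following is a minimal sketch of the projection PCA performs, assuming the standard formulation in which the centered data are projected onto the eigenvectors of the covariance matrix with the largest eigenvalues; the function name pca_project and the toy data are purely illustrative.

```python
import numpy as np

def pca_project(X, n_components):
    """Project an (n, d) dataset onto its top n_components principal
    directions, i.e. onto a lower-dimensional space."""
    m = X.mean(axis=0)                        # sample mean
    Xc = X - m                                # center the data
    # Eigenvectors of the covariance matrix with the largest eigenvalues
    # span the subspace that best preserves the data in the
    # least-squared-distance sense (assumed standard formulation).
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    order = np.argsort(eigvals)[::-1][:n_components]
    return Xc @ eigvecs[:, order]             # (n, n_components) projection

# Toy example: 3-D points that vary mostly along one direction are reduced
# to a single dimension with little loss of information.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1)) @ np.array([[2.0, 1.0, 0.5]])
X += 0.05 * rng.normal(size=(200, 3))
print(pca_project(X, 1).shape)                # (200, 1)
```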
Given a d-dimensional dataset of size n (x_1, x_2, …, x_n), we first consider the problem of finding a vector x_0 to represent all of the vectors in the dataset. This comes down to the problem of finding the vector x_0 that is closest to every point in the dataset. We can find this vector by minimizing the sum of the squared distances between x_0 and all of the points in the dataset. In other words, we would like to find the value of x_0 that minimizes the criterion function J_0 shown in (6.21):
$$J_0(x_0) = \sum_{k=1}^{n} \left\| x_0 - x_k \right\|^2 \tag{6.21}$$
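As an aside, the sketch below simply evaluates (6.21) numerically at the sample mean and at a few other candidate points, illustrating the result stated next; the helper name J0 and the random data are illustrative assumptions.

```python
import numpy as np

def J0(x0, X):
    """Criterion (6.21): sum of squared distances from x0 to every x_k."""
    return np.sum(np.linalg.norm(X - x0, axis=1) ** 2)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))                 # toy dataset, n = 100, d = 4
m = X.mean(axis=0)                            # sample mean

# J0 is smallest at the sample mean; moving away from it increases J0.
print(J0(m, X))
print(J0(m + 0.1, X))                         # larger
print(J0(np.zeros(4), X))                     # generally larger still
```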
It can be shown that the value of x_0 that minimizes J_0 is the sample mean, (1/n) Σ x_i, of the dataset [40]. The sample mean has zero dimensionality and therefore does not give any information about the spread of the data, because it is a single point. To represent this information, the dataset would need to be projected onto a space with some dimensionality. To project the original dataset onto a one-dimensional space, we need to project it onto a line in the original space that runs through the sample mean. The data points in the new space can then be defined by x = m + a e. Here, e is the unit vector in the direction of the line and a is a scalar, which represents the distance from m to x. A second criterion function J_1 can now be defined that calculates the sum of the squared distances between the points in the original dataset and the projected points on the line:
$$J_1(a_1, \ldots, a_n, e) = \sum_{k=1}^{n} \left\| (m + a_k e) - x_k \right\|^2 \tag{6.22}$$
Taking into consideration that ||e|| = 1, the value of a_k that minimizes J_1 is found to be a_k = e^t(x_k − m). To find the best direction e for the line, this value of a_k is substituted back into (6.22).
 