such inadequate filter outputs. On the other hand, when the correlation exceeds δ1, the variable selection algorithm runs until C_max(j) becomes lower than the second threshold, δ2. We empirically search for values of [δ1, δ2] for which the probability of selecting at least one channel is very low for both surrogate data sets, but reasonably high for the original data.
To determine δ1, the correlation between d̂_LMS(n) and d(n) is recursively estimated as

$$\xi(n) = \mu\,\xi(n-1) + (1-\mu)\,\frac{\hat{d}_{\text{LMS}}(n)\,d(n)}{\sqrt{p(n)\,q(n)}}, \qquad (4.47)$$
where μ is a forgetting factor, as commonly used in the recursive least squares (RLS) [14]. p(n) and q(n) represent the power estimates of d̂_LMS(n) and d(n), respectively. Normalization by the square root of p(n)q(n) prevents ξ(n) from being biased by a large magnitude of d(n). The powers are estimated through similar recursions:
$$p(n) = \mu\,p(n-1) + (1-\mu)\,\hat{d}_{\text{LMS}}^{\,2}(n), \qquad q(n) = \mu\,q(n-1) + (1-\mu)\,d^{2}(n). \qquad (4.48)$$
If ξ(n) ≥ δ1, the online variable selection algorithm is activated. If ξ(n) < δ1, an empty subset is yielded and d̂(n) is set equal to d̂_LMS(n).
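As a sketch, the recursions (4.47) and (4.48) together with the δ1 activation rule can be implemented as follows; the function name, forgetting-factor value, threshold, and test signals are illustrative assumptions, not from the source.

```python
import numpy as np

def recursive_correlation(d_hat, d, mu=0.99, eps=1e-12):
    """Recursive estimate of the normalized correlation xi(n) between the
    adaptive-filter output d_hat(n) and the desired signal d(n),
    following Eqs. (4.47)-(4.48). mu is the forgetting factor."""
    xi = p = q = 0.0
    xi_hist = []
    for dh, dn in zip(d_hat, d):
        # Eq. (4.48): recursive power estimates of d_hat(n) and d(n)
        p = mu * p + (1 - mu) * dh * dh
        q = mu * q + (1 - mu) * dn * dn
        # Eq. (4.47): correlation update, normalized by sqrt(p(n) q(n))
        # so that a large magnitude of d(n) does not bias xi(n)
        xi = mu * xi + (1 - mu) * dh * dn / np.sqrt(p * q + eps)
        xi_hist.append(xi)
    return np.array(xi_hist)

# Activation rule: start online variable selection only when xi(n) >= delta_1
rng = np.random.default_rng(0)
d = rng.standard_normal(2000)
d_hat = d + 0.1 * rng.standard_normal(2000)  # a well-correlated filter output
xi = recursive_correlation(d_hat, d)
delta_1 = 0.5  # illustrative threshold
print(xi[-1] >= delta_1)  # prints True: correlated signals drive xi(n) toward 1
```

For a poorly tracking filter output, ξ(n) hovers near zero and the selection step is skipped, which is exactly the behavior the threshold δ1 is meant to produce.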
Once the online variable selection algorithm is started, δ2 is used as the stopping criterion: the algorithm runs until C_max(j) < δ2 at the jth iteration. Here, we describe how C_max(j) represents the correlation of the inputs with a desired output. In the LAR, if two successively selected inputs, x_{j−1} and x_j, have similar correlations with a desired output, d, the decrease from C_max(j − 1) to C_max(j) will be small. On the other hand, if x_{j−1} is much more correlated than x_j, C_max(j) will be much smaller than C_max(j − 1). This is illustrated in Figure 4.13. Consider the data [X, d], where X is an input matrix whose rows are input samples and d is an output vector. Suppose that X has two columns, x1 and x2, and assume that x1, x2, and d are standardized to zero mean and unit variance. Suppose x1 is more correlated with d. The LAR starts to move in the direction of x1: it finds the coefficient β1 for x1 such that |x1ᵀr1| = |x2ᵀr1|, where r1 = d − β1 x1 ≡ d − y1. The maximum correlation then changes from C_max(0) = |x1ᵀd| to C_max(1) = |x1ᵀr1| = |x2ᵀr1|. From this, we can see that the angle between x1 and r1 is equal to the angle between x2 and r1, which is the equiangular property of the LAR.
C_max(j) is related to these angles such that C_max(j) ≈ cos θ_j, where θ_j represents the angle at the jth step. The diagram on the left side of Figure 4.13 illustrates the case when x1 and x2 have similar correlations with d; in this case, a small difference between θ0 and θ1 leads to a small decrease from C_max(0) to C_max(1). On the other hand, the diagram on the right side illustrates the case when x2 is considerably less correlated with d than x1; in this case, a large difference between θ0 and θ1 causes C_max(1) to decrease significantly.
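The two-input example above can be checked numerically. The following sketch (with assumed mixing weights and sample size) computes β1 in closed form for the positive-correlation case and verifies the equiangular property and the decrease of C_max after the first LAR step.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 500

def standardize(v):
    return (v - v.mean()) / v.std()

# Two inputs with different positive correlations to the output d
# (the mixing weights below are illustrative)
d = rng.standard_normal(N)
x1 = standardize(0.9 * d + 0.4 * rng.standard_normal(N))  # more correlated
x2 = standardize(0.5 * d + 0.9 * rng.standard_normal(N))  # less correlated
d = standardize(d)

# First LAR step: move along x1 until the residual r1 = d - beta1 * x1
# is equally correlated with x1 and x2. With positive correlations,
# solving x1.(d - beta1*x1) = x2.(d - beta1*x1) gives the closed form below.
c1, c2 = x1 @ d, x2 @ d
beta1 = (c1 - c2) / (x1 @ x1 - x2 @ x1)
r1 = d - beta1 * x1

C0 = abs(x1 @ d)   # C_max(0)
C1 = abs(x1 @ r1)  # C_max(1), equal to |x2^T r1| by construction

cosine = lambda a, b: (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
print(abs(x2 @ r1))                    # matches C1: the equiangular property
print(cosine(x1, r1), cosine(x2, r1))  # equal angles with the residual
print(C1 < C0)                         # the maximum correlation decreases
```

Because the standardized columns have equal norms, equal inner products with r1 imply equal angles, which is the geometric picture behind C_max(j) ≈ cos θ_j.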