it initially increases with the value of the radius (linear increase for small
values of r) and then saturates (for large values of r) [3]. It was found that using a
value of r within the quasilinear region of the $\ln C_n(r)$ vs. $\ln r$ curve produces
consistent changes in TE with changes in directional coupling.
Although such an estimation of r is possible in noiseless simulation data, for
physiological data sets, which are always noisy and whose underlying functional
description is unknown, it is difficult to estimate an optimal value of r simply because a
linear region of $\ln C_n(r)$ vs. $\ln r$ may not be apparent or even exist. It is known that
the presence of noise in the data will be predominant for small r values [10, 8] and
over the entire space (high dimensional). This causes the distance between neighboring
points to increase. Consequently, the number of neighbors available to estimate
the multidimensional probabilities at the smaller scales may decrease, which
would lead to a severely biased estimate of TE. On the other hand, at large values
of r, a flat region in $\ln C_n(r)$ may be observed (saturation). In order to avoid the
above shortcomings in the practical application of this method (e.g., in simulation
models with added noise or in the EEG), we approximated TE as the average of TEs
estimated over an intermediate range of r values (from $\sigma/5$ to $2\sigma/5$). The decision
to use this range for r was made on the practical basis that r less than $\sigma/2$ typically
(well-behaved data) avoids saturation and r larger than $\sigma/10$ typically filters a large
portion of A/D-generated noise (simulation examples offer corroborative evidence
for such a claim). Even though these criteria are soft for r (no exhaustive testing of
the influence of the range of r on the final results), it appears that the proposed range
constitutes a very good compromise (sensitivity and specificity-wise) for the subsequent
detection of the direction and magnitude of flow of entropy (see Section 15.3).
Finally, to either a larger or lesser degree, all existing measures of causality suffer
from the finite sample effect. Therefore, it is important to always test their statistical
significance using surrogate techniques (see next subsection).
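The averaging of TE over an intermediate range of r described above can be sketched in Python. This is only an illustration under stated assumptions: σ is taken to be the standard deviation of the series, C_n(r) is taken to be a Grassberger-Procaccia-style correlation sum over delay vectors with the maximum norm, and the names delay_embed, correlation_sum, radius_grid and average_over_radii, as well as the number of radii, are hypothetical choices that do not come from the text. The TE estimator itself is left as a user-supplied callable.

```python
import numpy as np


def delay_embed(x, dim, lag=1):
    """Stack delayed copies of x into vectors of length `dim` (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (dim - 1) * lag
    return np.column_stack([x[i * lag : i * lag + n] for i in range(dim)])


def correlation_sum(points, r):
    """Fraction of distinct point pairs closer than r in the maximum norm.

    Assumption: this stands in for C_n(r); the text does not spell out its form.
    """
    dists = np.abs(points[:, None, :] - points[None, :, :]).max(axis=-1)
    upper = np.triu_indices(len(points), k=1)  # each pair counted once
    return float(np.mean(dists[upper] < r))


def radius_grid(x, num=5):
    """Radii from sigma/5 to 2*sigma/5, with sigma assumed to be the std of x."""
    sigma = np.std(x)
    return np.linspace(sigma / 5, 2 * sigma / 5, num)


def average_over_radii(estimator, radii):
    """Average a radius-dependent estimate, e.g. TE(r), over the given radii."""
    return float(np.mean([estimator(r) for r in radii]))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal(300)
    points = delay_embed(x, dim=2)
    radii = radius_grid(x)
    # Inspect ln C(r) against ln r over the intermediate range.
    for r in radii:
        print(np.log(r), np.log(correlation_sum(points, r)))
```

A TE estimate would then be obtained as average_over_radii(te_at_radius, radii), where te_at_radius is whatever fixed-radius TE estimator is in use. The all-pairs distance computation is quadratic in the number of points, so it suits short segments only.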
15.2.3 Statistical Significance of Transfer Entropy
Since TE calculates the direction of information transfer between systems by quantifying
their conditional statistical dependence, a random shuffling applied to the
original driver data series Y destroys the temporal correlation and significantly reduces
the information flow $\mathrm{TE}(Y \to X)$. Thus, in order to estimate the statistically
significant values of $\mathrm{TE}(Y \to X)$, the null hypothesis that the current state of the
driver process Y does not contain any additional information about the future state
of the driven process X was tested against the alternative hypothesis of a significant
time dependence between the future state of X and the current state of Y.
One way to achieve this is to compare the estimated values of $\mathrm{TE}(Y \to X)$ (i.e.,
the $\mathrm{TE}(x_{n+1} \mid x_n^{(k)}, y_n^{(l)})$), hereafter denoted by $\mathrm{TE}_o$, with the TE values estimated by
studying the dependence of the future state of X on the values of Y at randomly shuffled
time instants (i.e., $\mathrm{TE}(x_{n+1} \mid x_n^{(k)}, y_p^{(l)})$), hereafter denoted by $\mathrm{TE}_s$, where $p \in 1, \ldots, N$
is selected from the shuffled time instants of Y. The above described surrogate
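The comparison of the original TE value with values obtained from a shuffled driver series can be sketched as follows. This is a minimal illustration, not the estimator used in the chapter: binned_te is a toy plug-in TE with k = l = 1 on equal-frequency bins, surrogate_test and its rank-based p-value are assumptions introduced here, and the number of surrogates is arbitrary. With l = 1, permuting the samples of Y amounts to drawing y at shuffled time instants p.

```python
import numpy as np


def binned_te(x, y, bins=2):
    """Toy plug-in TE(Y -> X) in nats with k = l = 1 (stand-in estimator)."""

    def discretize(s):
        # Interior quantiles as bin edges give roughly equal-frequency bins.
        edges = np.quantile(s, np.linspace(0, 1, bins + 1)[1:-1])
        return np.digitize(s, edges)

    xd, yd = discretize(np.asarray(x)), discretize(np.asarray(y))
    x_next, x_now, y_now = xd[1:], xd[:-1], yd[:-1]

    counts = np.zeros((bins, bins, bins))
    np.add.at(counts, (x_next, x_now, y_now), 1)
    p_xxy = counts / counts.sum()                  # p(x_{n+1}, x_n, y_n)
    p_xy = p_xxy.sum(axis=0, keepdims=True)        # p(x_n, y_n)
    p_xx = p_xxy.sum(axis=2, keepdims=True)        # p(x_{n+1}, x_n)
    p_x = p_xxy.sum(axis=(0, 2), keepdims=True)    # p(x_n)

    mask = p_xxy > 0
    ratio = (p_xxy * p_x)[mask] / (p_xy * p_xx)[mask]
    return float(np.sum(p_xxy[mask] * np.log(ratio)))


def surrogate_test(x, y, estimator, n_surrogates=100, seed=0):
    """Return (TE_o, array of TE_s, empirical one-sided p-value)."""
    rng = np.random.default_rng(seed)
    te_o = estimator(x, y)
    te_s = np.array(
        [estimator(x, rng.permutation(y)) for _ in range(n_surrogates)]
    )
    # Assumed decision rule: share of surrogates at least as large as TE_o.
    p_value = (1 + np.sum(te_s >= te_o)) / (1 + n_surrogates)
    return te_o, te_s, float(p_value)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    y = rng.standard_normal(2000)
    x = np.zeros_like(y)
    x[1:] = 0.8 * y[:-1] + 0.2 * rng.standard_normal(len(y) - 1)  # Y drives X
    te_o, te_s, p = surrogate_test(x, y, binned_te)
    print(te_o, te_s.mean(), p)
```

Adding one to both the numerator and the denominator keeps the empirical p-value away from zero when no surrogate exceeds the original value. For l > 1, one would shuffle the time indices of the delay vectors of Y rather than its individual samples.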