the ordinary Shannon entropy in the limit. There are entropy rates corresponding
to all the Rényi entropies, defined just like the ordinary entropy rate. For dy-
namical systems, these are related to the fractal dimensions of the attractor
(162,163).
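To make the block construction behind such entropy rates concrete, here is a small sketch (not part of the text; the biased coin source, the sample size, and the name block_renyi_rate are all assumptions made for the demonstration) that computes the order-α Rényi entropy of length-L blocks and divides by L, the per-symbol quantity whose large-L behavior defines the Rényi entropy rate.

```python
# Illustrative sketch only; names and parameters are assumptions, not from the text.
import numpy as np
from collections import Counter

def block_renyi_rate(symbols, L, alpha):
    """Order-alpha Rényi entropy of length-L blocks, in bits per symbol."""
    blocks = [tuple(symbols[i:i + L]) for i in range(len(symbols) - L + 1)]
    counts = np.array(list(Counter(blocks).values()), dtype=float)
    p = counts / counts.sum()
    if np.isclose(alpha, 1.0):
        H = -np.sum(p * np.log2(p))                      # Shannon block entropy
    else:
        H = np.log2(np.sum(p ** alpha)) / (1.0 - alpha)  # Rényi block entropy
    return H / L

rng = np.random.default_rng(0)
x = (rng.random(100_000) < 0.3).astype(int)   # i.i.d. bits with P(1) = 0.3
for L in (1, 2, 4, 8):
    print(L, block_renyi_rate(x, L, alpha=2.0))
# For an i.i.d. source the per-symbol values stay approximately flat in L;
# for a correlated source they approach the Rényi entropy rate as L grows.
```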
The Rényi divergences bear the same relation to the Rényi entropies as
the Kullback-Leibler divergence does to the Shannon entropy. The defining
formula is
D_\alpha(P \| Q) \;=\; \frac{1}{\alpha - 1} \log \sum_i p_i^{\alpha} \, q_i^{1-\alpha} ,        [54]
and similarly for the continuous case. Once again, lim_{α→1} D_α(P||Q) = D(P||Q).
For all α > 0, D_α(P||Q) ≥ 0, and it is equal to zero if and only if P and Q are the
same. (If α = 0, then a vanishing Rényi divergence only means that the supports
of the two distributions are the same.) The Rényi entropy H_α[X] is nonincreasing
as α grows, whereas the Rényi divergence D_α(P||Q) is nondecreasing.
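The following sketch (an illustration added here, not part of the original text; all function and variable names are hypothetical) implements the plug-in Rényi entropy and the divergence of Eq. [54] for discrete distributions, and can be used to check the α → 1 limits and the monotonicity in α just described.

```python
# Illustrative sketch only; names and the example distributions are assumptions.
import numpy as np

def renyi_entropy(p, alpha):
    """H_alpha(P) = log2(sum_i p_i^alpha) / (1 - alpha); Shannon entropy at alpha = 1."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                        # drop zero-probability symbols
    if np.isclose(alpha, 1.0):
        return -np.sum(p * np.log2(p))  # Shannon limit
    return np.log2(np.sum(p ** alpha)) / (1.0 - alpha)

def renyi_divergence(p, q, alpha):
    """D_alpha(P||Q) = log2(sum_i p_i^alpha q_i^(1-alpha)) / (alpha - 1); KL at alpha = 1."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0
    p, q = p[mask], q[mask]
    if np.isclose(alpha, 1.0):
        return np.sum(p * np.log2(p / q))                # Kullback-Leibler limit
    return np.log2(np.sum(p ** alpha * q ** (1.0 - alpha))) / (alpha - 1.0)

P = np.array([0.5, 0.3, 0.2])
Q = np.array([0.4, 0.4, 0.2])
for a in (0.5, 0.9, 0.999, 2.0):
    print(a, renyi_entropy(P, a), renyi_divergence(P, Q, a))
# H_alpha shrinks and D_alpha grows as alpha increases; values at alpha near 1
# approach the Shannon entropy and the Kullback-Leibler divergence.
```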
Estimation of Information-Theoretic Quantities. In applications, we will
often want to estimate information-theoretic quantities, such as the Shannon entropy or the
mutual information, from empirical or simulation data. Restricting our attention,
for the moment, to the case of discrete-valued variables, the empirical distribu-
tion will generally converge on the true distribution, and so the entropy (say) of
the empirical distribution ("sample entropy") will also converge on the true en-
tropy. However, it is not the case that the sample entropy is an unbiased estimate
of the true entropy. The Shannon (and Rényi) entropies are measures of varia-
tion, like the variance, and sampling tends to reduce variation. Just as the sample
variance is a negatively biased estimate of the true variance, sample entropy is a
negatively biased estimate of the true entropy, and so sample mutual information
is a positively biased estimate of true information. Understanding and control-
ling the bias, as well as the sampling fluctuations, can be very important.
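A small Monte Carlo sketch (the alphabet size, sample size, and number of replicates below are arbitrary choices, not values from the text) makes the negative bias visible: repeatedly draw N samples from a known distribution and average the plug-in entropies.

```python
# Illustrative sketch only; names and parameters are assumptions.
import numpy as np

rng = np.random.default_rng(0)
k, N, reps = 8, 50, 5000
true_H = np.log2(k)                      # Shannon entropy of the uniform distribution

def plugin_entropy(samples, k):
    counts = np.bincount(samples, minlength=k)
    freqs = counts[counts > 0] / counts.sum()
    return -np.sum(freqs * np.log2(freqs))

estimates = [plugin_entropy(rng.integers(0, k, size=N), k) for _ in range(reps)]
print("true H:", true_H, " mean plug-in H:", np.mean(estimates))
# The mean plug-in estimate falls below the true entropy; the shortfall is roughly
# (k - 1) / (2 N ln 2) bits, i.e. the leading-order bias term quoted in the text.
```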
Victor (164) has given an elegant method for calculating the bias of the
sample entropy; remarkably, the leading-order term depends only on the alpha-
bet size k and the number of samples N, and is (k − 1)/(2N). Higher-order terms,
however, depend on the true distribution. Recently, Kraskov et al. (165) have
published an adaptive algorithm for estimating mutual information, which has
very good properties in terms of both bias and variance. Finally, the estimation
of entropy rates is a somewhat tricky matter. The best practices are to either use
an algorithm of the type given by (166), or to fit a properly dynamical model.
(For discrete data, variable-length Markov chains, discussed in §3.6.2 above,
generally work very well, and the entropy rate can be calculated from them very
simply.) Another popular approach is to run one's time series through a standard
compression algorithm, such as gzip, dividing the size in bits of the output by
the number of symbols in the input.
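Two of the points above can be made concrete in a short sketch (illustrative only; the helper names are hypothetical, and a plain first-order Markov chain stands in for the variable-length chains recommended in the text): adding the leading-order (k − 1)/(2N) correction back onto a plug-in entropy estimate, and reading the entropy rate off a fitted chain as h = −Σ_i π_i Σ_j T_ij log2 T_ij, with π the stationary distribution of the estimated transition matrix T.

```python
# Illustrative sketch only; function names and parameters are assumptions.
import numpy as np

def corrected_entropy_bits(samples, k):
    """Plug-in Shannon entropy plus the leading-order (k-1)/(2N) bias correction, in bits."""
    samples = np.asarray(samples)
    N = samples.size
    freqs = np.bincount(samples, minlength=k) / N
    freqs = freqs[freqs > 0]
    plugin = -np.sum(freqs * np.log2(freqs))            # biased low, on average
    return plugin + (k - 1) / (2.0 * N * np.log(2))     # (k-1)/(2N) nats, converted to bits

def markov_entropy_rate(symbols, k):
    """Entropy rate (bits/symbol) of a first-order Markov chain fitted to the sequence."""
    counts = np.zeros((k, k))
    for a, b in zip(symbols[:-1], symbols[1:]):
        counts[a, b] += 1.0
    T = counts / counts.sum(axis=1, keepdims=True)      # estimated transition matrix
    evals, evecs = np.linalg.eig(T.T)                   # stationary distribution: left
    pi = np.real(evecs[:, np.argmax(np.real(evals))])   # eigenvector for eigenvalue 1
    pi = pi / pi.sum()
    logT = np.where(T > 0, np.log2(np.where(T > 0, T, 1.0)), 0.0)
    return float(-np.sum(pi[:, None] * T * logT))

rng = np.random.default_rng(3)
print(corrected_entropy_bits(rng.integers(0, 8, size=50), k=8))   # compare with log2(8) = 3
print(markov_entropy_rate(rng.integers(0, 2, size=10_000), k=2))  # i.i.d. fair coin: about 1
```

For a genuinely correlated sequence, the fitted-chain rate falls below the single-symbol entropy, which is the point of using a dynamical model rather than treating the data as independent draws.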