PREDICTING SEPTICEMIA IN NEONATES - An Invitation to Biomathematics


The various existing rules of thumb generally lead to the use of values of
m of 1 or 2 for data records of length N ranging from 100 to 5000 data
points and values of r between 0.1 and 0.25 to determine the tolerance. In
general, the accuracy and confidence of the entropy estimate improve as
the numbers of matches of length m and m + 1 increase. By this token,
the number of matches can be increased by choosing a small m (short
templates) and a larger r (wide tolerance). There are penalties, however,
for criteria that are too relaxed: as r increases, the probability of matches
tends toward 1, and SampEn tends to 0 for all processes, thereby
reducing the ability to distinguish any salient features in the data set;
and as m decreases, underlying physical processes that are not optimally
apparent at smaller values of m may be obscured.

This being said, in most current applications the parameter values of
choice are m = 2 and t = 0.2*SD, which means we are counting templates
with a length of 2 to calculate B and templates with a length of 3 to
calculate A, and the tolerance for matches is set to 0.2 times the SD of the
process. In most cases, all readings in the observed sample are first
divided by the SD of that sample, so the SD of the sample becomes
exactly SD = 1, in which case t = r = 0.2. This preprocessing of the data
eliminates the influence of the variance of the sample on the irregularity
(or complexity) of the process, thus leaving SampEn to pick up only
characteristics strictly related to the sequential timing of the observations
and generally independent from the distribution of the observations.
More details, including the strict definition of SampEn, can be found in
Lake et al. (2002).
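The counting just described can be sketched in a few lines of Python. This is a minimal illustration, not the software referred to in the chapter: the function name sample_entropy, the use of the sample SD (n - 1 denominator), the "largest componentwise difference at most t" match rule, and the toy readings are all assumptions made for the sketch. It rescales the series by its SD so that t = r, counts B from templates of length m and A from templates of length m + 1 (self-matches excluded), and returns -ln(A/B).

```python
import math
import statistics


def sample_entropy(x, m=2, r=0.2):
    """Sketch of SampEn = -ln(A / B) for a short series x (assumed conventions)."""
    n = len(x)
    sd = statistics.stdev(x)      # sample SD, n - 1 denominator (assumption)
    z = [v / sd for v in x]       # after rescaling, SD = 1 and therefore t = r
    t = r

    def count_matches(length):
        # Use the first n - m starting positions for both lengths, so every
        # template of length m also has an extension of length m + 1.
        templates = [z[i:i + length] for i in range(n - m)]
        count = 0
        for i, a in enumerate(templates):
            for j, b in enumerate(templates):
                # Two templates match when every pair of corresponding
                # points differs by at most t; self-matches are skipped.
                if i != j and max(abs(p - q) for p, q in zip(a, b)) <= t:
                    count += 1
        return count

    b_count = count_matches(m)       # B: matches of length m
    a_count = count_matches(m + 1)   # A: matches of length m + 1
    if a_count == 0 or b_count == 0:
        raise ValueError("SampEn is undefined when A or B is zero")
    return -math.log(a_count / b_count)


# Hypothetical toy readings, used only to show the call.
toy = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
print(sample_entropy(toy, m=2, r=0.2))
```

The double loop is quadratic in the record length, which is acceptable for short illustrative series but not for long records.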

Although the computation of SampEn for long time series certainly
requires appropriate software (see Internet Resources at the end of the
chapter), one simple numerical example using the short sequences
considered earlier should clarify the template counting algorithm.

Example 6-1

For m = 2 and r = 0.2, calculate the SampEn and SD for the sequences S1:
1,0,1,0,1,0,1,0,1,0 and S2: 1,1,1,0,0,1,0,0,0,1, and compare the results.
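As a quick check on the SD and tolerance values this example calls for, the short Python sketch below computes both for each sequence. The helper name tolerance is hypothetical, and the sketch assumes the sample SD (n - 1 in the denominator), which is what statistics.stdev returns; the population SD (statistics.pstdev) would give a different value for the same data.

```python
import statistics

S1 = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]
S2 = [1, 1, 1, 0, 0, 1, 0, 0, 0, 1]


def tolerance(seq, r=0.2):
    """Return (SD, t) with t = r * SD, using the sample SD (assumption)."""
    sd = statistics.stdev(seq)   # n - 1 in the denominator
    return sd, r * sd


for name, seq in (("S1", S1), ("S2", S2)):
    sd, t = tolerance(seq)
    print(f"{name}: SD = {sd:.3f}, t = {t:.4f}")
# First line printed: S1: SD = 0.527, t = 0.1054
```

Because both sequences take only the values 0 and 1, any tolerance below 1 means two templates match only when they are identical.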

We begin with the periodic series S1. The SD of this sample is 0.527.
Thus, for r = 0.2, the tolerance for similarity between two templates
would be t = (0.2)(0.527) = 0.1054. With m = 2, all subsequences of
length 2 (beginning at up to N - m) in the series are 10,01,10,01,10,01,10,01.
Given a similarity tolerance of t = 0.1054, two subsequences would
be matches only if they are identical. Thus, the total number of
template matches of length m = 2 is

B = 3 + 3 + 3 + 3 + 3 + 3 + 3 + 3 = 24

(each template 10 or 01 has exactly three matches, excluding self-
matches). All subsequences of length m + 1 = 3 in the above series are
101,010,101,010,101,010,101,010. Thus, the total number of template
