whether the suffixes a and b are contexts, i.e., whether the conditional distribution $\Pr(Y_{t+1} \mid Y_t = a)$ can be distinguished from $\Pr(Y_{t+1} \mid Y_t = a, Y^{-}_{t-1})$, and likewise for $Y_t = b$. It could happen that a is a context but b is not, in which case the algorithm will try ab and bb, and so on. If one sets $x_t$ equal to the context at time $t$, then $x_t$ is a Markov chain. This is called a variable-length Markov model (VLMM) because the contexts can be of different lengths.
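The following is a minimal Python sketch of this context-growing idea, assuming a finite symbol alphabet; it is not the specific algorithm of reference (101). The names (grow_contexts, next_symbol_counts, kl_divergence) are hypothetical, and a fixed KL-divergence threshold stands in for a proper statistical test of whether two conditional distributions can be distinguished.

```python
from collections import Counter
from math import log

def next_symbol_counts(seq, suffix):
    """Count the symbols that immediately follow each occurrence of `suffix`."""
    counts = Counter()
    k = len(suffix)
    for t in range(k, len(seq)):
        if tuple(seq[t - k:t]) == suffix:
            counts[seq[t]] += 1
    return counts

def kl_divergence(p_counts, q_counts, alphabet, eps=1e-9):
    """KL divergence between two empirical next-symbol distributions."""
    n_p = sum(p_counts.values()) or 1
    n_q = sum(q_counts.values()) or 1
    return sum((p_counts[s] / n_p)
               * log((p_counts[s] / n_p + eps) / (q_counts[s] / n_q + eps))
               for s in alphabet if p_counts[s] > 0)

def grow_contexts(seq, alphabet, threshold=0.05, max_depth=5):
    """Grow the context set: a suffix is kept as a context when no
    one-symbol extension further into the past changes the next-symbol
    distribution; otherwise its extensions become candidates in turn."""
    contexts = []
    frontier = [(s,) for s in alphabet]          # depth-1 candidate suffixes
    while frontier:
        suffix = frontier.pop()
        base = next_symbol_counts(seq, suffix)
        observed = [(s,) + suffix for s in alphabet   # one step further back
                    if sum(next_symbol_counts(seq, (s,) + suffix).values()) > 0]
        if (len(suffix) < max_depth
                and any(kl_divergence(next_symbol_counts(seq, e), base,
                                      alphabet) > threshold for e in observed)):
            frontier.extend(observed)            # not a context: split it
        else:
            contexts.append(suffix)              # stable distribution: keep
    return contexts
```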
Once a set of contexts has been found, they can be used for prediction. Each context corresponds to a different distribution for one-step-ahead predictions, and so one just needs to find the context that applies to the current history of the time series. One could apply state-estimation techniques to find the context, but an easier solution is to use the construction process of the contexts to build a decision tree (§2), where the first level looks at $Y_t$, the second at $Y_{t-1}$, and so forth.
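Continuing the sketch above (same hypothetical helpers), this lookup is a walk back from $Y_t$ one step at a time, stopping as soon as the suffix seen so far is itself a context:

```python
def find_context(history, contexts):
    """Decision-tree lookup: test Y_t first, then (Y_{t-1}, Y_t), and so
    on, until the suffix of the recent history matches a context."""
    for k in range(1, len(history) + 1):
        if tuple(history[-k:]) in contexts:
            return tuple(history[-k:])
    return None  # history shorter than the deepest context

def predict_next(seq, history, contexts, alphabet):
    """One-step-ahead predictive distribution for the matched context
    (assumes the history is long enough for some context to match)."""
    ctx = find_context(history, contexts)
    counts = next_symbol_counts(seq, ctx)
    total = sum(counts.values())
    return {s: counts[s] / total for s in alphabet}

# Toy usage: in the period-3 sequence 0,1,1,0,1,1,... the single symbol 1
# is not a context (what follows it depends on the symbol before it), so
# the growing step splits it into the contexts (0, 1) and (1, 1).
seq = [0, 1, 1] * 200
contexts = set(grow_contexts(seq, alphabet=(0, 1)))
print(sorted(contexts))                              # [(0,), (0, 1), (1, 1)]
print(predict_next(seq, seq[-3:], contexts, (0, 1))) # {0: 1.0, 1: 0.0}
```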
Variable-length Markov models are conceptually simple, flexible, fast, and
frequently more accurate than other ways of approaching the symbolic dynamics
of experimental systems (101). However, not every regular language can be rep-
resented by a finite number of contexts. This weakness can be remedied by mov-
ing to a more powerful class of models, discussed next.
3.6.3. Causal-State Models, Observable-Operator Models,
and Predictive-State Representations
In discussing the state-space picture in §3.1 above, we saw that the state of a system is basically defined by specifying its future time-evolution, to the extent that it can be specified. Viewed in this way, a state $X_t$ corresponds to a distribution over future observables $Y^{+}_{t+1}$. One natural way of finding such distributions is to look at the conditional distribution of the future observations, given the previous history, i.e., $\Pr(Y^{+}_{t+1} \mid Y^{-}_t = y^{-}_t)$. For a given stochastic process or dynamical system, there will be a certain characteristic family of such conditional distributions. One can then consider the distribution-valued process generated by the original, observed process. It turns out that the former is always a Markov process, and that the original process can be expressed as a function of this Markov process plus noise. In fact, the distribution-valued process has all the properties one would want of a state-space model of the observations (48,49). The conditional distributions, then, can be treated as states.
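One compact way to write this construction, consistent with the causal-state definition of Crutchfield and Young (102), is as an equivalence relation on histories: two histories are mapped to the same state exactly when they induce the same conditional distribution over futures,

$$
\epsilon\left(y^{-}_{t}\right) = \left\{ \tilde{y}^{-} : \Pr\left(Y^{+}_{t+1} \mid Y^{-}_{t} = \tilde{y}^{-}\right) = \Pr\left(Y^{+}_{t+1} \mid Y^{-}_{t} = y^{-}_{t}\right) \right\},
\qquad X_{t} = \epsilon\left(Y^{-}_{t}\right).
$$

The state process $X_t$ defined this way is exactly the distribution-valued Markov process referred to above.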
This remarkable fact has led to techniques for modeling discrete-valued
time series, all of which attempt to capture the conditional-distribution states,
and all of which are strictly more powerful than VLMMs. There are at least
three: the causal-state models or causal-state machines (CSMs),¹² introduced
by Crutchfield and Young (102), the observable operator models (OOMs) in-
troduced by Jaeger (103), and the predictive state representations (PSRs) in-
troduced by Littman, Sutton, and Singh (104). The simplest way of thinking of
such objects is that they are VLMMs where a context or state can contain more