Neural networks or multilayer perceptrons have a devoted following,
both for regression and classification (32). The application of VC theory to them
is quite well-advanced (34,35), but there are many other approaches, including
ones based on statistical mechanics (36). It is notoriously hard to understand
why they make the predictions they do.
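To make this concrete, here is a minimal sketch of fitting a multilayer perceptron; the choice of Python with NumPy and scikit-learn, and the synthetic data, are our own illustrative assumptions rather than anything from the cited references:

    # Minimal sketch: a multilayer perceptron for binary classification.
    # Library choice (scikit-learn) and the toy data are illustrative assumptions.
    import numpy as np
    from sklearn.neural_network import MLPClassifier

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 2))            # two input features
    y = (X[:, 0] * X[:, 1] > 0).astype(int)  # a nonlinear (XOR-like) target

    # One hidden layer of 10 units, fit by gradient-based optimization.
    net = MLPClassifier(hidden_layer_sizes=(10,), max_iter=2000, random_state=0)
    net.fit(X, y)
    print(net.score(X, y))  # in-sample accuracy

Even in this small example, the fitted weights offer little direct insight into why the network classifies a given point as it does, which is the interpretability problem just mentioned.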
Classification and regression trees (CART), introduced in the book of that name (37), recursively subdivide the input space, rather like the game of "twenty questions" ("Is the temperature above 20 degrees centigrade? If so, is the glucose concentration above one millimole per liter?", etc.); each question is a branch of the tree. All the cases at the end of one branch of the tree are treated equivalently. The resulting decision trees are easy to understand, and often similar to human decision heuristics (38).
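To make the "twenty questions" picture concrete, the following minimal sketch (again assuming Python and scikit-learn; the toy data, mimicking the temperature and glucose questions above, are invented) fits and prints such a tree of threshold questions:

    # Minimal sketch of CART-style recursive partitioning.
    # scikit-learn and the toy data are illustrative assumptions.
    from sklearn.tree import DecisionTreeClassifier, export_text

    # Toy cases: [temperature (deg C), glucose (mmol/L)] -> class label
    X = [[15, 1.5], [25, 1.4], [30, 0.8], [35, 1.6], [18, 0.9], [25, 0.9]]
    y = [0, 1, 0, 1, 0, 0]

    # Each internal node asks one threshold question; all cases reaching
    # the same leaf are treated equivalently.
    tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
    print(export_text(tree, feature_names=["temperature", "glucose"]))

The printed tree reads much like the human decision heuristics the text describes.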
Kernel machines (22,39) apply nonlinear transformations to the input, mapping it to a much higher-dimensional "feature space," where they then use linear prediction methods. This trick works because the VC dimension of linear methods is low, even in high-dimensional spaces. Kernel methods come in many flavors; currently the most popular are support vector machines (40).
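A minimal sketch of the kernel trick follows, assuming Python with NumPy and scikit-learn; on toy data that are not linearly separable in the input space, a linear machine does little better than chance, while an RBF-kernel machine, which implicitly works in a high-dimensional feature space, fits well:

    # Minimal sketch: linear vs. kernelized support vector machines.
    # Library choice and synthetic data are illustrative assumptions.
    import numpy as np
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 2))
    y = (X[:, 0] * X[:, 1] > 0).astype(int)   # not linearly separable in 2D

    linear = SVC(kernel="linear").fit(X, y)   # linear boundary in input space
    kernelized = SVC(kernel="rbf").fit(X, y)  # linear boundary in feature space
    print(linear.score(X, y), kernelized.score(X, y))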
2.2.1. Predictive Versus Causal Models
Neither predictive nor descriptive models are necessarily causal. PAC-type results give us reliable prediction, assuming future data will come from the same distribution as the past. In a causal model, however, we want to know how changes will propagate through the system. One difficulty is that causal relationships are one-way, whereas predictive relationships are two-way (one can predict genetic variants from metabolic rates, but one cannot change genes by changing metabolism). The other is that it is hard (if not impossible) to tell whether the predictive relationships we have found are confounded by the influence of other variables and other relationships we have neglected. Despite these difficulties, the subject of causal inference from data is currently a very active area of research, and many methods have been proposed, generally under assumptions about the absence of feedback (41-43). When we have a causal or generative model, we can use very well-established techniques to infer the values of the hidden or latent variables in the model from the values of their observed effects (41,44).
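The one-way/two-way asymmetry can be illustrated with a small simulation; the Python code and the toy linear gene-metabolism model below are our own invented illustration, not a biological claim:

    # Minimal sketch: prediction is two-way, intervention is one-way.
    # The linear toy model and variable names are illustrative assumptions.
    import numpy as np

    rng = np.random.default_rng(0)
    gene = rng.normal(size=5000)                # cause
    metab = gene + 0.5 * rng.normal(size=5000)  # effect: metabolism depends on gene

    # Observationally, each variable predicts the other (correlation is symmetric).
    print(np.corrcoef(gene, metab)[0, 1])       # roughly 0.9

    # Intervene: set metabolism externally, breaking its dependence on the gene.
    # The gene is unchanged, so the metab -> gene relationship vanishes.
    metab_do = rng.normal(size=5000)            # crude stand-in for do(metab)
    print(np.corrcoef(gene, metab_do)[0, 1])    # roughly 0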
2.3. Occam's Razor and Complexity in Prediction
Regularization methods are often thought of as penalizing the complexity of the model, and so as implementing some version of Occam's Razor. Just as Occam said "entities are not to be multiplied beyond necessity," we say "parameters