ity measures, describing the main kinds of complexity measure, their relationships, and their applicability to empirical questions.
The chapter ends with a guide to further reading, organized by section; it emphasizes readable and thorough introductions and surveys over more advanced or historically important contributions.
2. STATISTICAL LEARNING AND DATA-MINING
Complex systems, we said, are those with many strongly interdependent
parts. Thanks to comparatively recent developments in statistics and machine
learning, it is now possible to infer reliable, predictive models from data, even
when the data concern thousands of strongly dependent variables. Such data
mining is now a routine part of many industries, and is increasingly important in
research. While not, of course, a substitute for devising valid theoretical models,
data mining can tell us what kinds of patterns are in the data, and so guide our
model-building.
2.1. Prediction and Model Selection
The basic goal of any kind of data mining is prediction: some variables, let us call them X, are our inputs. The output is another variable or variables Y. We wish to use X to predict Y, or, more exactly, we wish to build a machine which will do the prediction for us: we will put in X at one end, and get a prediction for Y out at the other. 2
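The "machine" here is just a function from inputs to guesses. A minimal sketch, under the illustrative assumption of a one-dimensional linear predictor (the names `make_machine`, `slope`, and `intercept` are hypothetical, not from the text):

```python
# A toy "prediction machine": put X in at one end, get a guess for Y
# out at the other. Here the machine is a line with two adjustable
# knobs (slope and intercept); any function of X would do in principle.

def make_machine(slope, intercept):
    """Return a predictor f with its knobs (parameters) fixed."""
    def f(x):
        return slope * x + intercept
    return f

machine = make_machine(2.0, 1.0)
print(machine(3.0))  # the machine's prediction of Y when X = 3.0
```

The point of wrapping the parameters up this way is that "building the machine" and "using the machine" are separate steps: data mining adjusts the knobs, prediction just turns the crank.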
"Prediction" here covers a lot of ground. If Y are simply other variables like
X , we sometimes call the problem regression . If they are X at another time, we
have forecasting , or prediction in a strict sense of the word. If Y indicates mem-
bership in some set of discrete categories, we have classification . Similarly, our
predictions for Y can take the form of distinct, particular values ( point predic-
tions ), of ranges or intervals we believe Y will fall into, or of entire probability
distributions for Y , i.e., guesses as to the conditional distribution Pr( Y | X ). One
can get a point prediction from a distribution by finding its mean or mode, so
distribution predictions are in a sense more complete, but they are also more
computationally expensive to make, and harder to make successfully.
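Extracting a point prediction from a distributional one can be sketched concretely. Assuming, for illustration, that the guessed Pr(Y | X) is a small discrete probability table (the variable names below are hypothetical):

```python
# A distributional prediction for Y given some X: a guess at Pr(Y | X),
# represented here as a discrete probability table.
pr_y_given_x = {0: 0.1, 1: 0.6, 2: 0.3}

# Point prediction via the mean of the predicted distribution ...
mean_prediction = sum(y * p for y, p in pr_y_given_x.items())

# ... or via the mode, the single most probable value.
mode_prediction = max(pr_y_given_x, key=pr_y_given_x.get)

print(mean_prediction)  # 1.2
print(mode_prediction)  # 1
```

Note that the two point predictions need not agree; the mean here is not even a value Y can take, which is one reason the choice between mean and mode is tied to how bad guesses are penalized.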
Whatever kind of prediction problem we are attempting, and with whatever
kind of guesses we want our machine to make, we must be able to say whether
or not they are good guesses; in fact we must be able to say just how much bad
guesses cost us. That is, we need a loss function for predictions. 3 We suppose that our machine has a number of knobs and dials we can adjust, and we refer to these parameters, collectively, as R. The predictions we make, with inputs X and parameters R, are f(X, R), and the loss from the error in these predictions, when