ity measures, describing the main kinds of complexity measure, their relationships, and their applicability to empirical questions.
The chapter ends with a guide to further reading, organized by section; it emphasizes readable and thorough introductions and surveys over more advanced or historically important contributions.
2. STATISTICAL LEARNING AND DATA-MINING
Complex systems, we said, are those with many strongly interdependent
parts. Thanks to comparatively recent developments in statistics and machine
learning, it is now possible to infer reliable, predictive models from data, even
when the data concern thousands of strongly dependent variables. Such data
mining is now a routine part of many industries, and is increasingly important in
research. While not, of course, a substitute for devising valid theoretical models,
data mining can tell us what kinds of patterns are in the data, and so guide our
model-building.
2.1. Prediction and Model Selection
The basic goal of any kind of data mining is prediction: some variables, let us call them X, are our inputs. The output is another variable or variables Y. We wish to use X to predict Y, or, more exactly, we wish to build a machine which will do the prediction for us: we will put in X at one end, and get a prediction for Y out at the other. 2
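The "machine" here is just a function from inputs to guesses. A minimal sketch, under the illustrative assumption of a one-dimensional linear predictor (the names `make_machine`, `slope`, and `intercept` are hypothetical, not from the text):

```python
# A toy "prediction machine": put X in at one end, get a guess for Y
# out at the other. Here the machine is a line with two adjustable
# knobs (slope and intercept); any function of X would do in principle.

def make_machine(slope, intercept):
    """Return a predictor f with its knobs (parameters) fixed."""
    def f(x):
        return slope * x + intercept
    return f

machine = make_machine(2.0, 1.0)
print(machine(3.0))  # the machine's prediction of Y when X = 3.0
```

The point of wrapping the parameters up this way is that "building the machine" and "using the machine" are separate steps: data mining adjusts the knobs, prediction just turns the crank.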
"Prediction" here covers a lot of ground. If Y are simply other variables like
X , we sometimes call the problem regression . If they are X at another time, we
have forecasting , or prediction in a strict sense of the word. If Y indicates mem-
bership in some set of discrete categories, we have classification . Similarly, our
predictions for Y can take the form of distinct, particular values ( point predic-
tions ), of ranges or intervals we believe Y will fall into, or of entire probability
distributions for Y , i.e., guesses as to the conditional distribution Pr( Y | X ). One
can get a point prediction from a distribution by finding its mean or mode, so
distribution predictions are in a sense more complete, but they are also more
computationally expensive to make, and harder to make successfully.
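Extracting a point prediction from a distributional one can be sketched concretely. Assuming, for illustration, that the guessed Pr(Y | X) is a small discrete probability table (the variable names below are hypothetical):

```python
# A distributional prediction for Y given some X: a guess at Pr(Y | X),
# represented here as a discrete probability table.
pr_y_given_x = {0: 0.1, 1: 0.6, 2: 0.3}

# Point prediction via the mean of the predicted distribution ...
mean_prediction = sum(y * p for y, p in pr_y_given_x.items())

# ... or via the mode, the single most probable value.
mode_prediction = max(pr_y_given_x, key=pr_y_given_x.get)

print(mean_prediction)  # 1.2
print(mode_prediction)  # 1
```

Note that the two point predictions need not agree; the mean here is not even a value Y can take, which is one reason the choice between mean and mode is tied to how bad guesses are penalized.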
Whatever kind of prediction problem we are attempting, and with whatever
kind of guesses we want our machine to make, we must be able to say whether
or not they are good guesses; in fact we must be able to say just how much bad
guesses cost us. That is, we need a loss function for predictions. 3 We suppose that our machine has a number of knobs and dials we can adjust, and we refer to these parameters, collectively, as R. The predictions we make, with inputs X and parameters R, are f(X, R), and the loss from the error in these predictions, when