Figure 3. Empirical loss and generalization loss as a function of model complexity.
methods penalize the "roughness" of a model, i.e., some measure of how much
the prediction shifts with a small change in either the input or the parameters
(26, ch. 10). A smooth function is less flexible, and so has less ability to match
meaningless wiggles in the data. Another popular penalty method, the minimum
description length principle of Rissanen, will be dealt with in §8.3 below.
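As a minimal sketch of such a roughness penalty (the data, penalty weight, and function names below are invented for illustration, not taken from the text), one can fit values f to noisy observations by minimizing the empirical squared error plus a multiple lam of the summed squared second differences of f; a larger lam forces a smoother, less flexible fit.

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented noisy observations of a smooth underlying function.
x = np.linspace(0.0, 1.0, 50)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=x.size)

def penalized_fit(y, lam):
    """Minimize ||y - f||^2 + lam * ||D f||^2 over the fitted values f,
    where D is the second-difference matrix (a discrete roughness measure).
    Setting the gradient to zero gives (I + lam * D'D) f = y."""
    n = y.size
    D = np.diff(np.eye(n), n=2, axis=0)        # (n-2) x n second differences
    return np.linalg.solve(np.eye(n) + lam * D.T @ D, y)

wiggly = penalized_fit(y, lam=0.0)    # pure empirical-risk fit: tracks the noise
smooth = penalized_fit(y, lam=10.0)   # penalized fit: much less flexible

print("sum of squared second differences (roughness):")
print("  lam = 0 :", np.sum(np.diff(wiggly, n=2) ** 2))
print("  lam = 10:", np.sum(np.diff(smooth, n=2) ** 2))
```

With lam = 0 the fitted values reproduce the data exactly, wiggles and all; raising lam trades some empirical error for a function that shifts less under small perturbations of the input.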
Usually, regularization methods are justified by the idea that models can be
more or less complex, and more complex ones are more liable to over-fit, all
else being equal, so penalty terms should reflect complexity (Figure 3). There's
something to this idea, but the usual way of putting it does not really work; see
§2.3 below.
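A toy demonstration of the pattern sketched in Figure 3, again with invented data: as the polynomial degree (a stand-in for model complexity) grows, the empirical loss of the least-squares fit falls monotonically, while the loss on a large held-out sample, serving as a proxy for the generalization loss, typically turns back up.

```python
import numpy as np

rng = np.random.default_rng(1)

def sample(n):
    """Invented regression problem: y = sin(3x) + Gaussian noise."""
    x = rng.uniform(-1.0, 1.0, n)
    return x, np.sin(3 * x) + rng.normal(scale=0.2, size=n)

x_train, y_train = sample(30)     # small training set, prone to over-fitting
x_test, y_test = sample(5000)     # large held-out proxy for generalization loss

for degree in (1, 3, 5, 9, 12):
    # Least-squares polynomial fit = empirical risk minimization under squared loss.
    coef = np.polyfit(x_train, y_train, degree)
    emp = np.mean((np.polyval(coef, x_train) - y_train) ** 2)
    gen = np.mean((np.polyval(coef, x_test) - y_test) ** 2)
    print(f"degree {degree:2d}: empirical loss {emp:.3f}, held-out loss {gen:.3f}")
```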
2.1.3. Capacity Control
Empirical risk minimization, we said, is apt to over-fit because we do not
know the generalization errors, just the empirical errors. This would not be such
a problem if we could guarantee that the in-sample performance was close to
the out-of-sample performance. Even if the exact machine we got this way was
not particularly close to the optimal machine, we'd then be guaranteed that our
predictions were nearly optimal. We do not even need to guarantee that all the