METHODS AND TECHNIQUES OF COMPLEX SYSTEMS SCIENCE: AN OVERVIEW - Complex Systems Science in Biomedicine

Biomedical Engineering Reference

In-Depth Information

specific hypotheses about specific models, such as the presence of interactions

between certain variables.) This is facilitated by the development of extremely

flexible classes of models, which are sometimes, misleadingly, called non-

parametric ; a better name would be megaparametric . The idea behind mega-

parametric models is that they should be capable of approximating any func-

tion, at least any well-behaved function, to any desired accuracy, given enough

capacity.

The polynomials are a familiar example of a class of functions which can

perform such universal approximation. Given any smooth function f , we can

represent it by taking the Taylor series around our favorite point x 0 . Truncating

that series gives an approximation to f :

k

d

(

xx df

)

k

=

fx

()

fx

( )

0

[7]

0

k

!

dx

k

=

1

x

0

k

n

(

xx df

)

k

+

x

fx

()

0

[8]

0

k

!

dx

k

=

1

x

0

n

(

xx

)

k

=

a

0

.

[9]

k

!

k

=

0

In fact, if f is an n th-order polynomial, the truncated series is exact, not an ap-

proximation.

To see why this is not a reason to use only polynomial models, think about

what would happen if f ( x ) = sin x . We would need an infinite -order polynomial

to completely represent f , and the generalization properties of finite-order ap-

proximations would generally be lousy: for one thing, f is bounded between -1

and 1 everywhere, but any finite-order polynomial will start to zoom off to or

- outside some range. Of course, this f would be really easy to approximate as

a superposition of sines and cosines, which is another class of functions which is

capable of universal approximation (better known, perhaps, as Fourier analysis).

What one wants, naturally, is to choose a model class which gives a good ap-

proximation of the function at hand, at low order . We want low-order functions,

both because computational demands rise with model order and because higher-

order models are more prone to over-fitting (VC dimension generally rises with

model order).

To adequately describe all of the common model classes, or model archi-

tectures , used in the data mining literature would require another chapter ((31)

and (32) are good for this.) Instead, I will merely name a few.

Splines are piecewise polynomials, good for regression on bounded do-

mains; there is a very elegant theory for their estimation (33).

Complex Systems Science in Biomedicine

Search WWH ::

Custom Search

Home