adaptive multivariate regression splines, neural networks, support vector machines,
and regularization networks. Interestingly, the latter techniques can be interpreted
in the framework of regularization networks [GJP95]. With these techniques it is
possible to treat quite high-dimensional problems, but the amount of data is limited
for complexity reasons. The situation is reversed in many practical applications,
such as the recommendation applications presented here, where the dimension of the
resulting problem is moderate but the amount of data is usually huge. There is
therefore a strong need for methods that can also be applied in this situation.
We will see that sparse grids can cope with the complexity of the problem,
at least to some extent. Moreover, sparse grids are perfectly suited for data
adaptivity and show many other advantages. They represent a very modern
approach to scoring.
7.2 The Sparse Grid Approach
Classification of data can be interpreted as a traditional scattered data approximation
problem with certain additional regularization terms. In contrast to conventional
scattered data approximation applications, we now encounter quite high-dimensional
spaces. To this end, the approach of regularization networks [GJP95]
provides a good framework. This approach allows a direct description of the most
popular neural networks, and it also allows for an equivalent description of support
vector machines and n-term approximation schemes [EPP00, Gir98].
We start with the scoring problem for a given data set S described at the
beginning of this chapter. This is clearly an ill-posed problem since there are
infinitely many solutions possible. To get a well-posed, uniquely solvable problem,
we have to assume further knowledge on f . To this end, regularization theory
[TA77, Wah90] imposes an additional smoothness constraint on the solution of
the approximation problem, and the regularization network approach considers the
variation problem
$$\min_{f \in V} R(f), \quad \text{with} \quad R(f) = \frac{1}{M} \sum_{i=1}^{M} C\big(f(x_i), y_i\big) + \lambda \Phi(f). \qquad (7.1)$$
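To make the structure of (7.1) concrete, the following sketch instantiates it for the simplest case: a linear model f(x) = w·x, the squared-error cost C(f(x_i), y_i) = (f(x_i) − y_i)², and the ridge-type smoothness functional Φ(f) = ||w||². This choice is only an illustration (the function names and data here are not from the text); with these choices the minimizer has a closed form via the normal equations.

```python
import numpy as np

def fit_regularized(X, y, lam):
    """Minimize R(f) = (1/M) * sum_i (w.x_i - y_i)^2 + lam * ||w||^2
    for a linear model f(x) = w.x (squared-error cost, ridge penalty).
    Closed form via normal equations: (X^T X / M + lam*I) w = X^T y / M."""
    M, d = X.shape
    A = X.T @ X / M + lam * np.eye(d)
    b = X.T @ y / M
    return np.linalg.solve(A, b)

# Illustrative data: a noisy linear target (purely synthetic).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.01 * rng.normal(size=200)

w_small = fit_regularized(X, y, lam=1e-6)  # weak regularization: close to least squares
w_large = fit_regularized(X, y, lam=10.0)  # strong regularization: shrunken weights
```

Increasing λ trades data fidelity (the first term) for smoothness (the second term): the weights are pulled toward zero, which is exactly the balancing role of the regularization parameter described below.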
Here, C(·,·) denotes an error cost function which measures the interpolation error,
and Φ(f) is a smoothness functional which must be well defined for f ∈ V. The first
term enforces closeness of f to the data, the second term enforces smoothness of f,
and the regularization parameter λ balances between these two terms. Typical
examples are