For practical applications, we can start with Algorithm 7.1 on historic data S and then apply Algorithm 7.3 on new data points ŝ. This is again a nice example of combining offline and online learning.
7.2.6 Further Sparse Grid Versions
There are many different versions and modifications of the sparse grid approach. We want to explain some of them that may be relevant for scoring and reinforcement learning.
7.2.6.1 Simplicial Basis Functions
So far we only mentioned d-linear basis functions based on a tensor-product approach. But on the grids of the combination technique, linear basis functions based on a simplicial discretization are also possible. Here, the so-called Kuhn triangulation of each rectangular block is used [Kuh60]. The summation of the discrete functions for the different spaces V_l in (7.27) then involves only linear interpolation.
The theoretical properties of this variant of the sparse grid technique still have to be investigated in more detail. However, the empirical results warrant its use: linear basis functions give results that are at most slightly worse than those obtained with d-linear basis functions, and we believe that this new approach achieves the same approximation order.
Since in the new variant of the combination technique the overlap of supports, i.e., the regions where two basis functions are both nonzero, is greatly reduced by the simplicial discretization, the complexities scale significantly better. With simplicial basis functions, sparse grid classification can be performed with up to 20-22 dimensions on a conventional PC. For details on simplicial basis functions for sparse grids, we refer to [GG01a].
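To make the complexity advantage concrete, the following Python sketch (ours, not taken from [GG01a]) performs linear interpolation on the Kuhn triangulation of the unit cube: sorting the coordinates identifies the simplex containing a point, and only d+1 vertex values enter the interpolation instead of the 2^d corner values needed by d-linear tensor-product interpolation. The callable f_vertex and all names are illustrative assumptions.

```python
import numpy as np

def kuhn_interpolate(x, f_vertex):
    """Linearly interpolate f at x in [0,1]^d using the Kuhn simplex
    containing x. Touches only d+1 cube vertices, not all 2^d corners."""
    x = np.asarray(x, dtype=float)
    d = x.size
    # Sorting the coordinates in decreasing order identifies which of the
    # d! Kuhn simplices of the unit cube contains x.
    order = np.argsort(-x)
    xs = x[order]
    # Barycentric weights along the vertex path 0 -> e_{p1} -> ... -> (1,...,1).
    lams = np.empty(d + 1)
    lams[0] = 1.0 - xs[0]
    lams[1:d] = xs[:d - 1] - xs[1:]
    lams[d] = xs[d - 1]
    # Walk the simplex vertices, switching one coordinate to 1 at a time.
    v = np.zeros(d, dtype=int)
    total = lams[0] * f_vertex(tuple(v))
    for k in range(d):
        v[order[k]] = 1
        total += lams[k + 1] * f_vertex(tuple(v))
    return total

# Example: affine functions are reproduced exactly.
f = lambda v: v[0] + 2 * v[1]
print(kuhn_interpolate([0.3, 0.7], f))  # -> 1.7 = 0.3 + 2 * 0.7
```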
7.2.6.2 Anisotropic Sparse Grids
Up to now we have treated all attributes of the classification problem the same, i.e., we used the same mesh refinement level for all attributes. Obviously, attributes have different properties, different numbers of distinct values, and different variances. For example, to discretize the range of a binary attribute, one needs no more than two grid points.
We can generalize our approach to account for such situations as well. We use different mesh sizes for each dimension along the lines of [GG98]. This results in a so-called anisotropic sparse grid. Now a different refinement level n_j for each dimension j, j = 1, ..., d, can be given instead of a single refinement level n for all dimensions.
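As a minimal sketch of this idea (our illustration, not the construction of [GG98]), the Python snippet below builds a full grid on [0,1]^d with a per-dimension refinement level n_j, i.e., mesh width h_j = 2^{-n_j}; the boundary-point convention and all names are assumptions made for the example. A binary attribute can then be given level 0, yielding just the two boundary points.

```python
import itertools

def anisotropic_grid(levels):
    """Return the points of a full grid on [0,1]^d with refinement level
    levels[j] (mesh width 2**-levels[j]) in dimension j, boundary included."""
    axes = [[i * 2.0**-n for i in range(2**n + 1)] for n in levels]
    return list(itertools.product(*axes))

# Level 3 for a continuous attribute, level 0 for a binary one:
grid = anisotropic_grid([3, 0])
print(len(grid))  # (2**3 + 1) * (2**0 + 1) = 18 points instead of 9 * 9 = 81
```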