Table 7.3 Results for the Ripley data set

Level  Tenfold testing (%)  λ (tenfold)  On test data (%)  λ (best)  Best testing (%)
1      84.8                 0.01005      89.8              0.00370   90.3
2      85.2                 0.000001     90.4              0.00041   90.9
3      88.4                 0.00166      90.6              0.00370   91.2
4      87.6                 0.00248      90.6              0.01500   91.2
5      87.6                 0.01005      90.9              0.00673   91.1
6      86.4                 0.00673      90.8              0.00673   90.8
7      86.4                 0.00075      88.5              0.00673   91.0
8      88.0                 0.00166      89.7              0.00673   91.0
9      88.4                 0.00203      90.9              0.00823   91.0
10     88.4                 0.00166      90.6              0.00452   91.1
The tenfold testing correctness and the λ selected by cross-validation are given in the second and third columns of Table 7.3. With this λ, we then compute the sparse grid classifier from the 250 training data points. The column "On test data (%)" of Table 7.3 gives the result of this classifier on the (previously unseen) test data set. We see that our method works well: level 5 already suffices for a testing correctness of 90.9 %. We also see that there is little need to use any higher levels. The reason is surely the relative simplicity of the data: just a few hyperplanes should be enough to separate the classes properly, and the sparse grid achieves this already for a small level n.
Additionally, we give in Table 7.3 the testing correctness which is achieved for the best possible λ. To this end, we compute for all (discrete) values of λ the sparse grid classifier from the 250 training points, evaluate it on the test set, and pick the best result. We clearly see that there is not much of a difference, which indicates that our approach of determining the value of λ from the training set by cross-validation works well. Note that a testing correctness of 90.6 % was achieved with neural networks in [Rip94].
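To make this selection procedure concrete, the following is a minimal sketch in Python of choosing λ by tenfold cross-validation over a discrete grid of λ values. It is a sketch under assumptions, not the actual implementation: the feature map `phi` and the plain regularized least-squares fit are hypothetical stand-ins for the sparse grid classifier of a fixed level, and `make_data` only mimics the size and two-class Gaussian character of the Ripley set.

```python
import numpy as np

def phi(X):
    # Hypothetical feature map standing in for the sparse grid basis of a
    # fixed level: [1, x1, x2, x1*x2, x1^2, x2^2] for 2-d data.
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([np.ones(len(X)), x1, x2, x1 * x2, x1**2, x2**2])

def train(X, y, lam):
    # Regularized least squares: minimize ||Phi w - y||^2 + lam ||w||^2.
    P = phi(X)
    return np.linalg.solve(P.T @ P + lam * np.eye(P.shape[1]), P.T @ y)

def correctness(w, X, y):
    # Fraction of points whose predicted sign matches the +/-1 label.
    return np.mean(np.sign(phi(X) @ w) == y)

def tenfold_lambda(X, y, lambdas, folds=10):
    # Average the held-out correctness over ten folds for each lambda and
    # return the lambda with the best tenfold testing correctness.
    parts = np.array_split(np.random.permutation(len(X)), folds)
    def cv_score(lam):
        scores = []
        for k in range(folds):
            fit = np.concatenate([parts[j] for j in range(folds) if j != k])
            w = train(X[fit], y[fit], lam)
            scores.append(correctness(w, X[parts[k]], y[parts[k]]))
        return np.mean(scores)
    return max(lambdas, key=cv_score)

def make_data(rng, m):
    # Two-class Gaussian data, loosely in the spirit of the Ripley set.
    y = rng.choice([-1.0, 1.0], m)
    centers = np.where(y[:, None] < 0, [-0.3, 0.7], [0.4, 0.7])
    return centers + 0.25 * rng.normal(size=(m, 2)), y

rng = np.random.default_rng(0)
X_train, y_train = make_data(rng, 250)   # 250 training points, as for Ripley
X_test, y_test = make_data(rng, 1000)    # 1000 test points
best = tenfold_lambda(X_train, y_train, np.logspace(-6, -1, 20))
w = train(X_train, y_train, best)
print(f"lambda = {best:.2e}, test correctness = "
      f"{100 * correctness(w, X_test, y_test):.1f} %")
```

Only the selection loop corresponds to the procedure in the text: the averaged held-out correctness plays the role of the "Tenfold testing (%)" column, and evaluating the retrained classifier on the test set plays the role of the "On test data (%)" column of Table 7.3.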
7.3.2 High-Dimensional Problems
Example 7.4 The 10-dimensional data set ndcHArd consists of two million synthetically generated instances and was first used in [MM01]. Here, the main observations concern the run time.
In Table 7.4 we give the results using the combination technique with simplicial basis functions as described in Sect. 7.2.6. More than 50 % of the run time is spent on the assembly of the data matrix. The time needed for the data matrix scales linearly with the number of data points, and the total run time seems to scale even better than linearly. Already at level 1 we obtain 84.9 % testing correctness, and level 2 achieves no further improvement. Notice that with support vector machines, correctness rates of 69.5 % were reported in [FM01].
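For reference, the combination technique evaluated here (described in Sect. 7.2.6) sums partial solutions computed on a sequence of anisotropic coarse grids; in its standard form for dimension d and level n it reads

```latex
f_n^{c}(\mathbf{x}) \;=\; \sum_{q=0}^{d-1} (-1)^{q} \binom{d-1}{q}
\sum_{l_1 + \cdots + l_d \,=\, n + (d-1) - q} f_{l_1,\dots,l_d}(\mathbf{x}).
```

Each partial solution is a regularized least-squares fit on its own grid. With simplicial (piecewise linear) basis functions, every data point lies in exactly one simplex of each partial grid and therefore contributes to only d + 1 basis functions, which is one way to see why the data-matrix assembly scales linearly with the number of data points, as observed above.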