Table 8.5 Comparison of prediction qualities and error norms for different regularization parameter values and with variable rank

             λ = 0.1                        λ = 0.01                       λ = 0.001
   k    p1    p3     e_F    e_Ω      p1    p3      e_F    e_Ω      p1    p3     e_F    e_Ω
   2   1.18  2.11   13.22   3.62    0.20   1.22   34.52   3.29    0.02  0.32   55.98   3.28
   5   2.66  4.83   11.78   3.33    0.96   3.66   29.77   2.93    2.05  3.66   41.52   2.89
  50   5.74  9.16    8.13   1.94    5.77   8.21   20.47   0.63    5.41  7.84   27.02   0.55
 100   6.13  9.75    7.32   1.80    6.29   9.84   17.62   0.28    6.15  9.12   21.82   0.12
 200   6.09  9.86    7.11   1.79    6.32  10.02   16.64   0.28    6.32  9.93   18.06   0.12
previous approach. On the other hand, we may also argue that there is too little statistical volume and that the user cannot view all products he or she is potentially interested in, simply because there are too many of them. This would favor the matrix completion approach. So there are pros and cons for both assumptions.
Example 8.8 We next repeat the test of Example 8.6 with the factorization according to formulation (8.37). Instead of a gradient descent algorithm, we used an ALS algorithm as described in [ZWSP08], which is more robust.
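To make the alternating scheme concrete, here is a minimal sketch of a regularized ALS solver of this kind. It assumes a dense numpy array P together with a boolean mask marking the given entries (i, j) ∈ Ω; the function name als_complete and all parameter names are illustrative and not taken from [ZWSP08].

```python
import numpy as np

def als_complete(P, mask, k, lam, n_iter=20, seed=0):
    """Regularized ALS for low-rank matrix completion (a sketch in the
    spirit of [ZWSP08]; names and defaults are our own assumptions).

    P    : (m, n) matrix with given entries
    mask : (m, n) boolean array, True on the given entries (i, j) in Omega
    k    : rank of the factorization P ~ L @ R.T
    lam  : regularization parameter lambda
    """
    m, n = P.shape
    rng = np.random.default_rng(seed)
    L = rng.standard_normal((m, k))
    R = rng.standard_normal((n, k))
    I = np.eye(k)
    for _ in range(n_iter):
        # Fix R and solve a small ridge-regression problem per row of L,
        # using only the given entries of that row.
        for i in range(m):
            idx = mask[i]
            Ri = R[idx]
            L[i] = np.linalg.solve(Ri.T @ Ri + lam * I, Ri.T @ P[i, idx])
        # Fix L and solve for each row of R analogously, per column of P.
        for j in range(n):
            idx = mask[:, j]
            Lj = L[idx]
            R[j] = np.linalg.solve(Lj.T @ Lj + lam * I, Lj.T @ P[idx, j])
    return L, R
```

Each subproblem is a k × k linear system, which is why ALS is more robust here than gradient descent: no step size has to be tuned.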
The results are contained in Table 8.5, whose structure basically corresponds to that of Table 8.3. Instead of the time, we have included the error norm e_Ω, which corresponds to the Frobenius norm e_F but is calculated only on the given entries (i, j) ∈ Ω. Thus, e_Ω is equal to the root-mean-square error (RMSE) multiplied by the square root of the number of given entries |Ω|. Additionally, we compare different values of the regularization parameter λ.
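The following hedged snippet, building on the hypothetical als_complete factors above, shows the difference between the two norms: e_F runs over all entries, e_Ω only over Ω.

```python
def error_norms(P, mask, L, R):
    """e_F over all entries; e_Omega only over the given entries,
    so that e_Omega = RMSE * sqrt(|Omega|)."""
    E = P - L @ R.T
    e_F = np.linalg.norm(E)            # Frobenius norm over all entries
    e_Omega = np.linalg.norm(E[mask])  # restricted to (i, j) in Omega
    rmse = e_Omega / np.sqrt(mask.sum())
    return e_F, e_Omega, rmse
```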
From Table 8.5 we see that for increasing rank the RMSE declines strongly, and we capture the given probabilities on Ω almost perfectly. A rank of 50-100 is already sufficient to bring the RMSE close to zero, and higher ranks do not improve the prediction rate significantly. Unlike in Table 8.3, where we needed almost full rank to drive the approximation error to zero, a moderate rank suffices here because we only have to approximate the given entries on Ω; the unknown entries remain free.
In contrast to Table 8.3, the overall error e_F decreases only slowly and remains very high. This is again because we do not approximate the zero values outside Ω. The prediction rate is comparable to that of Table 8.3. This indicates that for the probability matrix P, the approach of considering all non-visited entries to be zero is as reasonable as assuming them to be unknown.
The result of Example 8.8 does not mean that the matrix completion approach is outright useless for the recommendation engine task. In fact, it could be used, e.g., to complete the matrix of transactions or transitions before it is further processed.
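As a sketch of this idea, again assuming the hypothetical als_complete from above, one could complete a sparse transition-count matrix and only then normalize it row-wise into transition probabilities:

```python
# Illustrative only: counts and all choices (rank, lambda) are made up.
counts = np.array([[5., 0., 2.],
                   [0., 3., 0.],
                   [1., 0., 4.]])
mask = counts > 0                       # treat zeros as unknown, not as zero
L, R = als_complete(counts, mask, k=2, lam=0.01)
completed = np.clip(L @ R.T, 0.0, None)  # clip negatives before normalizing
probs = completed / completed.sum(axis=1, keepdims=True)
```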