Decomposition in Transition: Adaptive Matrix Factorization - Realtime Data Mining

Database Reference

In-Depth Information

Table 8.3 Comparison of prediction qualities and error norms of SVD, LVP, and NMF with

variable rank

SVD

LVP

NMF

p 1

p 3

e F

p 1

p 3

e F

t 1

p 3

e F

1.98

4.47

4.64

0.77

2.59

4.73

1.88

4.28

4.64

2.51

5.61

4.50

2.43

5.31

4.59

2.22

6.37

4.51

5.35

9.55

3.44

199

5.14

9.54

3.74

5.14

8.56

3.52

100

5.75

10.06

2.73

378

5.69

10.00

3.13

5.50

9.22

2.98

200

6.27

10.69

1.88

757

6.12

10.49

2.19

6.09

10.31

2.36

185

500

6.35

10.32

0.74

5168

6.35

10.35

0.75

6.35

10.51

2.53

752

558 (full)

6.37

10.33

8.4.4 Experimental Results

In the following, we shall apply the above-described factorization procedures to

predict product transitions and evaluate the results. To be able to employ the

nonadaptive procedures from this chapter, we divide each of the data sets into a

training and a test set.

We shall first address factorization of the transition probabilities P . To this end,

we first compute the matrix P from the training data. Then, we factorize it and

subsequently use the factorized matrix P k to predict the considered products of the

test set. To do so, we traverse all sessions of the test set step by step and recommend

for each considered product s the product with the highest transition probability in

P k , i.e., arg max

s 0

pð ss 0 , which we compare with the immediate successor product.

Correspondingly, we also recommend the three strongest products to inquire the

case of multiple recommendations. This so corresponds to the Markov approach

from the previous chapters.

Example 8.6 We consider the shop from Example 8.5, while taking all transactions

and its entire product assortment (2671 products) into account. The training and test

set both consist of 134,832 sessions.

For the factorization, we use each of the three methods presented in this section:

the truncated SVD ( 8.14 ), computed by means of the Lanczos Algorithm 8.2 with

Ritz projections; the Lanczos vector projection (LVP), i.e., Algorithm 8.2 with left-

projection approximation ( 8.25 ); and the NMF according to the ALS Algorithm 8.3.

The latter has always been carried out with 100 iterations.

The result is displayed in Table 8.3 .Here, k denotes the rank, p 1 and p 3 are

the rates of correct prediction for 1 and 3 recommendations, respectively, e F : ¼

kP P k k F the Frobenius norm of the approximation error, and t the computing time

(in seconds). The last row contains the results for usage of the complete matrix P .

The result is rather disappointing and may be summarized as follows: none of

the three approaches to factorization of transition probabilities is really sensible.

A ridiculously high rank is necessary to achieve a quality of prediction that is

comparable with that attained when using the nonfactorized matrix P .

Realtime Data Mining

Search WWH ::

Custom Search

Home