probability, we adopt the naïve Bayes assumption. We assume that the item attributes in $A'$, for example, category and cast, are independent of each other given the rating. Under this assumption, we have
$$
\Pr(R_U = k \mid A' = a'_I)
= \frac{\Pr(R_U = k)\,\Pr(A'_1, A'_2, \ldots, A'_n \mid R_U = k)}{\Pr(A'_1, A'_2, \ldots, A'_n)}
= \frac{\Pr(R_U = k)\,\prod_{j=1}^{n} \Pr(A'_j \mid R_U = k)}{\Pr(A'_1, A'_2, \ldots, A'_n)},
\quad A' = \{A'_1, A'_2, \ldots, A'_n\},
\tag{4.4}
$$
where $\Pr(A'_1, A'_2, \ldots, A'_n)$ can be treated as a normalizing constant, $\Pr(R_U = k)$ is the prior probability that U gives a rating k, and $\Pr(A'_j \mid R_U = k)$ is the conditional probability that each item attribute $A'_j$ in $A'$ has a value $a'_j$ given U rated k; for example, Pr(movie type = drama | $R_U$ = 4). The last two probabilities can be estimated from counting the review ratings of the target user U. Specifically,
$$
\Pr(R_U = k) = \frac{|I(R_U = k)| + 1}{|I(U)| + n},
\tag{4.5}
$$
and
$$
\Pr(A'_j = a'_j \mid R_U = k) = \frac{|I(A'_j = a'_j,\ R_U = k)| + 1}{|I(R_U = k)| + m},
\tag{4.6}
$$
where $|I(U)|$ is the number of reviews of user U in the training set, $|I(R_U = k)|$ is the number of reviews to which user U gives a rating value k, and $|I(A'_j = a'_j,\ R_U = k)|$ is the number of reviews to which U gives a rating value k while attribute $A'_j$ of the corresponding target item has a value $a'_j$. Notice that we add 1 to the numerators in both equations, and add n, the range of review ratings, to the denominator in (4.5), and m, the range of $A'_j$'s values, to the denominator in (4.6). This method is known as the Laplace estimate, a well-known technique for estimating probabilities [21], especially from a small number of training samples. With the Laplace estimate, "strong" probabilities such as 0 or 1, which direct probability computation can produce, are avoided.
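As an illustration, the estimates in (4.5) and (4.6) and the posterior in (4.4) can be sketched in a few lines of Python. This is a minimal sketch, not the authors' implementation: the function names, the toy review list, and the assumed domain size of 3 movie types are all hypothetical. The normalizing constant in (4.4) is handled by dividing by the sum of the unnormalized scores over all rating values.

```python
def laplace_prior(ratings, k, n):
    """Eq. (4.5): (|I(R_U = k)| + 1) / (|I(U)| + n)."""
    count_k = sum(1 for r in ratings if r == k)
    return (count_k + 1) / (len(ratings) + n)


def laplace_conditional(reviews, attr, value, k, m):
    """Eq. (4.6): (|I(A_j = a_j, R_U = k)| + 1) / (|I(R_U = k)| + m)."""
    rated_k = [item for item, rating in reviews if rating == k]
    matches = sum(1 for item in rated_k if item.get(attr) == value)
    return (matches + 1) / (len(rated_k) + m)


def rating_posterior(reviews, target_item, rating_values, domain_sizes):
    """Eq. (4.4), normalized over k so the constant denominator drops out."""
    ratings = [rating for _, rating in reviews]
    scores = {}
    for k in rating_values:
        score = laplace_prior(ratings, k, len(rating_values))
        for attr, value in target_item.items():
            score *= laplace_conditional(reviews, attr, value, k, domain_sizes[attr])
        scores[k] = score
    total = sum(scores.values())
    return {k: s / total for k, s in scores.items()}


# Hypothetical review history of user U: (item attributes, rating given).
reviews = [
    ({"type": "drama"}, 5),
    ({"type": "drama"}, 4),
    ({"type": "comedy"}, 2),
    ({"type": "drama"}, 5),
]
rating_values = [1, 2, 3, 4, 5]
domain_sizes = {"type": 3}  # assumed: three possible movie types

posterior = rating_posterior(reviews, {"type": "drama"}, rating_values, domain_sizes)
print(max(posterior, key=posterior.get))  # prints 5
```

Note that ratings U has never given (1 and 3 in this toy history) still receive a nonzero probability, which is the effect of the Laplace estimate described above.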
Moreover, in some cases when item attributes are not available, we can approximate $\Pr(R_U = k \mid A' = a'_I)$ by the prior probability $\Pr(R_U = k)$. Even though $\Pr(R_U = k)$ does not contain information specific to certain item attributes, it does take into account U's general rating preference; for example, if U is a generous person, U gives high ratings regardless of the items.
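The fallback to the prior can be sketched as follows. This is a hedged illustration only: the function name `rating_distribution`, the `likelihoods` argument (per-rating lists of already estimated attribute probabilities), and the sample ratings are assumptions made for the example, not part of the original method description.

```python
def rating_distribution(ratings, rating_values, likelihoods=None):
    """Return Pr(R_U = k) for each k, refined by attribute likelihoods if given.

    likelihoods (hypothetical input): dict mapping each rating k to a list of
    Pr(A_j = a_j | R_U = k) values for the target item's attributes.
    """
    n = len(rating_values)
    # Laplace-smoothed prior over the user's past ratings.
    prior = {k: (ratings.count(k) + 1) / (len(ratings) + n) for k in rating_values}
    if not likelihoods:
        # No item attributes available: fall back to the prior alone.
        return prior
    scores = dict(prior)
    for k in rating_values:
        for p in likelihoods[k]:
            scores[k] *= p
    total = sum(scores.values())
    return {k: s / total for k, s in scores.items()}


# A hypothetical generous user who mostly gives high ratings.
generous_user = [5, 5, 4, 5]
fallback = rating_distribution(generous_user, [1, 2, 3, 4, 5])
print(max(fallback, key=fallback.get))  # prints 5
```

For this user the prior alone already favors a rating of 5, which mirrors the generous-reviewer example in the text.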
4.4.2.2 Item Likability Inference Engine
$\Pr(R_I = k \mid A = a_U)$ captures the general likability of item I from users like user U. For example, from a reviewer who is similar to Angela (e.g., the same gender and age), how likely is it that "Revolutionary Road" will receive a rating of 5? Similar