probability, we adopt the naïve Bayes assumption. We assume that the item attributes
in $A'$, for example, category and cast, are conditionally independent of each other
given the rating. Applying Bayes' rule under this assumption, we have
$$
\Pr(R_U = k \mid A' = a'_I)
= \frac{\Pr(R_U = k)\,\Pr(A'_1, A'_2, \ldots, A'_n \mid R_U = k)}{\Pr(A'_1, A'_2, \ldots, A'_n)}
= \frac{\Pr(R_U = k)\,\prod_{j=1}^{n} \Pr(A'_j \mid R_U = k)}{\Pr(A'_1, A'_2, \ldots, A'_n)},
\qquad A' = \{A'_1, A'_2, \ldots, A'_n\},
\tag{4.4}
$$
where $\Pr(A'_1, A'_2, \ldots, A'_n)$ can be treated as a normalizing constant,
$\Pr(R_U = k)$ is the prior probability that $U$ gives a rating $k$, and
$\Pr(A'_j \mid R_U = k)$ is the conditional probability that each item attribute
$A'_j$ in $A'$ has a value $a'_j$ given $U$ rated $k$; for example,
$\Pr(\text{movie type} = \text{drama} \mid R_U = 4)$. The last two probabilities can be
estimated from counting the review ratings of the target user $U$. Specifically,
$$
\Pr(R_U = k) = \frac{|I(R_U = k)| + 1}{|I(U)| + n},
\tag{4.5}
$$
and
$$
\Pr(A'_j = a'_j \mid R_U = k) = \frac{|I(A'_j = a'_j,\, R_U = k)| + 1}{|I(R_U = k)| + m},
\tag{4.6}
$$
where $|I(U)|$ is the number of reviews of user $U$ in the training set, $|I(R_U = k)|$ is
the number of reviews to which user $U$ gives a rating value $k$, and
$|I(A'_j = a'_j,\, R_U = k)|$ is the number of reviews to which $U$ gives a rating value $k$
while attribute $A'_j$ of the corresponding target item has a value $a'_j$. Notice that we
add an extra 1 to the numerators in both equations, and add $n$, the range of review
ratings (the number of possible rating values), to the denominator in (4.5), and $m$,
the range of $A'_j$'s values (the number of values $A'_j$ can take), to the denominator in (4.6).
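As a minimal sketch of how (4.5) and (4.6) could be computed by counting, the following Python assumes each training review of user U is a dictionary with a `rating` and an `attrs` mapping. That data layout, the function names, and the sample reviews are illustrative assumptions, not part of the method as described.

```python
def laplace_prior(reviews, k, num_rating_values):
    """Estimate Pr(R_U = k) as in (4.5): (|I(R_U = k)| + 1) / (|I(U)| + n)."""
    count_k = sum(1 for r in reviews if r["rating"] == k)
    return (count_k + 1) / (len(reviews) + num_rating_values)


def laplace_conditional(reviews, attr, value, k, num_attr_values):
    """Estimate Pr(A'_j = a'_j | R_U = k) as in (4.6)."""
    # Reviews to which the user gave rating k: I(R_U = k).
    rated_k = [r for r in reviews if r["rating"] == k]
    # Of those, reviews whose item has the given attribute value.
    count_match = sum(1 for r in rated_k if r["attrs"].get(attr) == value)
    return (count_match + 1) / (len(rated_k) + num_attr_values)


# Hypothetical training reviews of one user U (illustrative data only).
reviews = [
    {"rating": 5, "attrs": {"genre": "drama"}},
    {"rating": 5, "attrs": {"genre": "drama"}},
    {"rating": 4, "attrs": {"genre": "comedy"}},
    {"rating": 2, "attrs": {"genre": "action"}},
]

# Assumed five-point rating scale (n = 5) and three possible genres (m = 3).
prior_5 = laplace_prior(reviews, k=5, num_rating_values=5)
cond_drama_5 = laplace_conditional(reviews, "genre", "drama", k=5, num_attr_values=3)
```

With this sample, the smoothed priors over the five rating values still sum to 1, and a rating the user has never given (for example, 1) receives a small nonzero probability rather than 0.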
The add-one adjustment in (4.5) and (4.6) is also known as the Laplace estimate, a
well-known technique for estimating probabilities [21], especially from a small number
of training samples. With the Laplace estimate, “strong” probabilities such as 0 or 1,
which direct counting can produce, are avoided.
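To illustrate with hypothetical numbers, suppose a user has only three training reviews, all rated 5, on an assumed five-point scale ($n = 5$). Direct counting would give $\Pr(R_U = 5) = 3/3 = 1$ and $\Pr(R_U = k) = 0$ for every other rating, whereas (4.5) gives

$$
\Pr(R_U = 5) = \frac{3 + 1}{3 + 5} = 0.5,
\qquad
\Pr(R_U = k) = \frac{0 + 1}{3 + 5} = 0.125 \quad \text{for } k \in \{1, 2, 3, 4\}.
$$

The smoothed values still sum to 1 ($0.5 + 4 \times 0.125$), but no rating is ruled out on the basis of three reviews.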
Moreover, in some cases when item attributes are not available, we can approximate
$\Pr(R_U = k \mid A' = a'_I)$ by the prior probability $\Pr(R_U = k)$. Even though
$\Pr(R_U = k)$ does not contain information specific to certain item attributes, it does
take into account $U$'s general rating preference; for example, if $U$ is a generous
person, $U$ gives high ratings regardless of the items.
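The combination in (4.4), together with the fallback to the prior when attributes are missing, might be sketched as follows. The function name, the dictionary-based inputs, and the probability values are assumptions made for illustration. The sketch also assumes that `prior` covers every possible rating value, so that dividing by the sum of the numerators plays the role of the normalizing constant in (4.4).

```python
from math import prod


def rating_posterior(prior, likelihood, item_attrs=None):
    """Return Pr(R_U = k | A' = a'_I) for every rating k in `prior`.

    prior:      dict mapping rating k -> Pr(R_U = k)
    likelihood: dict mapping (attr, value, k) -> Pr(A'_j = a'_j | R_U = k)
    item_attrs: dict mapping attr -> value, or None/empty if unavailable
    """
    if not item_attrs:
        # No item attributes available: approximate by the prior.
        return dict(prior)
    # Numerator of (4.4): prior times the product of per-attribute conditionals.
    scores = {
        k: p * prod(likelihood[(a, v, k)] for a, v in item_attrs.items())
        for k, p in prior.items()
    }
    # The denominator of (4.4) is the same for all k, so normalize by the sum.
    total = sum(scores.values())
    return {k: s / total for k, s in scores.items()}


# Hypothetical probabilities on a two-value rating scale (illustrative only).
prior = {4: 0.4, 5: 0.6}
likelihood = {("genre", "drama", 4): 0.2, ("genre", "drama", 5): 0.6}

with_attrs = rating_posterior(prior, likelihood, {"genre": "drama"})
without_attrs = rating_posterior(prior, likelihood, None)
```

Here `without_attrs` simply reproduces the prior, reflecting only the user's general rating tendency, while `with_attrs` shifts more probability toward rating 5 because drama is more likely under that rating in the assumed numbers.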
4.4.2.2 Item Likability Inference Engine
$\Pr(R_I = k \mid A = a_U)$ captures the general likability of item $I$ from users like user $U$.
For example, from a reviewer who is similar to Angela (e.g., the same gender and
age), how likely is it that “Revolutionary Road” will receive a rating of 5? Similar