Conversely, it may occur, though rather seldom in practice, that the response to
the recommendation $a$ leads to a decrease in the associated transition probability
(for instance, by cannibalizing multiple product recommendations). We then have
$p^{a}_{ss_a} \ll p_{ss_a}$ and hence $c(s,a) \gg 1$: the unconditional action value
would be extremely strongly weighted, an apparently absurd effect. However, it should be
noted here that because of the relationship $\sum_{s' \neq s_a} p_{ss'} = 1 - p_{ss_a}$,
a large $p_{ss_a}$ leads to a small unconditional action value, and we must perform a
limit value consideration here. We will explore this in more depth in the course of the
special cases of (5.6).
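The interplay between a large weight and a small unconditional action value can be checked numerically. The following is a minimal sketch, assuming that the unconditional term in (5.6) has the form $c(s,a) \sum_{s' \neq s_a} p_{ss'} r_{ss'}$ with $c(s,a) = (1 - p^{a}_{ss_a}) / (1 - p_{ss_a})$, and that all rewards are constant. The function name and the probability values are hypothetical and serve only as an illustration.

```python
def weighted_unconditional_term(p_cond, p_uncond, reward=1.0):
    """Return (c, c * sum_{s' != s_a} p_ss' * r_ss') for a constant reward.

    p_cond   -- conditional probability p^a_{ss_a} (assumed value)
    p_uncond -- unconditional probability p_{ss_a} (assumed value)
    """
    c = (1.0 - p_cond) / (1.0 - p_uncond)
    # With constant rewards the remaining sum is (1 - p_{ss_a}) * reward.
    remaining = (1.0 - p_uncond) * reward
    return c, c * remaining


# Hypothetical case: the recommendation strongly lowers the transition probability.
p_cond = 0.05
for p_uncond in (0.5, 0.9, 0.99, 0.999):
    c, term = weighted_unconditional_term(p_cond, p_uncond)
    print(f"p_ss_a={p_uncond:<6} c(s,a)={c:10.2f} weighted term={term:.4f}")
```

Under these assumptions the factor $c(s,a)$ grows without bound as $p_{ss_a}$ approaches 1, while the weighted term itself stays at $(1 - p^{a}_{ss_a})$ times the reward.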
For a quantitatively better understanding of ( 5.6 ), let us consider for the product
s the difference between the action values of two recommendations a and b:
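\[
\begin{aligned}
q^{\pi}(s,a) - q^{\pi}(s,b)
&= p^{a}_{ss_a} r_{ss_a} + c(s,a) \sum_{s' \neq s_a} p_{ss'} r_{ss'}
 - p^{b}_{ss_b} r_{ss_b} - c(s,b) \sum_{s' \neq s_b} p_{ss'} r_{ss'} \\
&= p^{a}_{ss_a} r_{ss_a} - p^{b}_{ss_b} r_{ss_b}
 + c(s,a)\, p_{ss_b} r_{ss_b} - c(s,b)\, p_{ss_a} r_{ss_a} \\
&\quad + c(s,a) \sum_{s' \neq s_a, s_b} p_{ss'} r_{ss'}
 - c(s,b) \sum_{s' \neq s_a, s_b} p_{ss'} r_{ss'} \\
&= \left[ p^{a}_{ss_a} - c(s,b)\, p_{ss_a} \right] r_{ss_a}
 - \left[ p^{b}_{ss_b} - c(s,a)\, p_{ss_b} \right] r_{ss_b}
 + \left[ c(s,a) - c(s,b) \right] \sum_{s' \neq s_a, s_b} p_{ss'} r_{ss'} .
\end{aligned}
\]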
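By preliminary use of the estimate $c(s,a) = \frac{1 - p^{a}_{ss_a}}{1 - p_{ss_a}} \approx 1$
and similarly $c(s,b) \approx 1$, we obtain

\[
q^{\pi}(s,a) - q^{\pi}(s,b) \approx
\underbrace{\left[ p^{a}_{ss_a} - p_{ss_a} \right]}_{\Delta p_a}
\underbrace{r_{ss_a}}_{r_a}
- \underbrace{\left[ p^{b}_{ss_b} - p_{ss_b} \right]}_{\Delta p_b}
\underbrace{r_{ss_b}}_{r_b}
= \Delta p_a r_a - \Delta p_b r_b . \tag{5.7}
\]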
If, for the sake of simplicity, we initially set all rewards to 1, we have
\[
q^{\pi}(s,a) - q^{\pi}(s,b) \approx \Delta p_a - \Delta p_b .
\]
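To get a feeling for how good the estimate $c(s,a) \approx 1$ is, the exact difference of the action values can be compared numerically with the approximation $\Delta p_a r_a - \Delta p_b r_b$ from (5.7). The following is a minimal sketch, assuming the action value has the form $q^{\pi}(s,a) = p^{a}_{ss_a} r_{ss_a} + c(s,a) \sum_{s' \neq s_a} p_{ss'} r_{ss'}$ used in the derivation above. All state names, probabilities, and rewards are hypothetical values chosen purely for illustration.

```python
def c_factor(p_cond, p_uncond):
    """c(s,a) = (1 - p^a_{ss_a}) / (1 - p_{ss_a})."""
    return (1.0 - p_cond) / (1.0 - p_uncond)


def action_value(target, p_cond, p, r):
    """q(s,a) = p^a_{ss_a} r_{ss_a} + c(s,a) * sum_{s' != s_a} p_{ss'} r_{ss'}."""
    c = c_factor(p_cond, p[target])
    rest = sum(p[k] * r[k] for k in p if k != target)
    return p_cond * r[target] + c * rest


# Hypothetical unconditional transition probabilities from product s (sum to 1)
p = {"s_a": 0.02, "s_b": 0.03, "s_c": 0.05, "s_d": 0.90}
# Hypothetical transition rewards; s_d stands for a transition without reward
r = {"s_a": 10.0, "s_b": 8.0, "s_c": 5.0, "s_d": 0.0}
p_cond_a = 0.05  # assumed p^a_{ss_a}, probability when a is recommended
p_cond_b = 0.04  # assumed p^b_{ss_b}, probability when b is recommended

# Exact difference of the two action values
exact = action_value("s_a", p_cond_a, p, r) - action_value("s_b", p_cond_b, p, r)

# Approximation (5.7): Delta p_a * r_a - Delta p_b * r_b
delta_p_a = p_cond_a - p["s_a"]
delta_p_b = p_cond_b - p["s_b"]
approx = delta_p_a * r["s_a"] - delta_p_b * r["s_b"]

print(f"exact difference : {exact:.4f}")
print(f"approximation    : {approx:.4f}")
```

For these illustrative numbers both factors $c(s,a)$ and $c(s,b)$ lie close to 1, so the two values agree in sign and differ only slightly. The gap widens as the transition probabilities involved become larger.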
Since we can generally assume that for a product $s$ the probability of a product
transition to a product $s_y$ is higher if $y$ is recommended, we have
$p^{y}_{ss_y} > p_{ss_y}$,
and we obtain the following interpretation. The recommendation $a$ is then certainly
better than the recommendation $b$ if the difference $\Delta p_a$ between the transition
probabilities increased by the product recommendation is greater than the similar
difference $\Delta p_b$ for the recommendation $b$. Instead therefore of making recommendations
$a$ with the highest transition probability $p_a$ ($:= p_{ss_a}$) as in classical data mining, the
recommendations $a$ that are made are those with the highest difference between the