[Figure: two panels, (a) and (b). Labels recoverable from the diagram: transaction data; simulation (phase 1 and phase 2); model of environment with $p^a_{ss'}$, $r^a_{ss'}$; static model of environment; RE algorithm; action-value function $q(s, a)$.]
Fig. 5.6 Diagram of both simulation types of Sects. 4.4 and 5.4.2
5.4.3 Experimental Results
Example 5.4 We start our virtual simulation with an artificial example before we
turn to a real-life data set.
To this end, we consider a small shop with only six products, numbered 1 to 6. We use the
reward 1 for clicks and 1 + pr if the product was added to the basket, where pr is
the price of the product. The product prices of our mini shop are listed in
Table 5.5.
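The reward rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the prices in PRICES and the session in example_session are hypothetical placeholders (the actual prices are those of Table 5.5), and the names reward and session_reward are introduced here for convenience.

```python
from typing import Dict, List, Tuple

# Hypothetical prices for products 1-6. The actual prices are listed in
# Table 5.5 and are not reproduced here.
PRICES: Dict[int, float] = {1: 2.0, 2: 5.0, 3: 1.5, 4: 10.0, 5: 4.0, 6: 7.5}


def reward(product: int, added_to_basket: bool,
           prices: Dict[int, float] = PRICES) -> float:
    """Reward 1 for a click, 1 + pr if the product was added to the basket."""
    return 1.0 + (prices[product] if added_to_basket else 0.0)


def session_reward(session: List[Tuple[int, bool]],
                   prices: Dict[int, float] = PRICES) -> float:
    """Sum the rewards of all (product, added_to_basket) events in a session."""
    return sum(reward(product, added, prices) for product, added in session)


# Hypothetical session: product 1 viewed, product 4 viewed and added to the
# basket, product 2 viewed.
example_session = [(1, False), (4, True), (2, False)]
print(session_reward(example_session))  # 1 + (1 + 10) + 1 = 13.0
```

With these placeholder prices, a plain click always yields 1, while a basket event on product 4 yields 1 + 10 = 11, so the reward is dominated by basket events on expensive products.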
Suppose there have been the following 4 sessions (star indicates that the product
has been added to the basket after it was viewed):
 