Fig. 6.7 Illustration of the update logic of the preconditioner C_1. [Figure: states s_0 to s_4 and actions a_0 to a_3, with rules including (s_0, a_0), (s_0, a_1), (s_2, a_0) and (s_3, a_3). G: s_0 and associated states. H: a_0 and associated actions.]
This results from the reflexive coarse grid action y_1 of the group y_1 on itself.
Of course the reflexive update for x_1 is especially strong here.
An update via the action x_1 updates the action x_1 itself and also the action x_2:
$$\tilde{x}_1 = \left(1 + \tfrac{1}{2}\right) x_1, \qquad \tilde{x}_2 = \tfrac{1}{2}\, x_1.$$
The coarse grid action y_1 of the group y_1 on y_2 is responsible for this.
Figure 6.7 illustrates the general logic of the updates using the example of the
action (s_0, a_0).
An update of the rule (s_0, a_0) therefore updates not only the rule itself but
also all rules that lead from the state group G of the initial product s_0 into
the action group H of the recommended product a_0.
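The group-wide update described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the book's implementation: the function name `propagate_update`, the dictionaries `state_group` and `action_group`, and the weight `coarse_weight` are all hypothetical. Each existing rule in the block G x H receives a weighted share of the correction, and the rule (s_0, a_0) additionally receives the direct update.

```python
from itertools import product


def propagate_update(q, state_group, action_group, s0, a0, delta, coarse_weight=0.5):
    """Update rule (s0, a0) and all existing rules in its group block.

    q            -- dict mapping (state, action) pairs to action values
    state_group  -- dict mapping each state to a group label (assumed given)
    action_group -- dict mapping each action to a group label (assumed given)
    delta        -- correction computed for the rule (s0, a0)
    coarse_weight -- assumed share of delta passed on to the whole block
    """
    g = state_group[s0]
    h = action_group[a0]
    states = [s for s, grp in state_group.items() if grp == g]
    actions = [a for a, grp in action_group.items() if grp == h]

    touched = []
    for s, a in product(states, actions):
        if (s, a) in q:  # only rules that already exist are updated here
            q[(s, a)] += coarse_weight * delta
            touched.append((s, a))

    q[(s0, a0)] += delta  # direct update of the rule itself
    return touched
```

With `coarse_weight=0.5`, the rule (s_0, a_0) changes by 1.5 times `delta` and every other existing rule in the block by 0.5 times `delta`, so the update of the rule itself is the strongest one.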
From a technical point of view, there is another positive aspect: when the
preconditioner C_1 updates an action value for a state-action pair (s, a) for
which no rule exists yet, the rule can be generated automatically. In this way
the hierarchical preconditioner automatically generates new recommendations for
products that have none because their transaction history is missing or too short.
We will return to this subject in the next section.
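A minimal sketch of this automatic rule generation, again under assumptions: the rule store is taken to be a plain dictionary, and the function name `update_with_rule_generation`, the parameter `initial_value`, and the uniform correction `delta` are hypothetical choices made for illustration only.

```python
def update_with_rule_generation(rules, block, delta, initial_value=0.0):
    """Apply a correction to every pair in a group block, creating missing rules.

    rules         -- dict mapping (state, action) pairs to action values
    block         -- iterable of (state, action) pairs reached by the update
    delta         -- correction applied to each pair (assumed uniform here)
    initial_value -- assumed starting value for a newly generated rule
    """
    created = []
    for pair in block:
        if pair not in rules:
            # No rule exists yet for this pair: generate it on the fly.
            rules[pair] = initial_value
            created.append(pair)
        rules[pair] += delta
    return created
```

The returned list names the rules that were newly generated, that is, the recommendations that now exist for products which previously had none.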
6.3 Learning on Category Level
So far we have considered hierarchical RL only as a means of accelerating
convergence. However, as we mentioned at the end of the last
section, it can also be used for a further task: raising the recommendation coverage.