This stability result means that the less our model assumptions are violated, the
better our observed probabilities can be estimated. Similar reasoning applies to the
methods for the estimation of transition probabilities presented in Sect. 5.2.3.
4.3 Remarks on the Modeling
In what follows, we revisit the remarks from Sect. 3.9 with regard to the model
of recommendation engines devised above.
As we are dealing with recommendation engines with episodic tasks, the question
of the terminal state arises. We have already encountered it in Figs. 4.2 and
4.3. Including it is meaningful for two reasons. First, it has a clear
interpretation with regard to content: it assigns to each product the probability
that a user terminates the session afterward, i.e., leaves the shop. Second, it is
relevant for the computation of transition probabilities. According to (3.2),
these must sum to one, as P is stochastic. One could ignore the end of the session
when computing P, but this would distort the transition probabilities. Since the
latter are multiplied by the rewards in the Bellman equation (3.4), their actual
magnitude matters.
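The effect of dropping the terminal state can be illustrated with a minimal sketch. It assumes that transition probabilities are estimated as plain relative frequencies over observed sessions, which is only one of several possible estimators; the label `END`, the function `estimate_transitions`, and the toy sessions are hypothetical and serve illustration only.

```python
from collections import Counter, defaultdict

TERMINAL = "END"  # hypothetical label for the terminal state


def estimate_transitions(sessions, include_terminal=True):
    """Estimate p[s][s'] as relative frequencies of observed transitions."""
    counts = defaultdict(Counter)
    for session in sessions:
        # Optionally append the terminal state to mark the end of the session.
        path = list(session) + ([TERMINAL] if include_terminal else [])
        for s, s_next in zip(path, path[1:]):
            counts[s][s_next] += 1
    # Normalize each row so that it sums to one (P is stochastic).
    return {
        s: {s_next: n / sum(row.values()) for s_next, n in row.items()}
        for s, row in counts.items()
    }


# Toy data invented for illustration: three sessions over products A, B, C.
sessions = [["A", "B"], ["A"], ["A", "C"]]

with_end = estimate_transitions(sessions, include_terminal=True)
without_end = estimate_transitions(sessions, include_terminal=False)

print(with_end["A"])     # A -> B, A -> END, A -> C each observed once
print(without_end["A"])  # the session that ended after A is ignored
```

In this toy data, one of three visits to A ends the session. Without the terminal state, the remaining two transitions share the whole probability mass, so the estimate for A to B rises from 1/3 to 1/2 although the observations are unchanged.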
As opposed to all the other states, which represent products, there is, of course,
no action associated with the terminal state, since this would amount to
suggesting that the user leave the shop.
Another issue is whether it is meaningful to consider recommendations from
products to themselves, i.e., $p_{ss}$. This corresponds to the representation
of rules of the form $s \to s$. We remind the reader that a positive $p_{ss}$ is
a sufficient condition for primitivity of the matrix P (together with
irreducibility, which we shall address later on). At first glance, these rules do
not convey much information; they only signify that the user repeatedly calls up
the product, i.e., hits the refresh button. (This is different when we operate on
the level of categories as in Chap. 6.) On the other hand, they are, for the same
reasons as the terminal state, relevant to the computation of transition
probabilities. Hence, using these transitions internally is advisable. They must
not, however, serve as recommendations, as this would give rise to products
recommending themselves.
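The distinction between internal use and actual recommendation can be sketched as follows. This is a simplified illustration, not the selection rule of the recommendation engine itself: it ranks successors by transition probability alone, whereas the model in this text works with rewards and action values. The function `recommendation_candidates`, the label `END`, and the example row are hypothetical.

```python
TERMINAL = "END"  # hypothetical label for the terminal state


def recommendation_candidates(p, s, k=3):
    """Return up to k successors of s, ranked by transition probability.

    The self-transition s -> s and the terminal state stay in the row of P
    (so the row still sums to one) but are never offered as recommendations.
    """
    row = p.get(s, {})
    allowed = [(t, prob) for t, prob in row.items() if t != s and t != TERMINAL]
    allowed.sort(key=lambda item: item[1], reverse=True)
    return [t for t, _ in allowed[:k]]


# Example row invented for illustration; it includes p_AA and the terminal state.
p = {"A": {"A": 0.2, "B": 0.5, "C": 0.1, TERMINAL: 0.2}}

print(recommendation_candidates(p, "A"))  # ['B', 'C']
```

The row for A is left untouched, so the magnitudes of the remaining probabilities are not inflated by removing the self-transition or the session end; only the candidate list is filtered.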
Finally, let us turn to the question of irreducibility. In most practical
applications, it does not hold. In Chaps. 6, 7, 8, and 9 on hierarchical methods
and factorizations, we shall, however, deal with procedures that make it possible
to compute an almost unlimited number of recommendations for each product, i.e.,
transitions satisfying $p_{ss'} > 0$. This may easily be exploited to render P
irreducible. At the same time, reducibility may also have positive effects, since
it decomposes the global problem into uncoupled subproblems. An example is
Theorem 6.1 about the convergence of the multigrid method. Thus, the value of
irreducibility has to be assessed depending on the method used.
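Whether a given P is irreducible can be checked in a small sketch by testing strong connectivity of the directed graph that has an edge from s to s' whenever the corresponding entry of P is positive. The code below assumes P is given as a dense list of rows; the function names and both example matrices are hypothetical and chosen only for illustration.

```python
def reachable(adj, start):
    """Return the set of nodes reachable from start via depth-first search."""
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def is_irreducible(P):
    """P is irreducible iff its transition graph is strongly connected."""
    n = len(P)
    forward = [[j for j in range(n) if P[i][j] > 0] for i in range(n)]
    backward = [[i for i in range(n) if P[i][j] > 0] for j in range(n)]
    # Strongly connected: every node is reachable from node 0,
    # and node 0 is reachable from every node.
    return len(reachable(forward, 0)) == n and len(reachable(backward, 0)) == n


def has_self_loop(P):
    """True if some diagonal entry is positive (a cycle of length 1)."""
    return any(P[i][i] > 0 for i in range(len(P)))


# Two uncoupled blocks, {0, 1} and {2}: reducible.
P_reducible = [[0.5, 0.5, 0.0], [0.5, 0.5, 0.0], [0.0, 0.0, 1.0]]
# A cycle 0 -> 1 -> 2 -> 0 with self-loops: irreducible.
P_irreducible = [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.5, 0.0, 0.5]]

print(is_irreducible(P_reducible))    # False
print(is_irreducible(P_irreducible))  # True
```

For the second matrix, irreducibility together with a positive diagonal entry implies primitivity. The first matrix shows the decoupled case, in which each block could be treated as a separate subproblem.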
Let us summarize: it is reasonable to include the terminal state in the model of
the RE. The same holds, in general, for cycles of length 1. With special tools, it
is possible to ensure that P is irreducible. Thus, the essential conditions for
convergence of the TD algorithm are satisfied.