Breaking Dimensions: Adaptive Scoring with Sparse Grids - Realtime Data Mining

where x_i represents the data points in the attribute space and y_i the target attribute.

Assume now that these data have been obtained by sampling of an unknown

function f which belongs to some function space V defined over

ℝ^d. The sampling

process was disturbed by noise. The aim is now to recover the function f from the

given data as faithfully as possible. We distinguish between classification , where

the target values y_i are from a discrete set of classes, e.g., from {-1, +1} for binary

classification, and regression, where the y_i are from a continuous spectrum. In what

follows we mainly focus on classification having in mind that sparse grids can be

used for regression, too [Gar06, Gar11]. In classification the function f is also called

classifier .

Scoring is increasingly used for personalization and may also be applied to

recommendations. An advantage of scoring is that we can include many attributes

characterizing the user behavior in x_i. These may be user-centric attributes like age

and gender, transactional attributes like number of clicks or revenue, and many

other attribute types like time, channel, or even weather. The disadvantage of

scoring is the limited number of single attribute values it can handle in general.

This renders a direct application of scoring for recommendations of many products

virtually impossible.

There are different approaches to scoring-based recommendations. The simplest

is to use the recommendations as the target attribute, i.e., each recommended

product corresponds to a target class. A more sophisticated approach is to use the

success of the session (revenue or, in the case of classification, an indicator of an order in the

session) as the target attribute and the recommendation as a special set of control

attributes. Thus, in each recommendation step, we select the control attributes to

maximize f(x). (Note that depending on the function class of the classifier f, this

may result in a complex optimization problem. But this is not the main task of

scoring and hence will not be considered here.)
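For a small, finite set of candidate recommendations, the selection step can be sketched by simply scoring every candidate and keeping the best one. The following Python sketch is illustrative only: `select_control`, `toy_score`, the attribute names, and the product identifiers are hypothetical and stand in for a trained classifier f and real attributes. It does not address the harder optimization problems mentioned in the note above.

```python
from typing import Any, Callable, Dict, Sequence, Tuple

Attributes = Dict[str, Any]


def select_control(
    f: Callable[[Attributes], float],
    state: Attributes,
    control_name: str,
    candidates: Sequence[str],
) -> Tuple[str, float]:
    """Score every candidate control value and return the best one with its score."""
    if not candidates:
        raise ValueError("at least one candidate is required")
    best_value, best_score = candidates[0], float("-inf")
    for value in candidates:
        x = dict(state)          # copy, so the caller's state is not modified
        x[control_name] = value  # plug the candidate into the control attribute
        score = f(x)
        if score > best_score:
            best_value, best_score = value, score
    return best_value, best_score


def toy_score(x: Attributes) -> float:
    """Hypothetical stand-in for a trained scoring function f."""
    bonus = {"p1": 0.1, "p2": 0.3, "p3": 0.2}[x["recommendation"]]
    return 0.05 * x["basket"] + bonus


state = {"clicks": 3, "basket": 1}
best, score = select_control(toy_score, state, "recommendation", ["p1", "p2", "p3"])
print(best, round(score, 2))
```

With this toy scorer the call selects `p2`. Exhaustive enumeration is only reasonable when the number of candidate control values is small.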

Example 7.1 Consider a small web shop. Suppose we need to select one of

three on-site banners on each category and product page. Therefore, the banners

represent the control attribute and thus the recommendations. We further assume

that in each step of the session (product or category page view), a user is characterized

by four attributes: age, gender, number of clicks in the current session, and how

many products are already in her/his basket. The target attribute is 0 if no order was

placed within the session and 1 if something was ordered.

Table 7.1 shows three sample sessions. In the first step of session A, the user is

considered to be unknown and hence his/her user-specific attributes age and gender

have missing values (represented by the character “?”). In the first step, the banner b1

was recommended to him/her. We know from history that he/she bought nothing

in this session, so the target attribute is 0 in all steps of the session. The second step

is very similar to the first one except that banner b3 was recommended. In the third

step, he/she added a product to his/her basket. In the fourth step, he/she signed in to

the shop and now his/her age and gender are considered to be known.

Session B represents a registered user, who was already recognized at the

beginning of the session, e.g., by a cookie. This user finally placed an order.
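One possible way to turn such sessions into training pairs (x_i, y_i) is sketched below in Python. The `Step` class and `label_session` helper are hypothetical names introduced for illustration. The rows follow the description of session A in the text, but the click counts, the banners in steps 3 and 4, and the age and gender in step 4 are assumed placeholder values, not taken from Table 7.1.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Step:
    """One page view of a session: user attributes plus the banner shown."""
    age: Optional[int]     # None encodes the missing value "?"
    gender: Optional[str]  # None encodes the missing value "?"
    clicks: int            # number of clicks so far in the session (assumed values)
    basket: int            # number of products currently in the basket
    banner: str            # control attribute: "b1", "b2", or "b3"


def label_session(steps: List[Step], ordered: bool) -> List[Tuple[Step, int]]:
    """Attach the session-level target (1 = order placed, 0 = none) to every step."""
    y = 1 if ordered else 0
    return [(step, y) for step in steps]


# Session A as described in the text; values marked "assumed" are not given there.
session_a = [
    Step(age=None, gender=None, clicks=1, basket=0, banner="b1"),
    Step(age=None, gender=None, clicks=2, basket=0, banner="b3"),
    Step(age=None, gender=None, clicks=3, basket=1, banner="b2"),  # banner assumed
    Step(age=30, gender="f", clicks=4, basket=1, banner="b1"),     # age, gender, banner assumed
]

# No order was placed in session A, so every step receives the target 0.
training_pairs = label_session(session_a, ordered=False)
```

Because the target is defined per session, each step of the session contributes one data point with the same target value.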
