Exploiting Diversification in Gossip-Based Recommendation - Data Management in Cloud, Grid and P2P Systems

Database Reference

In-Depth Information

To evaluate the quality of search and recommendation, we use the recall mea-

sure [12]. Recall captures the fraction of items that have been successfully rec-

ommended: recall =

,where Iret q ∈ I refers to the relevant

items recommended with respect to a query q ,and Irel q ∈ I refers to all the

relevant items with respect to query q .

|Iret q | / |Irel q |

Problem Definition: Given a user u ∈ U ,aquery q , I in G , and a gossip based

overlay, the goal is to maximize the number of relevant items with respect to q

returned to u while minimizing TTL .

3 Diversified Clustering and Algorithm

In this section, we show that usefulness is an excellent way to increase recall

results of gossip-based recommendation, and can be used as a clustering score.

In section 3.1, we formally show that to increase recall, usefulness should take

into account relevance and diversity. Next, Section 3.2 presents the Useful U-Net

clustering algorithm deployed over a gossip-based overlay.

3.1 Usefulness Score

The usefulness score should be designed such that it maximizes the probability

that a user u can retrieve relevant items given a random query q ,knownas

the coverage probability. In other words, u 's neighbors v 1 , ..., v n ∈ G should be

chosen such that the number of relevant items (with respect to the queries u will

submit) that can be accessed through them is maximized.

Let Q be the set of all possible queries (all the combinations of terms), and

P ( Q v ) the probability that a user v can return at least one relevant item given a

random query q ∈ Q . In the following, we first define the coverage with respect

to U-Net u =

. Then, based on coverage, we express the usefulness of

auser v with respect to the other users in u 's U-Net .

{v 1 , ..., v n }

Definition 2 (Coverage). Given Q and U-Net u =

,theusersin u 's

U-Net. The coverage is the probability that at least one of the user in u 's U-Net

can return at least one item given a random query q ∈ Q . Coverage is denoted

P ( Q v 1 ∪ Q v 2 ∪ ... ∪ Q v n ) .

The user profiles v 1 , ..., v n must be selected such that the coverage probability is

maximized. Formula 3 develops the coverage probability with respect to every

user in u 's U-Net .

P ( Q v 1 ∪ ... ∪ Q v n )=

j∈ 1 ,...,n

{v 1 , ..., v n }

( P ( Q v j ) − P ( Q v j ∩ ( Q v j +1 ∪ ... ∪ Q v n )))

(3)

( Q v j +1 ∪ ... ∪ Q v n )) represents the coverage added by user

v j with respect to the users v j +1 , ..., v n . As a consequence, when j = n ,only

P ( Q v j ) is considered as there is no more user profiles to compare with.

In the following, we define the usefulness of a user profile v i with respect to

the coverage probability.

P ( Q v j )

− P ( Q v j ∩

Definition 3 (Usefulness). Given u 's U-Net, the usefulness of a user profile

v j is the probability that it can return relevant items for a random query q , that

Data Management in Cloud, Grid and P2P Systems

Search WWH ::

Custom Search

Home