Exploiting Diversification in Gossip-Based Recommendation - Data Management in Cloud, Grid and P2P Systems

Database Reference

In-Depth Information

The reason why recall remains limited is because using relevance as the clus-

tering score introduces a significant amount of redundant user profiles in U-Net .

As a result, when a query is submitted, since many user profiles in U-Net are

quite similar ( i.e. redundant), and these users are chosen to provide recommen-

dations to answer the query, the probability of retrieving the same set of relevant

items increases and recall results remain low. In Information Retrieval ,useful-

ness is used as a way to overcome redundancy between the items of a result

list by combining relevance with diversity [8, 9]. In our context, we claim that

usefulness can be used when clustering user profiles in U-Net , instead of just

relevance. This way, a more diverse set of results will be returned from different

users and the recall would be enhanced.

In this paper, we propose a gossip-based search and recommendation approach

based on a new clustering score, called usefulness, that combines relevance and

diversity. As we show experimentally, this new score is able to increase signif-

icantly the quality of the recommendations returned by the system. However,

existing peer-to-peer clustering algorithms are no longer suitable since they are

optimized for relevance only. Therefore we also propose a new clustering algo-

rithm especially conceived for usefulness.

In summary, we make the following contributions:

1. We show that usefulness is a good way to increase recall and that it should

be expressed as a known probabilistic diversification score [8, 9].

2. We propose a clustering algorithm that maintains a useful U-Net over a

gossip overlay using the usefulness score.

3. We validate our approach with an experimental evaluation using three differ-

ent datasets: MovieLens , Flickr and LastFM . We observe that diversification

enables a huge increase of recall regardless of the relevance score used. Com-

pared with state of the art solutions, we obtain an excellent gain with recall

results up to three times better when using the notion of usefulness.

This paper is organized as follows. Section 2 provides some basic concepts

and gives the problem definition. In Section 3, we describe our new clustering

score and present in details the new clustering algorithm that maintains U-Net .

In Section 4, we provide an experimental evaluation. In Section 5, we compare

our contributions with related work. Finally, Section 6 concludes and provides

directions for future work.

2 Basic Concepts and Problem Definition

In this section, we introduce the background necessary to understand the prob-

lem we address. In our distributed search and recommendation approach, when-

ever a user u submits a query q , the system sends q to a subset of users that we

call U-Net , and who will return their relevant results to u and will also recur-

sively forward the query to the users in their U-Net until the TTL is reached.

To build U-Net , we use a two steps approach. First, based on random gossiping

each user u is aware of other peers available on the network. Second, by means

of a clustering algorithm, u chooses among these users the best ones to answer

u 's queries and keep them in U-Net .

Data Management in Cloud, Grid and P2P Systems

Search WWH ::

Custom Search

Home