Database Reference
In-Depth Information
The reason why recall remains limited is because using relevance as the clus-
tering score introduces a significant amount of redundant user profiles in U-Net .
As a result, when a query is submitted, since many user profiles in U-Net are
quite similar ( i.e. redundant), and these users are chosen to provide recommen-
dations to answer the query, the probability of retrieving the same set of relevant
items increases and recall results remain low. In Information Retrieval ,useful-
ness is used as a way to overcome redundancy between the items of a result
list by combining relevance with diversity [8, 9]. In our context, we claim that
usefulness can be used when clustering user profiles in U-Net , instead of just
relevance. This way, a more diverse set of results will be returned from different
users and the recall would be enhanced.
In this paper, we propose a gossip-based search and recommendation approach
based on a new clustering score, called usefulness, that combines relevance and
diversity. As we show experimentally, this new score is able to increase signif-
icantly the quality of the recommendations returned by the system. However,
existing peer-to-peer clustering algorithms are no longer suitable since they are
optimized for relevance only. Therefore we also propose a new clustering algo-
rithm especially conceived for usefulness.
In summary, we make the following contributions:
1. We show that usefulness is a good way to increase recall and that it should
be expressed as a known probabilistic diversification score [8, 9].
2. We propose a clustering algorithm that maintains a useful U-Net over a
gossip overlay using the usefulness score.
3. We validate our approach with an experimental evaluation using three differ-
ent datasets: MovieLens , Flickr and LastFM . We observe that diversification
enables a huge increase of recall regardless of the relevance score used. Com-
pared with state of the art solutions, we obtain an excellent gain with recall
results up to three times better when using the notion of usefulness.
This paper is organized as follows. Section 2 provides some basic concepts
and gives the problem definition. In Section 3, we describe our new clustering
score and present in details the new clustering algorithm that maintains U-Net .
In Section 4, we provide an experimental evaluation. In Section 5, we compare
our contributions with related work. Finally, Section 6 concludes and provides
directions for future work.
2 Basic Concepts and Problem Definition
In this section, we introduce the background necessary to understand the prob-
lem we address. In our distributed search and recommendation approach, when-
ever a user u submits a query q , the system sends q to a subset of users that we
call U-Net , and who will return their relevant results to u and will also recur-
sively forward the query to the users in their U-Net until the TTL is reached.
To build U-Net , we use a two steps approach. First, based on random gossiping
each user u is aware of other peers available on the network. Second, by means
of a clustering algorithm, u chooses among these users the best ones to answer
u 's queries and keep them in U-Net .
 
Search WWH ::




Custom Search