are often willing to share their data with other users in a community of interest.
However, the fact that their data spaces are distributed in many different systems
makes data sharing especially difficult. For instance, an artist photographer who
wants to share her pictures within an online community of photographers may
have to log in to several different Web applications such as deviantArt, Facebook
or Flickr, each with a different interface and account. Similarly, a scientist who
needs to search for scientific datasets within an online community of scientists
will be faced with the problem that the relevant data is typically distributed
in many different labs' servers or scientists' local computers. Furthermore, since
this data is hidden from web crawlers, traditional search engines become useless.
In order to mitigate this problem, some Web applications allow grouping several
accounts and data from different systems (e.g., Facebook allows users to group
Dropbox and blogs under a single Facebook account). However, they are limited
to a few well-known systems.
In this context of large-scale distribution of users and data, a general solution
to data sharing is offered by distributed search and recommendation [1, 2]. In
this paper, we adopt a peer-to-peer gossip-based approach, because it provides
important properties such as scalability, dynamicity, autonomy and decentralized
control. Within an online community, each user u is associated with a virtual data
space that contains all the data items (stored in different systems) that u shares.
Given u and a keyword query q, the goal of our search and recommendation
approach is to recommend to u items that are relevant with respect to q and
that are shared by other users, regardless of the systems that store the items.
Then, a recommended item is simply a reference that can be used to retrieve
the actual data item. In other words, we combine search and recommendation
in the sense that a user u searches relevant items among those recommended by
users similar to u.
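To make the combination of search and recommendation concrete, the following is a minimal sketch and not the method developed in the paper. It assumes that each shared item is a reference plus a set of keywords, and that relevance to q is simply the number of matching query terms. The names `Item`, `recommend`, `u_net_items` and `top_n`, as well as the item references in the example, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    ref: str             # reference used to retrieve the actual data item
    keywords: frozenset  # terms describing the item (assumed representation)


def recommend(query, u_net_items, top_n=10):
    """Return references of items shared by similar users that match the query.

    u_net_items maps each similar user to the items that user shares.
    Relevance is assumed to be the number of query terms found in an item.
    """
    terms = set(query.lower().split())
    best = {}
    for items in u_net_items.values():
        for item in items:
            hits = len(terms & item.keywords)
            if hits:
                best[item.ref] = max(best.get(item.ref, 0), hits)
    # Highest match count first; ties broken by reference for determinism.
    ranked = sorted(best, key=lambda ref: (-best[ref], ref))
    return ranked[:top_n]


# Made-up example: items stored in different systems, shared by two users.
shared = {
    "v": [Item("flickr:photo-1", frozenset({"sunset", "beach"})),
          Item("dropbox:file-2", frozenset({"genome"}))],
    "w": [Item("deviantart:art-3", frozenset({"sunset"}))],
}
print(recommend("sunset beach", shared))
```

The result contains references only, so the system that stores each item does not matter to the ranking.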
Distributed search and recommendation has received considerable attention
[1-4]. However, achieving high recall remains an open problem. A
query is generally forwarded only to a subset of users, who
process it and return recommendations. To compute this subset of users,
many solutions cluster relevant user profiles implicitly using gossip protocols.
Gossip protocols are known to be highly resilient and scalable, and to converge quickly
[5], which makes them a good alternative for distributed search and recommendation.
A User Network (U-Net in the following) refers to the cluster of relevant
users that a user u becomes aware of through gossiping, selected using a score (e.g., the similarity between u
and the users in U-Net). At each gossip round, the most relevant users are kept
in U-Net. Since U-Net is used to guide recommendations given a keyword query,
the relevance score used in the clustering process plays a very important role in
increasing the number of relevant items retrieved with respect to the whole set of
items, known as the global corpus (i.e., recall).
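As an illustration of how a U-Net might be maintained, here is a minimal sketch of one gossip round under stated assumptions: user profiles are sets of terms, a peer exchange delivers a map of user identifiers to profiles, and U-Net has a fixed capacity k. The names `gossip_round`, `peer_view`, `score` and `k` are hypothetical and not taken from any particular protocol. The `recall` helper uses the standard definition (relevant items retrieved over all relevant items in the corpus).

```python
def gossip_round(own_profile, u_net, peer_view, score, k):
    """Merge the current U-Net with a peer's view and keep the k best users.

    u_net and peer_view map a user id to that user's profile (a set of terms).
    score(own_profile, other_profile) returns a relevance value; higher is better.
    """
    candidates = {**u_net, **peer_view}
    # Most relevant first; ties broken by user id for determinism.
    ranked = sorted(candidates,
                    key=lambda v: (-score(own_profile, candidates[v]), v))
    return {v: candidates[v] for v in ranked[:k]}


def recall(retrieved, relevant):
    """Fraction of the relevant items in the corpus that were retrieved."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    return len(set(retrieved) & relevant) / len(relevant)
```

Because `score` is passed in as a parameter, different relevance scores can be swapped in without changing the clustering step, which is where the choice of score influences recall.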
Relevance scores (e.g., Jaccard, overlap) define how well a user profile v meets
the needs of another user u. Most of the existing solutions exploit different kinds
of relevance scores to increase recall [2-4, 6, 7]. However, recall remains limited.
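For reference, the two scores named above can be sketched as follows, assuming that user profiles are sets of terms. The text does not define them, so the standard set-based definitions are used here; in particular, "overlap" is taken to mean the overlap coefficient, although some works use the raw intersection size instead. The function names are illustrative.

```python
def jaccard(u, v):
    """Jaccard similarity: |u ∩ v| / |u ∪ v|, or 0.0 if both sets are empty."""
    union = u | v
    return len(u & v) / len(union) if union else 0.0


def overlap(u, v):
    """Overlap coefficient: |u ∩ v| / min(|u|, |v|), or 0.0 if either is empty."""
    smaller = min(len(u), len(v))
    return len(u & v) / smaller if smaller else 0.0


# Made-up profiles for two users.
u = {"photo", "sunset", "beach"}
v = {"sunset", "beach", "travel", "city"}
print(jaccard(u, v), overlap(u, v))
```

Both scores are symmetric: they treat u and v interchangeably, so they measure how alike two profiles are rather than how well v specifically covers what u needs.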