Exploiting Diversification in Gossip-Based Recommendation - Data Management in Cloud, Grid and P2P Systems

Database Reference

In-Depth Information

More precisely, our peer-to-peer model is expressed based on a graph G =

( U,I,E ), where U =

{u 1 , ..., u n }

is the set of users distributed over the network,

I =

{i 1 , ..., i m }

the set of shared data items (in the following, an item refers

to a data item), and E =

the set of directed edges among users

and between users and items. This model is very generic. In our case, users are

independent nodes in the network. A node can be a physical computer or a

virtual node in a server.

{e 1 , ..., e k }

Definition 1 ( U-Net ). Given a user u , its User Network, or U-Net, refers to

the cluster of relevant users u is aware of. There is an edge e ( u,v ) in the graph

between u and a user v ,if v is in u 's U-Net.

With random gossiping [5], each user keeps locally a random view of its dy-

namic acquaintances (or view entries). Each view entry corresponds to a user

profile. Periodically, each user chooses randomly a contact (view entry) to gossip

with. The two involved users then exchange a subset of each others view ( i.e.

user profiles), and update their view state. Then, after each gossip exchange, the

random view is used to update the U-Net if more relevant profiles are found in

the updated view. We use Jaccard as the relevance score to select the best users:

Jaccard ( u , v )= |I u ∩ I v | / |I u ∪ I v |

(1)

Where I u and I v are the items shared by user u and v respectively.

Here, we use the vector space model to represent items and user profiles [10].

Specifically, each item it is modelled as a sparse vector containing only the

weights of the keywords k 1 , ..., k z in it. The weight of each keyword is computed

using tf

idf can be easily implemented using gossip pro-

tocols. Indeed, the first part of the score, denoted tf , can be computed locally,

and the second part, denoted idf , only needs average information ( e.g. average

number of items per user) to be computed. These averages can be easily com-

puted using gossip protocols [11]. Due to lack of space, we do not develop this

protocol in this paper.

Each user profile is defined based on the items the user shares, I u ( i.e. con-

tent based recommendation). We choose a relevance score ( i.e. Jaccard )that

works well with content-based recommendation, but other relevance measures

and profiles definition methods could be used as well.

As mentioned before, whenever a user u submits a keywords query q =

k 1 , ..., k w , the query is redirected to all users in the participating users' U-Net

recursively, until a predefine upper threshold, TTL ( i.e. Time-To-Live ). When-

ever a user v receives a query, it computes its top-k most relevant items with

respect to the query using a specific relevance score ( e.g. Jaccard). Then, v re-

turns them to u . A recommended item is defined by its identifier, its tf × idf

vector, v 's identifier and v 's profile. Once u receives the set of recommended

items from v 1 , ..., v n with respect to its query q ,itranksthembasedontheir

relevance with respect to the query:

idf . Distributed tf

Rec q = rank ( rec q ( it 1 , ... ) ∪ ... ∪ rec q ( it p , ... ))

(2)

Where rec q ( it 1 , ... ) is a recommendation ( i.e. a set of recommended items)

coming from a user v 1 .

Data Management in Cloud, Grid and P2P Systems

Search WWH ::

Custom Search

Home