Database Reference
In-Depth Information
More precisely, our peer-to-peer model is expressed based on a graph
G
=
(
U,I,E
), where
U
=
{u
1
, ..., u
n
}
is the set of users distributed over the network,
I
=
{i
1
, ..., i
m
}
the set of shared data items (in the following, an item refers
to a data item), and
E
=
the set of directed edges among users
and between users and items. This model is very generic. In our case, users are
independent nodes in the network. A node can be a physical computer or a
virtual node in a server.
{e
1
, ..., e
k
}
Definition 1 (
U-Net
).
Given a user
u
, its User Network, or U-Net, refers to
the cluster of relevant users
u
is aware of. There is an edge
e
(
u,v
)
in the graph
between
u
and a user
v
,if
v
is in
u
's U-Net.
With random gossiping [5], each user keeps locally a random view of its dy-
namic acquaintances (or view entries). Each view entry corresponds to a user
profile. Periodically, each user chooses randomly a contact (view entry) to gossip
with. The two involved users then exchange a subset of each others view (
i.e.
user profiles), and update their view state. Then, after each gossip exchange, the
random view is used to update the
U-Net
if more relevant profiles are found in
the updated view. We use
Jaccard
as the relevance score to select the best users:
Jaccard
(
u
,
v
)=
|I
u
∩ I
v
| / |I
u
∪ I
v
|
(1)
Where
I
u
and
I
v
are the items shared by user
u
and
v
respectively.
Here, we use the vector space model to represent items and user profiles [10].
Specifically, each item
it
is modelled as a sparse vector containing only the
weights of the keywords
k
1
, ..., k
z
in it. The weight of each keyword is computed
using
tf
idf
can be easily implemented using gossip pro-
tocols. Indeed, the first part of the score, denoted
tf
, can be computed locally,
and the second part, denoted
idf
, only needs average information (
e.g.
average
number of items per user) to be computed. These averages can be easily com-
puted using gossip protocols [11]. Due to lack of space, we do not develop this
protocol in this paper.
Each
user profile
is defined based on the items the user shares,
I
u
(
i.e.
con-
tent based recommendation). We choose a relevance score (
i.e. Jaccard
)that
works well with content-based recommendation, but other relevance measures
and profiles definition methods could be used as well.
As mentioned before, whenever a user
u
submits a keywords query
q
=
k
1
, ..., k
w
, the query is redirected to all users in the participating users'
U-Net
recursively, until a predefine upper threshold,
TTL
(
i.e. Time-To-Live
). When-
ever a user
v
receives a query, it computes its
top-k
most relevant items with
respect to the query using a specific relevance score (
e.g.
Jaccard). Then,
v
re-
turns them to
u
. A recommended item is defined by its identifier, its
tf
×
idf
vector,
v
's identifier and
v
's profile. Once
u
receives the set of recommended
items from
v
1
, ..., v
n
with respect to its query
q
,itranksthembasedontheir
relevance with respect to the query:
×
idf
. Distributed
tf
×
Rec
q
=
rank
(
rec
q
(
it
1
, ...
)
∪ ... ∪ rec
q
(
it
p
, ...
))
(2)
Where
rec
q
(
it
1
, ...
) is a recommendation (
i.e.
a set of recommended items)
coming from a user
v
1
.
Search WWH ::
Custom Search