what-when-how
In Depth Tutorials and Information
12.4 TheLinkPredictionProblem
forSocialNetworks[3]
12.4.1 Model of the Link Prediction
Let G = <V,E> represent a graph of a social network model. V denotes the all the nodes
in the social network and E denotes all the edges between each nodes. For the e E, it is
one edge between two nodes at a specific time t. In different time periods, we can record
all the interactions among edges. For a given time period t < t', G[t, t'] represents a subset
or a subgraph of G in which all the communication between nodes must happen in a
given time period [t, t']. he training interval [t 0 , t 0 '] is a time period [t 1 , t 1 '] in which we
know all the edges and t 1 > t 0 . Correspondingly, the test interval is a time period in which
the edges are unknown. For the G[t 0 , t 0 '], it must generate some possible edges that are
not included in the G[t0, t0']. hey are the outcome of the network prediction.
In the experiments explained in [3], we may select some authors, and each of
them has a number of papers. A given set Core represents Jtrain edges in the train-
ing interval and Jtest edges in the test interval. Enews is the set of edges <a,b> in
which authors a and b collaborate during the test interval instead of the set Eold,
which is the set of edges in which the two authors have no collaboration in the
training interval. Gcollab: = <V,Eold>.
12.4.2 Methods for Link Prediction
Score(x,y) denotes the connection weight score. We can calculate the proximity or
similarity between nodes x and y, which are related to the geographical position.
he basic method to find the shortest path between nodes is to use the small-world
network theory. Small-world network can be explained as a type of social network
in which most nodes are not directly connected to each other. However, they can
reach other nodes by a several hops or steps.
Let φ (x) denote the set of neighbors of a given node x. We can infer that if the
sets φ (x) and φ (y) have a large number of intersection sets, the two nodes x and y
have more possibility to contact with each other.
12.4.3 Common Neighbors
Following this idea, we can measure score as follows: score(x,y) = φ (x) φ (y), which
is the number of the common neighbors of x and y. It has been proved that the
number of common neighbors of nodes a and b at time t has a relationship with the
possibility of their collaborations in the future.
Jaccard'scoeicient: he Jaccard coefficient was widely applied in the message
retrieval. he weighted score can be measured as the number of the intersection
between x and y, which was compared to the union set of x and y. he formulation
was given as: score(x, y) = | φ (x) φ (y)|.
Search WWH ::




Custom Search