Given one or more papers $S \subseteq V$, a cascade $C$ is a subgraph that contains all citation chains that end at $S$. The set $S$ is called the seed or root of the cascade.

The seed indirectly exerts influence on all papers in the cascade, but this influence decays with the distance to the seed. For a node $j$ in the cascade, the cascade generating function [4] $\phi(j)$ summarizes the structure of the cascade, i.e., all citation chains, up to that point. The cascade generating function quantifies the influence of $S$ on node $j$, and is defined recursively by

$$\phi(j) := \begin{cases} 1 & \text{if } j \in S, \\ \sum_{i \in \mathrm{cite}(j)} \alpha\,\phi(i) & \text{otherwise,} \end{cases} \tag{1}$$
where $\alpha$ is a constant damping factor. Figure 1(a) shows an example cascade and the $\phi$ values for its nodes. For a paper $j$ published $T$ time steps (e.g., years) after the publication of the seed, $\phi(j)$ can be written as

$$\phi(j) = \sum_{p=0}^{T} a_p \cdot \alpha^p,$$

where the coefficient $a_p$ is the number of distinct paths of length $p$ from one of the seeds to $j$. The smaller the value of $\alpha$, the higher the penalty against long paths. Although it is also possible to assign a unique $\alpha_{ij}$ to each link, assigning a constant 0.5 to all links works well in our experiments.
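The recursive definition and its path-count expansion can be checked against each other on a small example. The following is a minimal sketch, not the authors' implementation: the toy graph `CITES`, the seed set `SEEDS`, and the helper names `phi`, `path_counts`, and `phi_from_paths` are all hypothetical, and the graph is assumed to be acyclic.

```python
from functools import lru_cache

# Hypothetical toy citation graph: CITES[j] lists the papers that j cites.
CITES = {
    "s": [],          # seed paper
    "a": ["s"],
    "b": ["s", "a"],
    "c": ["a", "b"],
}
SEEDS = frozenset({"s"})
ALPHA = 0.5  # constant damping factor; 0.5 is the value quoted in the text


@lru_cache(maxsize=None)
def phi(j):
    """Cascade generating function: 1 for a seed, else the damped sum over cited papers."""
    if j in SEEDS:
        return 1.0
    return sum(ALPHA * phi(i) for i in CITES[j])


@lru_cache(maxsize=None)
def path_counts(j):
    """Return a tuple a where a[p] counts the citation chains of length p from j to a seed."""
    if j in SEEDS:
        return (1,)  # the empty chain of length 0
    counts = {}
    for i in CITES[j]:
        # Every chain of length p from i becomes a chain of length p + 1 from j.
        for p, n in enumerate(path_counts(i)):
            counts[p + 1] = counts.get(p + 1, 0) + n
    length = max(counts, default=0) + 1
    return tuple(counts.get(p, 0) for p in range(length))


def phi_from_paths(j):
    """Evaluate the sum over p of a_p * alpha^p using the path counts."""
    return sum(a_p * ALPHA ** p for p, a_p in enumerate(path_counts(j)))


if __name__ == "__main__":
    for paper in CITES:
        print(paper, phi(paper), path_counts(paper), phi_from_paths(paper))
```

In this toy graph, paper `c` reaches the seed through two chains of length 2 and one chain of length 3, so both routes give the same value of $\phi$.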
2.2 Cascade Disruption
Consider Figure 1(b). $C$ is the entire cascade rooted at the seed paper. Let $C^{(c)}$ denote the cascade originating from the challenger. We define the residue cascade, denoted by $\tilde{C}$, as the complement subgraph of $C$ obtained by subtracting $C^{(c)}$ from $C$, i.e.,

$$\tilde{C} := C \setminus C^{(c)} = C - (C \cap C^{(c)}).$$

By definition, references of papers in $\tilde{C}$ can only be traced back to the seed papers but not to the challenger. We note that it is not necessary for the challenger to be in $C$. The blue nodes in Figure 1(b) are the root node(s) of the intersection of $C$ and $C^{(c)}$.
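As a rough illustration of the residue cascade, the sketch below builds both cascades as node sets by following citations forward from their roots and then takes the set difference. The toy graph `CITED_BY`, the paper labels, and the helper `cascade` are hypothetical. Only node membership is tracked here, so the subgraph structure is left implicit.

```python
from collections import deque


def cascade(cited_by, roots):
    """Return the roots plus every paper that reaches them through a citation chain."""
    seen = set(roots)
    queue = deque(roots)
    while queue:
        i = queue.popleft()
        for j in cited_by.get(i, ()):
            if j not in seen:
                seen.add(j)
                queue.append(j)
    return seen


# Hypothetical toy graph: CITED_BY[i] lists the papers that cite i.
CITED_BY = {
    "s": ["a", "c"],   # "s" is the seed
    "c": ["d"],        # "c" is the challenger; in this example it cites the seed
    "a": ["d", "e"],
}

C = cascade(CITED_BY, {"s"})      # entire cascade rooted at the seed
C_c = cascade(CITED_BY, {"c"})    # cascade originating from the challenger
residue = C - C_c                 # papers that trace back to the seed only

if __name__ == "__main__":
    print(sorted(C), sorted(C_c), sorted(residue))
```

Paper `d` cites both lines of work, so it falls in the intersection and is excluded from the residue, whereas `e` traces back only to the seed and is kept.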
Let $C_t$ be the set of papers in cascade $C$ published at time $t$, i.e., the nodes in the bottom red box in Figure 1(b). The average of the cascade generating function $\phi$ over the papers in $C_t$ is defined by

$$\Phi_t(C) := \frac{1}{|C_t|} \sum_{j \in C_t} \phi(j) = \sum_{p=0}^{t} \bar{a}_p \cdot \alpha^p, \tag{2}$$

where $\bar{a}_p$ is the average of the coefficient $a_p$ of $\phi(j)$ over the papers $j$ in $C_t$; that is, $\bar{a}_p$ is the average number of distinct citation chains of length $p$ from papers published at time $t$ to the seeds. The variable $\Phi_t$ can be interpreted as an indicator of the seed papers' influence at time $t$. Let
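$t_0$ be the publication time of the challenger paper. The disruption score, with $\tilde{C}$ the residue cascade, is defined as

$$\delta(\tau) := \sum_{t=t_0}^{t_0+\tau} \log \frac{\Phi_t(\tilde{C})}{\Phi_t(C)} = \sum_{t=t_0}^{t_0+\tau} \log \Phi_t(\tilde{C}) - \sum_{t=t_0}^{t_0+\tau} \log \Phi_t(C). \tag{3}$$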
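The time-sliced average and the disruption score can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the orientation of the ratio (residue cascade over full cascade) is assumed, integer time steps are assumed, and the function names, $\phi$ values, and publication times are made up for the example.

```python
import math


def mean_phi(phi, year, papers, t):
    """Phi_t: average of phi over the papers in `papers` published at time t."""
    values = [phi[j] for j in papers if year[j] == t]
    if not values:
        raise ValueError(f"no papers published at time {t}")
    return sum(values) / len(values)


def disruption(phi, year, full, residue, t0, tau):
    """Sum of log(Phi_t(residue) / Phi_t(full)) for t = t0, ..., t0 + tau.

    ASSUMPTION: the residue cascade is in the numerator and the full
    cascade in the denominator; swap the arguments if the opposite
    orientation is intended.
    """
    return sum(
        math.log(mean_phi(phi, year, residue, t) / mean_phi(phi, year, full, t))
        for t in range(t0, t0 + tau + 1)
    )


# Hypothetical phi values and publication times, for illustration only.
PHI = {"a": 0.5, "d": 0.5, "e": 0.25, "f": 0.125, "g": 0.5}
YEAR = {"a": 1, "d": 2, "e": 2, "f": 3, "g": 3}
FULL = {"a", "d", "e", "f", "g"}
RESIDUE = {"a", "e", "g"}

score = disruption(PHI, YEAR, FULL, RESIDUE, t0=2, tau=1)

if __name__ == "__main__":
    print(score)
```

The sketch raises an error when a time slice is empty, since the average is undefined there. How such slices should be handled is not specified in the text.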