Identifying Transformative Research in Biomedical Sciences - Technologies and Applications of Artificial Intelligence


Given one or more papers $S \subseteq V$, a cascade $C$ is a subgraph that contains all citation chains that end at $S$. The set $S$ is called the seed or root of the cascade. The seed indirectly exerts influence on all papers in the cascade, but influence decays with the distance to the seed. For a node $j$ in the cascade, the cascade generating function [4] $\phi(j)$ summarizes the structure of the cascade, i.e., all citation chains, up to that point. The cascade generating function quantifies the influence of $S$ on node $j$, and is defined recursively by

$$\phi(j) := \begin{cases} 1 & \text{if } j \in S, \\ \sum_{i \in \mathrm{cite}(j)} \alpha\, \phi(i) & \text{otherwise,} \end{cases} \tag{1}$$

where $\alpha$ is a constant damping factor. Figure 1(a) shows an example cascade and the $\phi$ values for its nodes. For a paper $j$ published $T$ time steps (e.g., years) after the publication of the seed, $\phi(j)$ can be written as $\phi(j) = \sum_{p=0}^{T} a_p \cdot \alpha^p$, where the coefficient $a_p$ is the number of distinct paths of length $p$ from one of the seeds to $j$. The smaller the value of $\alpha$, the higher the penalty against long paths. Though it is also possible to assign a unique $\alpha_{ij}$ to each link, assigning a constant $\alpha = 0.5$ to all links works well in our experiments.
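The recursive definition and the path-count form can be checked against each other on a small example. The Python sketch below is illustrative only: the toy graph `CITES`, the seed set `SEEDS`, and the helper names `phi` and `path_counts` are assumptions introduced here, not part of the original work.

```python
# Hypothetical toy citation graph: CITES[j] lists the papers that j cites.
CITES = {
    "s": [],          # seed paper
    "a": ["s"],
    "b": ["s", "a"],
    "c": ["a", "b"],
}
SEEDS = frozenset({"s"})
ALPHA = 0.5  # constant damping factor, the value the text reports using


def phi(j, cites=CITES, seeds=SEEDS, alpha=ALPHA):
    """Cascade generating function, following the recursion in Eq. (1)."""
    memo = {}

    def rec(node):
        if node in seeds:
            return 1.0
        if node not in memo:
            # Papers with no chain back to a seed end up with a value of 0.
            memo[node] = sum(alpha * rec(i) for i in cites.get(node, []))
        return memo[node]

    return rec(j)


def path_counts(j, cites=CITES, seeds=SEEDS):
    """Return {p: a_p}: the number of distinct paths of length p from j to a seed."""
    if j in seeds:
        return {0: 1}
    counts = {}
    for i in cites.get(j, []):
        for p, a in path_counts(i, cites, seeds).items():
            counts[p + 1] = counts.get(p + 1, 0) + a
    return counts


def phi_from_paths(j, alpha=ALPHA):
    """Evaluate the polynomial form: sum over p of a_p * alpha**p."""
    return sum(a * alpha ** p for p, a in path_counts(j).items())
```

For paper `"c"` in this toy graph there are two paths of length 2 and one of length 3 back to the seed, so both forms give 2 · 0.25 + 1 · 0.125 = 0.625.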

2.2 Cascade Disruption

Consider Figure 1(b). $C$ is the entire cascade rooted at the seed paper. Let $C(c)$ denote the cascade originating from the challenger. We define the residue cascade, denoted by $\bar{C}$, as the complement subgraph of $C$ obtained by subtracting $C(c)$ from $C$, i.e.,

$$\bar{C} := C - C(c) = C - (C \cap C(c)).$$

By definition, references of papers in $\bar{C}$ can only be traced back to the seed papers but not the challenger. We note that it is not necessary for the challenger to be in $C$. The blue nodes in Figure 1(b) are the root node(s) of the intersection of $C$ and $C(c)$.
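As a rough illustration of the set construction, the following Python sketch builds the seed cascade, the challenger cascade, and the residue on a toy graph. The graph `CITES`, the paper labels, and the helper `cascade` are hypothetical; treating a cascade as the set of papers reachable by following citations backwards is an assumption about how one might implement the definition.

```python
# Hypothetical toy graph: CITES[j] lists the papers that j cites.
CITES = {
    "s": [],              # seed
    "c": [],              # challenger; here it does not cite the seed
    "p1": ["s"],
    "p2": ["s", "c"],
    "p3": ["p1"],
    "p4": ["p2", "p3"],
}


def cascade(roots, cites):
    """Papers with at least one citation chain ending in `roots` (roots included)."""
    # Invert the graph so we can walk from a paper to the papers citing it.
    cited_by = {}
    for paper, refs in cites.items():
        for ref in refs:
            cited_by.setdefault(ref, set()).add(paper)
    seen = set(roots)
    stack = list(roots)
    while stack:
        node = stack.pop()
        for child in cited_by.get(node, ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


C = cascade({"s"}, CITES)        # cascade rooted at the seed
C_c = cascade({"c"}, CITES)      # cascade originating from the challenger
residue = C - C_c                # residue cascade

# Root nodes of the intersection: members none of whose references are in it.
shared = C & C_c
shared_roots = {j for j in shared if not any(i in shared for i in CITES[j])}
```

In this toy graph the challenger lies outside the seed cascade, the residue is `{"s", "p1", "p3"}`, and `"p2"` is the only root of the intersection.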

Let $C_t$ be the set of papers in cascade $C$ published at time $t$, i.e., nodes in the bottom red box in Figure 1(b). The average of the cascade generating function $\phi$ over the papers in $C_t$ is defined by

$$\Phi_t(C) := \frac{1}{|C_t|} \sum_{j \in C_t} \phi(j) = \sum_{p=0}^{T} \bar{a}_p \cdot \alpha^p, \tag{2}$$

where $\bar{a}_p$ in Eq. (2) is the average of the coefficient $a_p$ over the papers $j$ in $C_t$, i.e., the average number of distinct citation chains of length $p$ from papers published at time $t$ to the seeds. The variable $\Phi_t$ can be interpreted as an indicator of the seed papers' influence at time $t$. Let $t_0$ be the publication time of the challenger paper. The disruption score is defined as

$$\delta(\tau) := \sum_{t=t_0}^{t_0+\tau} \log \frac{\Phi_t(\bar{C})}{\Phi_t(C)} = \sum_{t=t_0}^{t_0+\tau} \bigl( \log \Phi_t(\bar{C}) - \log \Phi_t(C) \bigr). \tag{3}$$
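To make the pieces concrete, the Python sketch below computes the per-time-step average of the cascade generating function and a disruption score on a toy graph. It is a sketch under stated assumptions, not the authors' implementation: the data in `CITES` and `YEAR` are invented, the helper names are hypothetical, the score is read as the sum over $t$ of log $\Phi_t$ on the residue cascade minus log $\Phi_t$ on the full cascade, natural logarithms are used, and time steps where either set is empty or has zero average are skipped.

```python
import math

# Hypothetical toy data: CITES[j] lists the papers j cites, YEAR[j] is its publication time.
CITES = {
    "s": [], "c": [],
    "p1": ["s"],
    "p2": ["s", "c"],
    "p3": ["p1"],
    "p4": ["p2", "p3"],
    "p5": ["p1", "p3"],
}
YEAR = {"s": 0, "c": 1, "p1": 1, "p2": 2, "p3": 2, "p4": 3, "p5": 3}
ALPHA = 0.5  # constant damping factor


def cascade(roots):
    """Papers with at least one citation chain ending in `roots` (roots included)."""
    seen, changed = set(roots), True
    while changed:
        changed = False
        for paper, refs in CITES.items():
            if paper not in seen and any(r in seen for r in refs):
                seen.add(paper)
                changed = True
    return seen


def phi(j, seeds):
    """Cascade generating function: 1 at a seed, damped sum over references otherwise."""
    if j in seeds:
        return 1.0
    return sum(ALPHA * phi(i, seeds) for i in CITES[j])


def big_phi(papers, t, seeds):
    """Mean of phi over the papers published at time t, or None if there are none."""
    at_t = [j for j in papers if YEAR[j] == t]
    if not at_t:
        return None
    return sum(phi(j, seeds) for j in at_t) / len(at_t)


def disruption(seeds, challenger, tau):
    """Sum of log(Phi_t(residue)) - log(Phi_t(full)) for t in [t0, t0 + tau]."""
    full = cascade(seeds)
    residue = full - cascade({challenger})
    t0 = YEAR[challenger]
    score = 0.0
    for t in range(t0, t0 + tau + 1):
        num, den = big_phi(residue, t, seeds), big_phi(full, t, seeds)
        if num and den:  # assumption: skip steps with no papers or zero influence
            score += math.log(num) - math.log(den)
    return score


score = disruption({"s"}, "c", tau=2)
```

In this toy example only time step 2 contributes, where the residue average is 0.25 and the full-cascade average is 0.375, so `score` equals log(2/3).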
