Fig. 5.1 An inclusion map of research in mass extinction based on index terms of articles on mass
extinction published in 1990. The size of a node is proportional to the total number of occurrences
of the word. Links that violate first-order triangle inequality are removed ($\varepsilon = 0.75$)
The original co-word analysis prunes a concept graph using a triangle inequality
rule on conditional probabilities. Suppose we have a total of N words in the analysis,
for $1 \le i, j, k \le N$, $\omega_{ij}$, $\omega_{ik}$, and $\omega_{kj}$ represent the weights of links in the network and
$\omega_{ij}$ is defined as $1 - I_{ij}$. Given a pre-defined small threshold $\varepsilon$, if there exists an index
$k$ such that $\omega_{ij} > \omega_{ik} \cdot \omega_{kj} + \varepsilon$, then we should remove the link $I_{ij}$. Because
$\omega_{ik} \cdot \omega_{kj}$ defines the weight of a path from term $t_i$ to $t_j$, what this operation means is if we can
find a shorter path from term $t_i$ to $t_j$ than the direct path, then we choose the shorter
one. In other words, if a link violates the triangle inequality, it is treated as invalid
and removed. By raising or lowering the threshold, we can adjust the number of valid
links in the network: a link is removed only when its weight exceeds the weight of a
two-step path by more than the threshold, so a larger threshold retains more links and
a smaller one retains fewer. This algorithm is simple to implement. In co-word
analysis, we usually only compare a one-step path with a two-step path. However, as
the size of the network increases, this simple algorithm tends to retain too many links
and the resultant co-word map tends to lose its clarity. In the next chapter, we will
introduce Pathfinder network scaling as a generic form of the triangle inequality
condition, which enables us to compare much longer paths connecting two points and
to detect much subtler association patterns in data.
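The pruning rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the original implementation: the function name prune_links, the list-of-lists input, and the example matrix are all hypothetical. It also assumes that an inclusion value of zero means "no link" and that every removal decision is made against the original, unpruned weights.

```python
from typing import List, Set, Tuple


def prune_links(inclusion: List[List[float]], eps: float) -> Set[Tuple[int, int]]:
    """Return the links (i, j) that survive first-order triangle-inequality pruning.

    inclusion[i][j] holds the inclusion index I_ij (hypothetical input layout).
    """
    n = len(inclusion)
    # Link weights as defined in the text: w_ij = 1 - I_ij.
    w = [[1.0 - inclusion[i][j] for j in range(n)] for i in range(n)]

    kept: Set[Tuple[int, int]] = set()
    for i in range(n):
        for j in range(n):
            # Skip self-links; treat zero inclusion as "no link" (assumption).
            if i == j or inclusion[i][j] <= 0.0:
                continue
            # The link is removed if some intermediate term k gives a
            # two-step path with w_ij > w_ik * w_kj + eps.
            violated = any(
                w[i][j] > w[i][k] * w[k][j] + eps
                for k in range(n)
                if k != i and k != j
            )
            if not violated:
                kept.add((i, j))
    return kept


if __name__ == "__main__":
    # Made-up inclusion values for three terms; not data from the book.
    example = [
        [0.0, 0.2, 0.9],
        [0.2, 0.0, 0.9],
        [0.9, 0.9, 0.0],
    ]
    print(sorted(prune_links(example, eps=0.05)))
    # [(0, 2), (1, 2), (2, 0), (2, 1)]
```

In this toy example the weak link between terms 0 and 1 (weight 0.8) is dropped because the two-step path through term 2 has weight of about 0.01. Only two-step paths are examined, which matches the one-step versus two-step comparison described above; the cost is on the order of N^3 comparisons for N terms.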
Figure 5.1 shows a co-word map based on the inclusion index. The co-word
analysis was conducted on index terms of articles published in 1990 from a search in
the Web of Science with the query “mass extinction”. The meaning of this particular