Information Technology Reference
In-Depth Information
Table 3 Keyword-frequency information of the email in Figure 1

keyword     frequency
bank        1
fund        2
account     2
transfer    2
2.2.2 Construct a Weighted Directed Multigraph
For a given document $D$ and a set of keywords $\mathcal{K}$, let $G_m$ be a weighted directed multigraph with the vertex set $\mathcal{K} = \{K_1, K_2, \cdots, K_m\}$, constructed as follows.
Suppose that $k_1, \cdots, k_s$ is the sequence of words such that
(1) each $k_\mu$ is a keyword of the given set $\mathcal{K}$,
(2) $k_1, \cdots, k_s$ appear in the document $D$ in this order,
(3) the position of the word $k_\mu$ in the document $D$ is $p_\mu$ (the $p_\mu$-th word in the document $D$), with $1 \le p_1 < \cdots < p_s$.
Add an arc from the vertex $k_i$ to the vertex $k_j$ with the weight $w^m_{ij} = p_j - p_i + 1$, which is the distance from the word $k_i$ to the word $k_j$ in the document $D$.
Note that if $k_i$ and $k_j$ are the same element of the set $\mathcal{K}$, they are the same vertex in the graph.
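As a quick check of the weight formula, assume (purely for illustration; these positions are not taken from Figure 1) that $k_i$ is the 12th word and $k_j$ is the 15th word of $D$:

```latex
w^m_{ij} = p_j - p_i + 1 = 15 - 12 + 1 = 4
```

With the formula as written, two keywords that are adjacent in the document receive weight 2.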
A large weight for a given arc indicates that the corresponding pair of
keywords are relatively far away from each other and, therefore, that their
logical connection is relatively "weak" in the document. Thus, we may ignore
those arcs with large weights. (We choose a threshold of 200 in our example in
Figure 1 and delete any arc with weight greater than 200.)
Note that the resulting weighted directed multigraph may contain not only
parallel arcs but also loops.
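The construction above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. It assumes that an arc is added for every pair of keyword occurrences $k_i$, $k_j$ with $i < j$ (the text does not state which pairs are connected), that the document is already split into a list of words, and that positions are counted from 1. The names `build_multigraph`, `words`, `keywords` and `threshold` are hypothetical.

```python
from typing import List, Set, Tuple

# An arc is (source keyword, target keyword, weight).
Arc = Tuple[str, str, int]


def build_multigraph(words: List[str], keywords: Set[str],
                     threshold: int = 200) -> List[Arc]:
    """Return the arcs of the weighted directed multigraph as a list.

    Assumption: every pair of keyword occurrences (k_i, k_j) with i < j
    yields one arc, unless its weight exceeds the threshold.
    """
    # Keyword occurrences in document order, with 1-based positions p_mu.
    occurrences = [(pos, w) for pos, w in enumerate(words, start=1)
                   if w in keywords]

    arcs: List[Arc] = []
    for idx, (p_i, k_i) in enumerate(occurrences):
        for p_j, k_j in occurrences[idx + 1:]:
            weight = p_j - p_i + 1
            if weight > threshold:
                # Positions only increase, so later weights are larger too.
                break
            arcs.append((k_i, k_j, weight))
    return arcs


# Hypothetical usage on a made-up sentence (not the email in Figure 1).
sample = "transfer the fund to the account then transfer fund".split()
for arc in build_multigraph(sample, {"bank", "fund", "account", "transfer"}):
    print(arc)
```

Because the same keyword can occur several times, the list may contain several arcs between the same ordered pair of vertices (parallel arcs) and arcs from a keyword to itself (loops), as noted above.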
For the given example (Figure 1), the corresponding weighted directed
multigraph is shown in Figure 2.
2.2.3 Simplification of Representing Graphs
The weighted directed multigraph $G_m$ constructed in the previous step is
further simplified as follows: a directed graph $G_s$ is constructed from $G_m$
by combining parallel arcs.
Let $E_{ij} = \{ k_\mu k_\nu \mid k_\mu = K_i \ \&\ k_\nu = K_j \}$, which is the set of all arcs from the vertex $K_i$ to the vertex $K_j$ of the weighted directed multigraph $G_m$.
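The sets $E_{ij}$ can be collected by grouping the arcs of $G_m$ by their (source, target) pair. The sketch below assumes the arcs are given as `(source, target, weight)` tuples; the name `group_parallel_arcs` is hypothetical. It only gathers the weights of the parallel arcs and does not combine them into a single weight.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# An arc is (source keyword, target keyword, weight).
Arc = Tuple[str, str, int]


def group_parallel_arcs(arcs: List[Arc]) -> Dict[Tuple[str, str], List[int]]:
    """Map each ordered vertex pair (K_i, K_j) to the weights of the arcs in E_ij."""
    groups: Dict[Tuple[str, str], List[int]] = defaultdict(list)
    for source, target, weight in arcs:
        groups[(source, target)].append(weight)
    # Only pairs with at least one arc appear as keys, i.e. non-empty E_ij.
    return dict(groups)


# Hypothetical usage with made-up arcs.
example_arcs = [("transfer", "fund", 3), ("transfer", "fund", 8), ("fund", "fund", 6)]
print(group_parallel_arcs(example_arcs))
```

Each key of the returned dictionary corresponds to a pair of vertices joined by at least one arc in $G_m$, so it identifies one candidate arc of the simplified graph.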
Let $\mathcal{K} = \{K_1, K_2, \cdots, K_m\}$ be the vertex set of the new directed graph $G_s$. For each pair of vertices $K_i$ and $K_j$ ($i, j = 1, 2, \ldots, m$), if $E_{ij} \neq \emptyset$, put an arc $e_{ij}$ from $K_i$ to $K_j$. The weight of the arc $e_{ij} = K_i K_j$ is calculated as follows:
 