Information Technology Reference
In-Depth Information
D
s
(
k
):1
R
s
(
k
):1
{
.Here
S
C
(
k
)
,
S
D
(
k
)
and
S
R
(
k
)
are the number of communities (or connected components) in
C
(
k
),
D
(
k
)
and
R
(
k
) respectively. Let
X
(
k
)beeither
C
(
k
),
D
(
k
)or
R
(
k
), the total size of com-
munities means the number of nodes that belong to a union of extracted commu-
nities, i.e.,
≤
s
≤
S
D
(
k
)
}
,and
R
(
k
)=
{
≤
s
≤
S
R
(
k
)
}
. Note that in the case of
k
-clique communities, some nodes may
belong to multiple communities, but we avoid their duplicate counts. The size of the
maximum community and the number of communities correspond to max
|
V
X
(
k
)
|
V
X
(
k
)
|}
and
S
X
(
k
)
, respectively. The
normalized entropy
is a measure defined as follows:
{|
S
X
(
k
)
V
X
(
k
)
|
s
|
V
X
(
k
)
|
s
|
|
|
1
log(
S
X
(
k
)
)
−
E
(
X
(
k
)) =
log
.
(14.11)
V
X
(
k
)
|
V
X
(
k
)
|
s
=1
This measure is close to 1 when the variance in community sizes is small, i.e.,
community sizes
V
X
(
k
)
|
are almost the same, while it is close to 0 when one com-
munity becomes dominant i.e.,
X
(
k
) consists of one extremely large community
and other much smaller ones. Note that we can define this measure
|
E
(
X
(
k
))
only when there exist more than one communities, i.e.,
S
X
(
k
)
≥
2.
14.3.2
Evaluation Using the Blog Trackback Network
The
k
-core,
k
-dense and
k
-clique methods are applied to the Blog trackback
network labeled
Blog
in Table 14.1. Let
k
max
be the maximum core order, up
to which non-empty communities can be extracted. The results of the
k
-dense
method and the
k
-clique method both show that
k
max
= 15, while the result
of the
k
-core method shows
k
max
= 20. Figure 14.4(a) shows the total size of
the extracted communities,
, for each order
k
, obtained by the
k
-core
method (labeled as k-core), the
k
-dense method (labeled as k-dense) and the
k
-clique method (labeled as k-clique) respectively. We can see that for each
k
,
the total sizes of communities obtained by the
k
-dense method and the
k
-clique
method are almost the same, while the total size of communities obtained by
the
k
-core method is substantially larger than both of them.
Figure 14.4(b) shows the size of the maximum community, max
|
V
X
(
k
)
|
,for
each order
k
obtained by the
k
-core,
k
-dense and
k
-clique methods. When
k
is
in the range of 7
{|
V
X
(
k
)
|}
13, the maximum sizes of communities obtained by the
k
-dense and
k
-clique methods are almost the same, while the size obtained by
the
k
-core method is substantially larger than them. These experimental results
indicate that the results of the
k
-core method are not well balanced compared to
the other two methods in the sense that the
k
-core method is likely to produce
one giant community and other much smaller communities.
Figure 14.4(c) shows the number of extracted communities
S
X
(
k
)
for each order
k
obtained by the
k
-core,
k
-dense and
k
-clique methods. It can be seen that the
number of communities obtained by the
k
-dense method is relatively smaller than
that obtained by the
k
-clique method when
k
≤
k
≤
6, and as
k
becomes larger, the
difference between the two becomes much smaller. When
k
≤
6, the number of
communities extracted by the
k
-clique method is larger than 100 and the size of
≤