The k-Dense Method to Extract Communities from Complex Networks - Mining Complex Data

Information Technology Reference

In-Depth Information

D s ( k ):1

R s ( k ):1

{

.Here S C ( k ) , S D ( k )

and S R ( k ) are the number of communities (or connected components) in C ( k ), D ( k )

and R ( k ) respectively. Let X ( k )beeither C ( k ), D ( k )or R ( k ), the total size of com-

munities means the number of nodes that belong to a union of extracted commu-

nities, i.e.,

≤

S D ( k ) }

,and R ( k )=

{

≤

S R ( k ) }

. Note that in the case of k -clique communities, some nodes may

belong to multiple communities, but we avoid their duplicate counts. The size of the

maximum community and the number of communities correspond to max

V X ( k ) |

V X ( k ) |}

and S X ( k ) , respectively. The normalized entropy is a measure defined as follows:

S X ( k )

V X ( k ) |

s |

V X ( k ) |

s |

log( S X ( k ) )

−

( X ( k )) =

log

(14.11)

V X ( k ) |

s =1

This measure is close to 1 when the variance in community sizes is small, i.e.,

community sizes

V X ( k ) |

are almost the same, while it is close to 0 when one com-

munity becomes dominant i.e., X ( k ) consists of one extremely large community

and other much smaller ones. Note that we can define this measure

( X ( k ))

only when there exist more than one communities, i.e., S X ( k ) ≥

14.3.2

Evaluation Using the Blog Trackback Network

The k -core, k -dense and k -clique methods are applied to the Blog trackback

network labeled Blog in Table 14.1. Let k max be the maximum core order, up

to which non-empty communities can be extracted. The results of the k -dense

method and the k -clique method both show that k max = 15, while the result

of the k -core method shows k max = 20. Figure 14.4(a) shows the total size of

the extracted communities,

, for each order k , obtained by the k -core

method (labeled as k-core), the k -dense method (labeled as k-dense) and the

k -clique method (labeled as k-clique) respectively. We can see that for each k ,

the total sizes of communities obtained by the k -dense method and the k -clique

method are almost the same, while the total size of communities obtained by

the k -core method is substantially larger than both of them.

Figure 14.4(b) shows the size of the maximum community, max

V X ( k )

,for

each order k obtained by the k -core, k -dense and k -clique methods. When k is

in the range of 7

V X ( k ) |}

13, the maximum sizes of communities obtained by the

k -dense and k -clique methods are almost the same, while the size obtained by

the k -core method is substantially larger than them. These experimental results

indicate that the results of the k -core method are not well balanced compared to

the other two methods in the sense that the k -core method is likely to produce

one giant community and other much smaller communities.

Figure 14.4(c) shows the number of extracted communities S X ( k ) for each order

k obtained by the k -core, k -dense and k -clique methods. It can be seen that the

number of communities obtained by the k -dense method is relatively smaller than

that obtained by the k -clique method when k

≤

6, and as k becomes larger, the

difference between the two becomes much smaller. When k

≤

6, the number of

communities extracted by the k -clique method is larger than 100 and the size of

≤

Mining Complex Data

Search WWH ::

Custom Search

Home