Biology Reference
In-Depth Information
Table 11.5.
Performance comparison on Acxiom dataset using Q N and Q S respectively
Alg.
Expected clusters
Expected Q N
Best Q N
clusters
errors
GA+ Q N
4
0.378893
0.399623
3
3
FHAC+ Q N
4
0.378893
0.37669
3
3
Alg.
Expected clusters
Expected Q S
Best Q S
clusters
errors
GA+ Q S
4
0.474738
0.474738
4
0
FHAC+ Q S
4
0.474738
0.474738
4
0
Table 11.6.
FHAC running time for very large network
Vertices
Edges
FHAC+ Q N
FHAC+ Q S
52909
245300
591 Sec.
658 Sec.
The final experiment was on a very large dataset - the DBLP authors' col-
laboration networks of 2005 [10], which consist of 52,909 vertices (authors) and
245,300 edges (co-author relationship). The purpose of this experiment is testing
the speed of FHAC on a very large network. The running time on an Intel PC
with P3.2G CPU and 2GB memory are reported in Table 11.6. One can see that
optimizing Q S it runs marginally slower than optimizing Q N , which means Q S
cooperates with the FHAC algorithm very well.
11.7. Conclusion
In this paper, we propose a novel similarity-based modularity ( Q S ) to measure the
quality of a graph partitioning. Through theoretical analysis and extensive exper-
iments, we can conclude that the propose measure is significantly more accurate
and robust than Newman's connection-based modularity with respect to the result
of clustering. Furthermore, it has a better ability to deal with hubs and outlin-
ers. The proposed similarity-based modularity in combination with the Genetic
clustering algorithm (GA) and the greedy search algorithm FHAC yields an im-
proved accuracy for even dense, confused graphs. The FHAC often converges
to the global optimal for real applications using the proposed modularity. How-
ever, in some very tough cases, such as in very confused synthetic graphs, FHAC
significantly lags the global optimal obtained by GA. This suggests us to further
study more powerful fast clustering algorithm in the future to exert the potential
of the proposed modularity definition.
References
[1]
A. Clauset, M. Newman, and C. Moore. Finding community structure in very large
networks , Physical Review E , 70:066111, 2004.
[2]
C. R. Dias and L. S. Ochi. Efficient evolutionary algorithms for the clustering prob-
lem in directed graphs. The Congress on Evolutionary Computation, CEC '03 , 2003.
Search WWH ::




Custom Search