Database Reference
In-Depth Information
4
0.5
0.45
3.5
0.4
3
0.35
2.5
0.3
2
0.25
0.2
1.5
0.15
1
0.1
0.5
0.05
0
0
0
500
1000
1500
2000
2500
0
500
1000
1500
2000
2500
(a)
(b)
Fig. 3.5. Effect of subsampling on Opossum . Cluster quality as measured by re-
maining edge weight fraction (a) and imbalance (b) of total graph with 2466 vertices
(customers from Section 3.5.1) for various boot-sample sizes n in FastOpossum .
For each setting of s the results' range and mean of 10 trials are depicted. Using
all 2466 customers as the boot-sample (i.e., no subsampling) results in balanc-
ing within the 1.05 imbalance requirement and approximately 40% of edge weight
remaining (compared to 5% baseline for random clustering).As the boot-sample
becomes smaller the remaining edge weight stays approximately the same (a), but
the imbalance increases (b).
than 1% at comparable remaining edge weight but at an imbalance of 3.5 in
the worst of 10 random trials. These results indicate that scaling to large n
is easily possible, if one is willing to relax the balance constraints.
3.6.4 Parallel Implementation
Another notion of scalability is with respect to the number of processors
(speedup, iso-e ciency etc.). Our analysis [3.2] shows almost linear speedup
for our method, as the similarity computation as well as graph partition-
ing can both be fairly trivially parallelized with little overhead. Parallel im-
plementation of the all-pair similarity computation on SIMD or distributed
memory processors is trivial. It can be done in a systolic or block systolic
manner with essentially no overhead. Frameworks such as MPI also provide
native primitives for such computations. Parallelization of Metis is also very
e cient, and [3.30] reports partitioning of graphs with more than 7 million
vertices in 7 seconds into 128 clusters on a 128 processor Cray T3E. For
further details, see [3.2].
3.7 Related Work
Clustering has been widely studied in several disciplines, especially since
the late 1960s [3.14], [3.31]. Classic approaches include partitional methods
Search WWH ::




Custom Search