Digital Signal Processing Reference
In-Depth Information
However, there is still a drawback existing in these improved K-means algorithms
which is that they group examples just based on their attribute values. It is shown that
relation information of data sets provides the hidden structure information of data sets
and is very useful for clustering [13-14]. Hence, it is reasonable to combine the
attribute values and relation information of data sets together for handling clustering
tasks. Wang. et al recently proposed an integrated KL (K-means - Laplacian) cluster-
ing algorithm to use both of the attribute values and pairwise relations of data sets
[15]. In the applications, this approach seems an effective way to make use of mul-
tiple information sources of data sets. However, it should be pointed that a full con-
nected graph is used in KL clustering algorithm. A full connected graph may not be
appropriate to represent the similarity relation between data points, and the pairwise
similarity of distant points is not reliable [16-17]. In addition, the abundant super-
fluous edges in the graph will cause algorithms based on it less efficient. Hence,
graph sparsification is certainly required in KL clustering.
The most frequently used graph sparsification method is K-nearest neighbor.
However, as many reports claimed [18-19], KNN adjacency graph may not be a finer
approach for approximating the structure of data sets. So this paper will improve the
KL clustering algorithm with graphs constructed by other graph sparsification tech-
niques such as b-matching and sparse representation. B-matching algorithm ensures
the graph is exactly regular, namely every node has b neighbors at termination.
Graphs constructed by sparse representation ( l graph) also have many merits such as
sparsity and datum-adaptive neighborhoods. Therefore, based on the new constructed
graphs, the integrated KL clustering will get great improvements.
2
Background
2.1 Integrated KL Clustering
(, , , )
xx
x
=∈
X R
Given a data matrix
and their pairwise relationship
12
n
dn
×
WR such that i w represents the similarity between x and x . Sup-
pose the data set has C classes. The objective function of integrated KL algorithm is
[15]
matrix
nn
×
J
=
min(
α
J
+
(1
α
)
J
)
(1)
k
s
where 0
is a tradeoff parameter, J and J are the objective functions of
K-means clustering and spectral clustering respectively. The inputs of the spectral
clustering method is the relationship matrix W . For free the tradeoff parameter, a
trace quotient formulation of the objective function has been proposed, namely
<<
1
J
J
=
min(
s
)
(2)
J
k
Search WWH ::




Custom Search