[Figure not reproduced; the panels use axes X, Y, and Z and carry the labels P, Q, R, and S.]
Fig. 9.1. Illustrations of different clusters in different subspaces. (a) Data in 3-dimensional space. (b) Projection in the X-Z plane. (c) Projection in the Y-Z plane.
reserved as much as possible. Three criteria have been proposed to evaluate clusters [14]: a good cluster should contain as many points as possible, its dimensionality should be as large as possible, and the distances between its points should be as small as possible. These criteria trade off against one another: if one criterion is fixed, the other two are at odds.
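The trade-off among these criteria can be illustrated with a minimal Python sketch. It is not taken from the cited work: the function names `manhattan_segmental` and `cluster_criteria` and the toy data are hypothetical, and the distance used is the Manhattan segmental distance (the average absolute difference over the selected dimensions), chosen here only as one reasonable way to measure closeness within a subspace.

```python
from itertools import combinations


def manhattan_segmental(x, y, dims):
    """Average absolute coordinate difference over the selected dimensions."""
    return sum(abs(x[d] - y[d]) for d in dims) / len(dims)


def cluster_criteria(points, dims):
    """Return (size, dimensionality, mean pairwise distance) for a candidate cluster.

    Illustrative helper (an assumption, not part of any published algorithm).
    """
    pairs = list(combinations(points, 2))
    if pairs:
        mean_dist = sum(manhattan_segmental(p, q, dims) for p, q in pairs) / len(pairs)
    else:
        mean_dist = 0.0
    return len(points), len(dims), mean_dist


# Hypothetical toy data: tight along dimensions 0 and 1, spread out along dimension 2.
points = [
    (1.0, 2.0, 0.0),
    (1.0, 2.0, 6.0),
    (2.0, 3.0, 12.0),
]

# Same points, evaluated in a 2-dimensional and a 3-dimensional subspace.
small = cluster_criteria(points, dims=(0, 1))
large = cluster_criteria(points, dims=(0, 1, 2))

print("dims (0, 1):   ", small)
print("dims (0, 1, 2):", large)
```

With the number of points held fixed, adding the third dimension raises the dimensionality of the candidate cluster but also raises its mean pairwise distance, which is the tension described above.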
In this paper, we propose an algorithm, IPROCLUS, which is based on PROCLUS [1]. We find that the closeness of points in different dimensions depends not only on the distance between them but also on the distribution of points along those dimensions. PROCLUS uses the Manhattan segmental distance, which loses its effectiveness when the points have very different variances along different dimensions. We propose a modified Manhattan segmental distance that is more accurate and meaningful in projected clustering. PROCLUS also depends strongly on two user inputs. In order to reduce the dependence on one of the user parameters,