step in the last phase to reduce the dependence on the user parameter l, which is
the average number of dimensions in a cluster. We also propose a new logic for
replacing bad medoids in the iterative phase, which is more time-efficient. The
overall pseudocode of our algorithm is given in Algorithm 9.1. The steps that
are new in IPROCLUS are underlined and are discussed in detail in this
section. Details of the methods used in both PROCLUS and IPROCLUS can be
found in [1].
Algorithm 9.1. IPROCLUS(No. of Clusters: k, Avg. Dimensions: l)
{C_i is the i-th cluster}
{D_i is the set of dimensions associated with cluster C_i}
{M_current is the set of medoids in current iteration}
{M_best is the best set of medoids found so far}
{N is the final set of medoids with associated dimensions}
{A, B are constant integers}
begin
  {1. Initialization Phase}
  S = random sample of size A × k
  Calculate the normalization factors for each dimension
  M = Greedy(S, B × k)
  {First Iteration}
  M_current = Random set of medoids {m_1, m_2, ..., m_k} ⊂ M
  {Approximate the optimal set of dimensions}
  for each medoid m_i in M_current do
    Let δ_i be the modified Manhattan segmental distance to the
    nearest medoid from m_i
    L_i = Points in sphere centered at m_i with radius δ_i
  end for
  L = {L_1, ..., L_k}
  (D_1, D_2, ..., D_k) = FindDimensions(k, l, L)
  {Form the clusters}
  (C_1, ..., C_k) = AssignPoints(D_1, D_2, ..., D_k)
  bestObjective = EvaluateClusters(C_1, ..., C_k, D_1, D_2, ..., D_k)
  M_best = M_current
  compute the bad medoids in M_best
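The initialization phase and the locality computation of the first iteration in Algorithm 9.1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names segmental_distance, greedy_medoids and localities are hypothetical, Greedy is assumed to be the farthest-first selection described for PROCLUS in [1], and the optional norm argument is only a stand-in for the per-dimension normalization factors, whose exact definition is not given in this excerpt. Points on the boundary of the sphere are counted as inside, which is also an assumption.

```python
import random


def segmental_distance(p, q, dims, norm=None):
    """Manhattan segmental distance: mean absolute difference over dims.

    norm is an optional per-dimension scale. It is an assumption standing
    in for the normalization factors, which this excerpt does not define.
    """
    total = 0.0
    for d in dims:
        scale = norm[d] if norm is not None else 1.0
        total += abs(p[d] - q[d]) / scale
    return total / len(dims)


def greedy_medoids(sample, count, rng=None):
    """Farthest-first selection of `count` well-separated candidates."""
    if count > len(sample):
        raise ValueError("count must not exceed the sample size")
    rng = rng or random.Random()
    dims = range(len(sample[0]))
    chosen = [rng.randrange(len(sample))]
    # dist[j] = distance from sample[j] to its closest chosen candidate
    dist = [segmental_distance(sample[chosen[0]], x, dims) for x in sample]
    while len(chosen) < count:
        nxt = max(range(len(sample)), key=lambda j: dist[j])
        chosen.append(nxt)
        for j, x in enumerate(sample):
            dist[j] = min(dist[j], segmental_distance(sample[nxt], x, dims))
    return [sample[i] for i in chosen]


def localities(points, medoids, norm=None):
    """Return (delta_i, L_i) for each medoid m_i; needs at least 2 medoids."""
    dims = range(len(points[0]))
    result = []
    for i, m in enumerate(medoids):
        # delta_i: distance from m_i to its nearest other medoid
        delta = min(
            segmental_distance(m, other, dims, norm)
            for j, other in enumerate(medoids)
            if j != i
        )
        # L_i: points inside the sphere of radius delta_i around m_i
        L_i = [p for p in points if segmental_distance(m, p, dims, norm) <= delta]
        result.append((delta, L_i))
    return result
```

In this sketch, localities measures distances over all dimensions because no dimension sets D_i exist yet in the first iteration; the resulting sets L_i are what FindDimensions would consume.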