A Projected Clustering Algorithm and Its Biomedical Application - Clustering Challenges in Biological Network


step in the last phase to reduce the dependence on the user parameter l, which is the average number of dimensions in a cluster. We also propose a new logic for replacing bad medoids in the iterative phase, which is more time efficient. The overall pseudocode of our algorithm is given in Algorithm 9.1. The steps that are new in IPROCLUS are underlined and will be discussed extensively in this section. Detailed information about the methods used in both PROCLUS and IPROCLUS can be found in [1].

Algorithm 9.1. IPROCLUS(No. of Clusters: k, Avg. Dimensions: l)

{ C_i is the i-th cluster }
{ D_i is the set of dimensions associated with cluster C_i }
{ M_current is the set of medoids in current iteration }
{ M_best is the best set of medoids found so far }
{ N is the final set of medoids with associated dimensions }
{ A, B are constant integers }
begin
    { 1. Initialization Phase }
    S = random sample of size A × k
    Calculate the normalization factors for each dimension
    M = Greedy(S, B × k)
    { First Iteration }
    M_current = Random set of medoids {m_1, m_2, ..., m_k} ⊂ M
    { Approximate the optimal set of dimensions }
    for each medoid m_i in M_current do
        Let δ_i be the modified Manhattan segmental distance to the nearest medoid from m_i
        L_i = Points in sphere centered at m_i with radius δ_i
    end for
    L = {L_1, ..., L_k}
    (D_1, D_2, ..., D_k) = FindDimensions(k, l, L)
    { Form the clusters }
    (C_1, ..., C_k) = AssignPoints(D_1, D_2, ..., D_k)
    bestObjective = EvaluateClusters(C_1, ..., C_k, D_1, D_2, ..., D_k)
    M_best = M_current
    compute the bad medoids in M_best
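The initialization step and the locality computation of the first iteration can be illustrated with a small Python sketch. It rests on several assumptions that this excerpt does not confirm: Greedy is taken to be the farthest-first selection described for PROCLUS, the distance is the plain Manhattan segmental distance (the "modified" variant and the per-dimension normalization factors are not defined here, so they are omitted), and the sphere includes points lying exactly on its boundary. The function names `segmental_distance`, `greedy`, and `localities` are hypothetical.

```python
import random


def segmental_distance(x, y, dims):
    """Plain Manhattan segmental distance: mean absolute difference over dims.

    Assumption: the 'modified' variant used by IPROCLUS is not reproduced here.
    """
    return sum(abs(x[d] - y[d]) for d in dims) / len(dims)


def greedy(sample, count, dims, rng):
    """Farthest-first selection of up to `count` well-separated points.

    Assumption: this mirrors the Greedy step as described for PROCLUS.
    """
    count = min(count, len(sample))
    first = rng.randrange(len(sample))
    chosen = [first]
    # nearest[i] = distance from sample[i] to its closest chosen point so far
    nearest = [segmental_distance(p, sample[first], dims) for p in sample]
    while len(chosen) < count:
        # Pick the point that is farthest from everything chosen so far.
        nxt = max(range(len(sample)), key=nearest.__getitem__)
        chosen.append(nxt)
        for i, p in enumerate(sample):
            nearest[i] = min(nearest[i], segmental_distance(p, sample[nxt], dims))
    return [sample[i] for i in chosen]


def localities(points, medoids, dims):
    """For each medoid m_i, return L_i: the points within delta_i of m_i,
    where delta_i is the distance from m_i to its nearest other medoid."""
    result = []
    for i, m in enumerate(medoids):
        delta = min(
            segmental_distance(m, other, dims)
            for j, other in enumerate(medoids)
            if j != i
        )
        # Boundary points are included (an assumption of this sketch).
        result.append([p for p in points if segmental_distance(p, m, dims) <= delta])
    return result
```

A call such as `greedy(points, B * k, dims, random.Random(0))` would produce the candidate set M, from which k medoids are drawn at random before `localities` is applied to them. The resulting lists L_1, ..., L_k are the inputs that FindDimensions uses to choose the dimensions of each cluster.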
