efficiency and product quality [19]. Several parameters, such as connectivity,
intensity, and distance among data characteristics, determine the level of similarity.
In most clustering methods each data element belongs to exactly one cluster, which is
known as hard clustering. Fuzzy clustering, by contrast, is a soft clustering method
that calculates the degree to which each data element ( X = x_1 , x_2 , …, x_n ) belongs
to each of the specified clusters ( C = c_1 , c_2 , …, c_c ), expressed as membership
values ( M = m_1 , m_2 , …, m_n ) that range from zero to one. In this method, a data
element can belong to more than one cluster at the same time. C-means clustering is one
of the most important fuzzy clustering techniques; it was developed in 1973 [25] and
improved in 1981 [26]. A variety of applications have used this method. Its aim is to
minimize the objective function shown in Eq. 1.
$$J_m = \sum_{i=1}^{n} \sum_{j=1}^{c} u_{ij}^{m} \, \lVert x_i - c_j \rVert^{2}, \qquad 1 \le m < \infty \tag{1}$$
Here u_ij is the degree of membership of x_i in cluster j, c_j is the center of that
cluster, and || x_i − c_j || is a norm expressing the similarity between the data
element x_i and the center c_j.
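As an illustration of Eq. 1, the following minimal sketch evaluates the objective for a given set of memberships and centers. It assumes the squared Euclidean norm and a fuzziness exponent m = 2; the function names and the toy data are hypothetical and do not come from the chapter.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def fcm_objective(points, centers, memberships, m=2.0):
    """Evaluate J_m = sum_i sum_j u_ij^m * ||x_i - c_j||^2.

    memberships[i][j] is u_ij, the membership of points[i] in cluster j.
    The Euclidean norm is an assumption made for this sketch.
    """
    if m < 1:
        raise ValueError("the fuzziness exponent m must satisfy m >= 1")
    total = 0.0
    for x, row in zip(points, memberships):
        for c, u in zip(centers, row):
            total += (u ** m) * squared_distance(x, c)
    return total


# Toy data (hypothetical): three points on a line and two cluster centers.
points = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
centers = [(0.0, 0.0), (4.0, 0.0)]

# Hard clustering: every point belongs to exactly one cluster.
hard = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
# Fuzzy clustering: the middle point is shared between both clusters.
fuzzy = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]

j_hard = fcm_objective(points, centers, hard)
j_fuzzy = fcm_objective(points, centers, fuzzy)
```

With these centers held fixed, letting the middle point share a small part of its membership with the second cluster gives a lower objective value than the hard assignment, which is the effect the minimization exploits.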
3.1 C-means Clustering Algorithm
In C-means clustering, a set of random initial membership values ( U^(0) = [u_ij] ) is
first generated for each data element x_i and each cluster c_j . At iteration k, the
center vector of each cluster is calculated from U^(k) using Eq. 2. The membership
values are then updated from U^(k) to U^(k+1) according to Eq. 3. Finally, if the
difference between U^(k) and U^(k+1) is less than the threshold, the iteration stops;
otherwise, new cluster centers are computed with Eq. 2 and the procedure repeats.
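$$c_j = \frac{\sum_{i=1}^{n} u_{ij}^{m} \, x_i}{\sum_{i=1}^{n} u_{ij}^{m}} \tag{2}$$

$$u_{ij} = \frac{1}{\sum_{l=1}^{c} \left( \dfrac{\lVert x_i - c_j \rVert}{\lVert x_i - c_l \rVert} \right)^{\frac{2}{m-1}}} \tag{3}$$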
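The iteration described above can be sketched in plain Python as follows. This is an illustrative sketch, not the authors' implementation: it assumes the standard fuzzy C-means updates for Eq. 2 and Eq. 3, the Euclidean norm, m = 2, and the largest absolute change in any membership value as the stopping measure. All function names, default parameters, and the sample data are hypothetical. Note that the membership update requires m > 1.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def update_centers(points, U, n_clusters, m):
    """Center update (assumed form of Eq. 2): mean of points weighted by u_ij^m."""
    dim = len(points[0])
    centers = []
    for j in range(n_clusters):
        weights = [row[j] ** m for row in U]
        total = sum(weights)
        centers.append(tuple(
            sum(w * p[d] for w, p in zip(weights, points)) / total
            for d in range(dim)
        ))
    return centers


def update_memberships(points, centers, m):
    """Membership update (assumed form of Eq. 3)."""
    # Working on squared distances, so the exponent 2/(m-1) becomes 1/(m-1).
    exponent = 1.0 / (m - 1.0)
    U = []
    for x in points:
        d2 = [sq_dist(x, c) for c in centers]
        if any(d == 0.0 for d in d2):
            # The point coincides with one or more centers: split full
            # membership among them to avoid dividing by zero.
            hits = [d == 0.0 for d in d2]
            share = 1.0 / sum(hits)
            U.append([share if h else 0.0 for h in hits])
        else:
            U.append([
                1.0 / sum((dj / dl) ** exponent for dl in d2)
                for dj in d2
            ])
    return U


def fuzzy_c_means(points, n_clusters, m=2.0, tol=1e-5, max_iter=100, seed=0):
    """Run the iteration until memberships change by less than tol."""
    if m <= 1:
        raise ValueError("the membership update requires m > 1")
    rng = random.Random(seed)
    # Random initial memberships U^(0); each row is normalized to sum to one.
    U = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        s = sum(row)
        U.append([v / s for v in row])
    for _ in range(max_iter):
        centers = update_centers(points, U, n_clusters, m)
        new_U = update_memberships(points, centers, m)
        diff = max(abs(a - b)
                   for old, new in zip(U, new_U)
                   for a, b in zip(old, new))
        U = new_U
        if diff < tol:
            break
    return centers, U


# Hypothetical sample data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, U = fuzzy_c_means(points, n_clusters=2)
```

Each row of U holds the memberships of one point and sums to one, so a hard assignment can be recovered, if needed, by taking the cluster with the largest membership in each row.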
4 Proposed Method
Clustering of software projects is the key part of the estimation method proposed in
this section. To overcome the diversity and inconsistency of the projects collected
in a dataset, the outliers and irrelevant projects must be separated from the others.
Clustering the modules increases their consistency by placing similar modules in the
same cluster. Instead of a single dataset that includes numerous irrelevant and
inconsistent modules, there will be several subsets comprising consistent and similar
modules. The clustering process is performed by analyzing the features of the modules
to discriminate the most similar modules and