Image Processing Reference
In-Depth Information
where m_i is the mean of cluster C_i and x are the data points, is used. The distance
between two clusters is then given by
d_Ward(C_1, C_2) = ESS(C_1 ∪ C_2) − ESS(C_1) − ESS(C_2)    (16.9)
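Equation 16.9 can be sketched in Python. The ESS helper below uses squared Euclidean distances of the points to the cluster mean, as defined above; the function names are illustrative, not from the text:

```python
import numpy as np

def ess(cluster):
    """Error sum of squares: squared distances of the points to the cluster mean m_i."""
    cluster = np.asarray(cluster, dtype=float)
    m = cluster.mean(axis=0)                     # cluster mean m_i
    return float(((cluster - m) ** 2).sum())

def ward_distance(c1, c2):
    """Equation 16.9: the increase in ESS caused by merging clusters c1 and c2."""
    merged = np.vstack([np.asarray(c1, float), np.asarray(c2, float)])
    return ess(merged) - ess(c1) - ess(c2)

# Two tight 2-D clusters far apart: merging them produces a large ESS increase.
a = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1]]
b = [[5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
```

Merging two copies of the same cluster leaves the mean unchanged, so the distance is zero; merging distant clusters gives a large positive value.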
The Ward method therefore merges, at each step, the two clusters whose fusion produces
the smallest increase in heterogeneity. Several applications of hierarchical clustering
to fMRI data analysis can be found [11,13-15]. In an interesting application,
Reference 13 proposed a hierarchical clustering approach for resting-state data,
by means of a single-link method applied to the time series. As a distance measure
between two voxels, the correlation coefficient between the corresponding time series
was used, restricted to frequencies below 0.1 Hz. This frequency range was found to
contribute to resting-state connectivity [51]; the method requires a high sampling rate
in order to isolate the signal contributions related to heart and respiratory activities.
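A distance measure of this kind can be sketched as follows: the time series are low-pass filtered in the frequency domain before the correlation coefficient is computed. The sampling interval (`tr`) and the 0.1 Hz cutoff are illustrative parameters; the function name is not from the text:

```python
import numpy as np

def lowpass_correlation(ts1, ts2, tr=2.0, cutoff=0.1):
    """Correlation between two voxel time series using only frequencies
    below `cutoff` Hz. `tr` is the sampling interval in seconds."""
    freqs = np.fft.rfftfreq(len(ts1), d=tr)
    keep = freqs < cutoff

    def lowpass(ts):
        spec = np.fft.rfft(np.asarray(ts, float) - np.mean(ts))
        spec[~keep] = 0.0                       # zero out frequencies >= cutoff
        return np.fft.irfft(spec, n=len(ts))

    f1, f2 = lowpass(ts1), lowpass(ts2)
    return float(np.corrcoef(f1, f2)[0, 1])
```

For example, two series that share a slow 0.02 Hz component but carry opposite 0.2 Hz components are nearly uncorrelated overall, yet strongly correlated below 0.1 Hz.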
16.3.2.2 Hard Partitioning Methods
Partitioning methods result in a single partition instead of a treelike structure: the
main difference is that the number of clusters must be specified in advance, and
each element of the data set is evaluated individually and assigned to a
cluster in order to minimize a cost criterion that can be evaluated locally or
globally. This is accomplished through an iterative process; starting from an initial
guess for cluster centers, the clusters can be modified until a stable solution is
found. The iterative process can be started from a random cluster initialization,
and the final partition may depend on this first step. To reduce this dependence,
several runs with different random initializations can be performed, or a first
hierarchical clustering step can be applied to provide the initial centers.
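A minimal sketch of this iterative partitioning scheme, assuming Euclidean distances and random initialization from the data points (all names are illustrative; the k-means method discussed below is the canonical instance):

```python
import numpy as np

def kmeans(data, k, n_iter=100, seed=0):
    """Iterative partitioning: random initial centers, then alternate
    (1) assigning each point to its nearest center and
    (2) recomputing each center as the mean of its assigned points,
    until the partition stops changing."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    centers = data[rng.choice(len(data), size=k, replace=False)]
    labels = np.zeros(len(data), dtype=int)
    for it in range(n_iter):
        # distance of every point to every center
        dists = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        if it > 0 and np.array_equal(new_labels, labels):
            break                                # stable partition found
        labels = new_labels
        for j in range(k):
            if np.any(labels == j):
                centers[j] = data[labels == j].mean(axis=0)
    return labels, centers
```

At convergence each point is assigned to its nearest center, which is the stability condition the loop tests for.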
Partitioning methods are more efficient than hierarchical ones, which are
computationally demanding for large data sets. One of the most used partitioning
algorithms is the k -means method [52], which iteratively minimizes the within-
class inertia. The algorithm starts with an initial guess for the k centers chosen
randomly from the data set. In the second step, each element of the data set is
assigned to the nearest center according to a chosen distance metric. The centers
are then recomputed as the average of the elements in the cluster. The algorithm
repeats these last two steps until the partition does not significantly change. In
Figure 16.4 the results of the k -means algorithm for two-dimensional objects
grouped in three clusters are shown. Although this method is fast and results in
homogeneous clusters, its main drawback is that the number of clusters must be
specified in advance. The algorithm can be modified to overcome this problem:
such a modification is implemented in the ISODATA algorithm [53], which splits
a cluster when its variance is above a threshold and merges two clusters when
the distance between their centers is smaller than a threshold. In the context of
fMRI data analysis, a comparison between a hierarchical algorithm and the k -means
can be found in Reference 11. The two approaches are seen as complementary: the
k -means is powerful and results in homogeneous clusters but requires that the
number of clusters be chosen a priori , whereas the hierarchical approach allows