Digital Signal Processing Reference
In-Depth Information
Several general-purpose algorithms and techniques have been developed for image
segmentation. Since there is no general solution to the image segmentation problem,
these techniques often have to be combined with domain knowledge in order to
effectively solve an image segmentation problem for a problem domain.
Clustering methods
The K-means algorithm is an iterative technique that is used to partition an image into K
clusters. The basic algorithm is:
1. Pick K cluster centers, either randomly or based on some heuristic
2. Assign each pixel in the image to the cluster that minimizes the distance between
the pixel and the cluster center
3. Re-compute the cluster centers by averaging all of the pixels in the cluster
4. Repeat steps 2 and 3 until convergence is attained (e.g. no pixels change clusters)
In this case, distance is the squared or absolute difference between a pixel and a cluster
center. The difference is typically based on pixel color, intensity, texture, and location, or
a weighted combination of these factors. K can be selected manually, randomly, or by a
heuristic.
This algorithm is guaranteed to converge, but it may not return the optimal solution. The
quality of the solution depends on the initial set of clusters and the value of K .
In statistics and machine learning, the k-means algorithm is a clustering algorithm to
partition n objects into k clusters, where k < n. It is similar to the expectation-
maximization algorithm for mixtures of Gaussians in that they both attempt to find the
centers of natural clusters in the data. The model requires that the object attributes
correspond to elements of a vector space. The objective it tries to achieve is to minimize
total intra-cluster variance, or, the squared error function. The k-means clustering was
invented in 1956. The most common form of the algorithm uses an iterative refinement
heuristic known as Lloyd's algorithm. Lloyd's algorithm starts by partitioning the input
points into k initial sets, either at random or using some heuristic data. It then calculates
the mean point, or centroid, of each set. It constructs a new partition by associating each
point with the closest centroid. Then the centroids are recalculated for the new clusters,
and algorithm repeated by alternate application of these two steps until convergence,
which is obtained when the points no longer switch clusters (or alternatively centroids are
no longer changed). Lloyd's algorithm and k-means are often used synonymously, but in
reality Lloyd's algorithm is a heuristic for solving the k-means problem, as with certain
combinations of starting points and centroids, Lloyd's algorithm can in fact converge to
the wrong answer. Other variations exist, but Lloyd's algorithm has remained popular,
because it converges extremely quickly in practice. In terms of performance the
algorithm is not guaranteed to return a global optimum. The quality of the final solution
depends largely on the initial set of clusters, and may, in practice, be much poorer than
the global optimum. Since the algorithm is extremely fast, a common method is to run the
algorithm several times and return the best clustering found. A drawback of the k-means
Search WWH ::




Custom Search