Segmentation (Image Processing) - Important Concepts in Signal Processing, Image Processing and Data Compression

Digital Signal Processing Reference

In-Depth Information

Several general-purpose algorithms and techniques have been developed for image

segmentation. Since there is no general solution to the image segmentation problem,

these techniques often have to be combined with domain knowledge in order to

effectively solve an image segmentation problem for a problem domain.

Clustering methods

The K-means algorithm is an iterative technique that is used to partition an image into K

clusters. The basic algorithm is:

1. Pick K cluster centers, either randomly or based on some heuristic

2. Assign each pixel in the image to the cluster that minimizes the distance between

the pixel and the cluster center

3. Re-compute the cluster centers by averaging all of the pixels in the cluster

4. Repeat steps 2 and 3 until convergence is attained (e.g. no pixels change clusters)

In this case, distance is the squared or absolute difference between a pixel and a cluster

center. The difference is typically based on pixel color, intensity, texture, and location, or

a weighted combination of these factors. K can be selected manually, randomly, or by a

heuristic.

This algorithm is guaranteed to converge, but it may not return the optimal solution. The

quality of the solution depends on the initial set of clusters and the value of K .

In statistics and machine learning, the k-means algorithm is a clustering algorithm to

partition n objects into k clusters, where k < n. It is similar to the expectation-

maximization algorithm for mixtures of Gaussians in that they both attempt to find the

centers of natural clusters in the data. The model requires that the object attributes

correspond to elements of a vector space. The objective it tries to achieve is to minimize

total intra-cluster variance, or, the squared error function. The k-means clustering was

invented in 1956. The most common form of the algorithm uses an iterative refinement

heuristic known as Lloyd's algorithm. Lloyd's algorithm starts by partitioning the input

points into k initial sets, either at random or using some heuristic data. It then calculates

the mean point, or centroid, of each set. It constructs a new partition by associating each

point with the closest centroid. Then the centroids are recalculated for the new clusters,

and algorithm repeated by alternate application of these two steps until convergence,

which is obtained when the points no longer switch clusters (or alternatively centroids are

no longer changed). Lloyd's algorithm and k-means are often used synonymously, but in

reality Lloyd's algorithm is a heuristic for solving the k-means problem, as with certain

combinations of starting points and centroids, Lloyd's algorithm can in fact converge to

the wrong answer. Other variations exist, but Lloyd's algorithm has remained popular,

because it converges extremely quickly in practice. In terms of performance the

algorithm is not guaranteed to return a global optimum. The quality of the final solution

depends largely on the initial set of clusters, and may, in practice, be much poorer than

the global optimum. Since the algorithm is extremely fast, a common method is to run the

algorithm several times and return the best clustering found. A drawback of the k-means

Search WWH ::

Custom Search

Home