Segmentation (Image Processing)

The goal of image segmentation is to divide an image into constituent parts that correspond to objects within the image. Once these objects have been extracted from the scene, information about them, such as their location, orientation, and area, may be gleaned and used towards a specific application. Typically this involves passing such information back to some higher-level processing code that then uses this data to make decisions pertaining to the application at hand.

Image segmentation is a vast topic; however, the classes of algorithms fall roughly under two categories: those that are based on abrupt changes in intensity and those where the image is divided into regions that are similar in accordance with a predefined criterion. Edge detection, the topic of 5.1, falls under the first category. Here we focus on the second category of segmentation algorithms, and in particular on histogram thresholding. Histogram thresholding was introduced in 3.2 within the context of the window/level technique, and thresholding of the gradient was used in 5.1 to turn edge-enhanced images into binary images. In 3.2, a human operator was assumed to be in the loop to alter the window and level parameters. In this section, the focus is on autonomous methods of inspecting the histogram to choose an appropriate threshold value for use in image segmentation. The techniques developed here will also be used to improve the Sobel edge detection algorithm (see Algorithm 5-1).


Image segmentation applications are too numerous to list exhaustively. Some noted examples include:

• X-ray luggage inspection: Inspection of carry-on luggage is an essential airport security measure, and various image processing techniques are used to aid the screener in interpretation of these images. Automatic or semi-automatic image processing systems are in use today to help combat concealment of weapons and dangerous items. Obviously, the images are quite complex but sophisticated segmentation algorithms are used to help separate certain dangerous materials from innocuous items.

• Microscopy: Segmentation is often employed to separate specimens and various structures from the background of images acquired from a microscope.

• Medical imaging: In computer-aided diagnosis, segmentation algorithms operating on radiographs or x-ray projection images are used for automatic identification of pathological lesions in the breast or lung.

• Industrial applications: Automated defect inspections of silicon wafers or electronic assemblies via image segmentation have found widespread use in the semiconductor and other industries.

• Retinal image processing: Interestingly enough, retinal image processing is an active topic of research, and segmentation plays an important role in two rather different camps. In biometric applications, various structures in the retina form a distinct signature along the lines of one’s fingerprint, so extraction of this signature aids immensely in automated identification of individuals. Along different lines, extraction of the retinal blood vessel tree, the optic nerve, and various other structures in the human retina is performed as part of image-guided laser treatments and computer-aided diagnosis. See Figure 5-13 for an example.

It should be noted that a universal and foolproof segmentation algorithm, guaranteed to work on all images, is not a feasible design goal. It is almost always the case that some set of heuristics or assumptions, based on specific knowledge of the problem domain, is by necessity incorporated into the overall process. Complex scenes are extraordinarily difficult to segment, and the goal should be to tune and tweak segmentation algorithms so that they work robustly and reliably for the images one expects to encounter within a specific application.


Figure 5-13. Blood-vessel segmentation of a retinal image (images courtesy Adam Hoover of Clemson University and Michael Goldbaum of UC San Diego). (a) Green channel of an RGB retina image. (b) Segmented image, produced using the method reported in [16], showing just the blood vessels.

 


Figure 5-14. Segmentation of an image with a strong bimodal distribution. (a) Unprocessed coins image. (b) Bimodal histogram, with the two strong peaks representing the background and foreground gray-level values, respectively. The threshold value T1 serves to separate the two portions of the image. (c) Result of thresholding the image using a value of T1 = 180.

Thresholding

Image thresholding, the process whereby all pixels in an image less than some value T are set to zero, or alternatively all pixels greater than some value T are set to zero, has been encountered numerous times previously in this text. Gray-level thresholding is the simplest possible segmentation process, and it formed an important part of the Sobel edge detector that was the focus of 5.1. The general idea behind using a threshold to separate objects from a background is conceptually simple – consider the histogram of an image with light objects on a dark background, as shown in Figure 5-14. The image in 5-14a consists of multiple coins with approximately the same pixel intensity, differing from the intensity level of the background, which results in a bimodal histogram exhibiting two distinct peaks, or modes. One of these modes corresponds to the background area and the other to the objects of interest (in this case, the coins). Hence, it follows that a single threshold value divides the image into separate regions that are homogeneous with respect to brightness, as shown in 5-14c, where a value of T1 serves to discriminate the coins from the grayish background.

While a single threshold value is sufficient to segment an image whose histogram is bimodal, as one would expect it is also quite possible to have histograms that are multi-modal in nature. Figure 5-15 illustrates this scenario for the case of a histogram exhibiting three modes, with two types of light objects on a dark background. In this case, the basic thresholding approach is extended to use two thresholds, so that for an image with a dark background, the two-level threshold segmentation scheme classifies a point f(i,j) in the image as background, object class 1, or object class 2 according to the following criteria:

background        if f(i,j) ≤ T1
object class 1    if T1 < f(i,j) ≤ T2
object class 2    if f(i,j) > T2

This scheme is known as multi-level thresholding and in general is far less reliable than using a single threshold value.
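The two-level classification above reduces to a simple per-pixel comparison. The following C sketch is purely illustrative (the function name, the flat 8-bit image layout, and the output labels 0, 1, and 2 are assumptions, not part of the text):

#include <stdint.h>
#include <stddef.h>

/* Classify each pixel of an 8-bit image as background (0), object class 1 (1),
 * or object class 2 (2) using two thresholds T1 < T2, per the criteria above. */
void two_level_threshold(const uint8_t *in, uint8_t *out, size_t npixels,
                         uint8_t T1, uint8_t T2)
{
    for (size_t i = 0; i < npixels; ++i) {
        if (in[i] <= T1)
            out[i] = 0;          /* background */
        else if (in[i] <= T2)
            out[i] = 1;          /* object class 1 */
        else
            out[i] = 2;          /* object class 2 */
    }
}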

Autonomous Threshold Detection Algorithms

The central problem afflicting the Sobel edge detection routine of Algorithm 5-1 is still unresolved – namely, how does a computer (or DSP) find a good threshold? If the image is guaranteed to be of high contrast, then simply selecting a brightness threshold somewhere within the middle of the dynamic range may be sufficient (i.e. 128 for 8-bit images). Obviously this is not always going to be the case, and in this section we explore algorithms that derive a "good" threshold from the histogram of the image, where a good threshold is one that keeps the number of falsely classified pixels to a minimum. As was alluded to in the introduction to this section, using properties of the image that are known a priori can greatly aid in the selection of a good threshold value. For example, in optical character recognition (OCR) applications, it may be known that text covers 1/p of the total canvas area. It then follows that the optimal strategy for OCR is to select a threshold value such that 1/p of the image area has pixel intensities less than some threshold T (assuming the text is dark and the sheet is white), which is easily determined through inspection of the histogram. This method is known as p-tile thresholding.

Alternative techniques relying on histogram shape analysis are used when such knowledge is not available. One method that has been shown to work well under a large variety of image contrast conditions is the iterative isodata clustering algorithm of Ridler and Calvard [17]. The histogram is initially segmented into two sections starting with an initial threshold value T0 such as 2^(bpp-1), or half the dynamic range. The algorithm then proceeds as follows:


Figure 5-15. Segmentation of an image with a trimodal distribution. (a) Unprocessed bottlecaps image. (b) Trimodal histogram, with two strong peaks representing the two object types and a third peak corresponding to the background. The two threshold values T1 and T2 can be used to separate out both object types from the background. (c) Binary image resulting from thresholding the image using values of T1 = 125 and T2 = 175; that is, all pixels outside the range [T1, T2] are set to zero and the remaining non-zero pixels are set to a logical value of 1. (d) Result of thresholding the image using a value of T2 = 175; all pixels less than this gray-level intensity are set to zero and the remaining pixels are set to a logical value of 1.

1. Using the current threshold T, compute the sample mean mf of the pixel intensities of the foreground (those pixels greater than T)

2. Compute the sample mean mb of the pixel intensities of the background (those pixels less than or equal to T)

3. Set the new threshold to T' = (mf + mb) / 2

4. If T' = T, terminate the iteration; otherwise set T = T' and go to step 1
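These steps translate directly into code once the image histogram has been computed. The following C sketch is illustrative only (the function name, the 256-bin 8-bit histogram, and the convention that the background comprises all bins at or below the current threshold are assumptions made here, not taken from Algorithm 5-2):

#include <stdint.h>

/* Isodata threshold detection on a 256-bin histogram of an 8-bit image.
 * Starts at half the dynamic range and iterates until the threshold stops
 * changing or max_iter is reached. */
int isodata_threshold(const unsigned long hist[256], int max_iter)
{
    int T = 128;                                  /* T0 = half the dynamic range */
    for (int iter = 0; iter < max_iter; ++iter) {
        unsigned long nb = 0, nf = 0;             /* pixel counts */
        unsigned long long sb = 0, sf = 0;        /* intensity sums */

        /* background: bins [0, T]; foreground: bins (T, 255] */
        for (int b = 0; b <= T; ++b) {
            nb += hist[b];
            sb += (unsigned long long)b * hist[b];
        }
        for (int b = T + 1; b < 256; ++b) {
            nf += hist[b];
            sf += (unsigned long long)b * hist[b];
        }
        if (nb == 0 || nf == 0)
            break;                                /* degenerate histogram */

        int mb = (int)(sb / nb);                  /* mean background intensity */
        int mf = (int)(sf / nf);                  /* mean foreground intensity */
        int Tnext = (mb + mf) / 2;                /* step 3 */
        if (Tnext == T)                           /* step 4: converged */
            break;
        T = Tnext;
    }
    return T;
}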

Algorithm 5-2 is a complete image segmentation algorithm that estimates the background/object distinction using the isodata method to automatically determine the threshold value.

Algorithm 5-2: Threshold detection using the isodata method.

INPUT: q-bit image I, length K of smoothing filter, max iterations N
OUTPUT: threshold T


Most threshold detection algorithms based on histogram shape analysis benefit from applying a smoothing filter to the histogram. Smoothing the histogram removes small fluctuations from the signal that tend to inject noise into the algorithm, and Algorithm 5-2 accomplishes this by computing a running average of the raw histogram. If H is a histogram with b = 2^bpp bins, then for a smoothing window of odd length K the running average

Hs[i] = (1/K) Σ H[i + k],   where the sum runs over k = -(K-1)/2, ..., (K-1)/2

smoothes the histogram and, in effect, low-pass filters the one-dimensional signal H. In fact, the above expression describes a one-dimensional convolution of H with a box filter of length K, or in other words a moving-average FIR filter. Figure 5-16 illustrates this algorithm's effectiveness on a few different images, and in 5.2.5.1 we shall incorporate the isodata threshold detection routine into the Sobel edge detector to increase its practicality.
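A minimal C sketch of this smoothing step follows (illustrative only; the function name is an assumption, and bins falling outside the histogram are treated as zero, a boundary convention not spelled out in the text):

/* Smooth a histogram H of nbins entries with a length-K box filter (K odd),
 * writing the running average into Hs. Out-of-range bins are taken as zero. */
void smooth_histogram(const unsigned long *H, double *Hs, int nbins, int K)
{
    int half = K / 2;
    for (int i = 0; i < nbins; ++i) {
        unsigned long sum = 0;
        for (int k = -half; k <= half; ++k) {
            int j = i + k;
            if (j >= 0 && j < nbins)
                sum += H[j];
        }
        Hs[i] = (double)sum / (double)K;
    }
}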

The triangle algorithm, attributed to Zack [18], is particularly effective when the image histogram exhibits a long tail, with the foreground objects forming a rather weak peak in comparison to a larger peak consisting of the background pixels. Conceptually, the triangle algorithm works in the following fashion, and is illustrated in diagrammatic form in Figure 5-17:

1. Fit a line between the peak of the histogram at bin bmax and the end of the longer tail of the histogram at bin bmin

2. Calculate the perpendicular distance db between this line and the histogram value at each bin b lying between bmin and bmax

3. Set the threshold at the bin where the maximum distance db is found
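A C sketch of the triangle method is given below. It is illustrative only: it assumes a 256-bin histogram whose longer tail lies to the right of the peak (a left-handed tail can be handled by flipping the histogram first), and it exploits the fact that the denominator of the point-to-line distance is the same for every bin, so only the numerator needs to be compared:

#include <math.h>

/* Triangle threshold detection on a 256-bin histogram.
 * Assumes the longer tail lies to the RIGHT of the histogram peak. */
int triangle_threshold(const unsigned long hist[256])
{
    /* locate the peak (bmax) and the end of the longer tail (bmin) */
    int bpeak = 0, btail = 255;
    for (int b = 0; b < 256; ++b)
        if (hist[b] > hist[bpeak])
            bpeak = b;
    while (btail > bpeak && hist[btail] == 0)
        --btail;

    /* line from (bpeak, hist[bpeak]) to (btail, hist[btail]) */
    double dx = (double)(btail - bpeak);
    double dy = (double)hist[btail] - (double)hist[bpeak];

    int best_bin = bpeak;
    double best_num = -1.0;
    for (int b = bpeak; b <= btail; ++b) {
        /* numerator of the perpendicular distance from (b, hist[b]) to the line;
         * the constant denominator sqrt(dx*dx + dy*dy) can be ignored */
        double num = fabs(dy * (double)(b - bpeak)
                          - dx * ((double)hist[b] - (double)hist[bpeak]));
        if (num > best_num) {
            best_num = num;
            best_bin = b;
        }
    }
    return best_bin;
}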


Figure 5-16. Examples of image segmentation using the isodata threshold detection algorithm. Top image is the raw, unprocessed image; middle plot is the histogram of this image; and bottom image is the binary segmented image. Images a and b courtesy of Professor Perona, Computational Vision at Caltech, http://www.vision.caltech.edu. (a) T = 120. (b) T = 177. (c) T = 143.


Figure 5-17. Single iteration of the triangle threshold detection algorithm.

Adaptive Thresholding

In general, a major problem with global threshold segmentation techniques is that they rely on objects in the image being roughly of the same brightness level. In most real-world imaging environments, gray-level variations are to be expected, and such variations tend to alter the histogram shape in a manner not amenable to global threshold techniques. One rather common source of trouble, especially in retinal image processing and face recognition systems, is non-uniform illumination, whose effects on image thresholding are illustrated in Figure 5-18. There are means of compensating for this effect, and such techniques should be attempted prior to applying the more sophisticated threshold detection algorithms that will be described shortly. One means of compensating for nonuniformity of illumination is to calibrate the system by taking a flat-field image and then using this image to normalize the captured image. This method of course assumes that the illumination pattern remains constant. If, on the other hand, the nonuniformity changes from image to image, it may be possible to measure or estimate the background illumination, which can then be subtracted from the image. There are a variety of other techniques, including gamma intensity correction, modified histogram equalization, and others specific to various applications, that also find common use. Figure 5-19 is an example of such a method from the field of retinal image processing.
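As a concrete example of the flat-field approach, the following C sketch is illustrative only (the function name is an assumption, and the output is rescaled by the mean of the flat-field image so that it remains in the 8-bit range):

#include <stdint.h>
#include <stddef.h>

/* Flat-field correction: normalize a captured image by a calibration image of
 * a uniformly lit, featureless target. */
void flat_field_correct(const uint8_t *img, const uint8_t *flat,
                        uint8_t *out, size_t npixels)
{
    unsigned long long sum = 0;
    for (size_t i = 0; i < npixels; ++i)
        sum += flat[i];
    double mean = (double)sum / (double)npixels;  /* mean flat-field intensity */

    for (size_t i = 0; i < npixels; ++i) {
        double denom = (flat[i] > 0) ? (double)flat[i] : 1.0;  /* avoid divide-by-zero */
        double v = (double)img[i] * mean / denom;
        out[i] = (uint8_t)(v > 255.0 ? 255.0 : v);
    }
}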

When none of these methods are available, adaptive thresholding can be used to handle variations in the histogram shape. The general approach is to divide the image into a set of non-overlapping subimages, and then determine a suitable threshold independently for each subimage. The segmentation occurs by processing each subimage separately, with respect to its local threshold. An example is shown in Figure 5-20.

While conceptually simple, adaptive thresholding in reality is not as straightforward as one might expect. For starters, great care must be taken to choose subimages of sufficient size, so that each histogram contains enough information to make a decent estimate of a meaningful threshold value. It still may be the case that for some of the subimages a threshold cannot be found (e.g. background portions of the image that are more or less constant).
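A C sketch of this tiling scheme is shown below, reusing the isodata routine sketched earlier in this section to find each local threshold; the tile dimensions and function names are illustrative assumptions:

#include <stdint.h>

int isodata_threshold(const unsigned long hist[256], int max_iter);

/* Adaptive thresholding: split the image into non-overlapping tiles of
 * tile_w x tile_h pixels, compute a local histogram and isodata threshold for
 * each tile, and binarize the tile with its own threshold. */
void adaptive_threshold(const uint8_t *in, uint8_t *out,
                        int width, int height, int tile_w, int tile_h)
{
    for (int ty = 0; ty < height; ty += tile_h) {
        for (int tx = 0; tx < width; tx += tile_w) {
            int h = (ty + tile_h < height) ? tile_h : height - ty;
            int w = (tx + tile_w < width)  ? tile_w : width - tx;

            /* local histogram for this subimage */
            unsigned long hist[256] = { 0 };
            for (int y = 0; y < h; ++y)
                for (int x = 0; x < w; ++x)
                    hist[in[(ty + y) * width + (tx + x)]]++;

            int T = isodata_threshold(hist, 100);

            /* binarize the subimage against its local threshold */
            for (int y = 0; y < h; ++y)
                for (int x = 0; x < w; ++x) {
                    int idx = (ty + y) * width + (tx + x);
                    out[idx] = (in[idx] > T) ? 1 : 0;
                }
        }
    }
}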


Figure 5-18. Effect of uneven illumination on image histograms. (a) Test image with completely uniform illumination. (b) Histogram of test image, which could be easily segmented using a variety of automatic threshold detectors. (c) Test image with an uneven illumination field. (d) Histogram of test image with uneven illumination. The two peaks corresponding to the background and foreground have now blurred into one another, which would cause problems for image segmentation algorithms relying on a single, global threshold value.
