Edge enhancement and edge detection are fundamental operations in image processing. Many edge detection operators were devised early in the age of computerized image processing, and most build on finite differences. In all cases, special provisions are taken to ensure that edges are enhanced irrespective of their orientation. One of the earliest edge detectors is Roberts’ cross,21 a finite-difference operator that computes two gradients, G1 and G2, in perpendicular directions by using forward differences:
The square roots in Equation (2.18) are intended to approximate a gamma-style contrast enhancement for the darker tones.21 The two gradient values were then combined into the final gradient magnitude G = sqrt(G1² + G2²). Since the computation of a square root was a very computationally expensive operation, alternative formulations used the maximum of the absolute values of G1 and G2 or the sum of the absolute values of G1 and G2. These formulations made Roberts’ cross dependent on the orientation of the edge. Furthermore, this operator is very sensitive to noise, and the use of discrete differences with a single-pixel distance shifts the edge by one-half pixel. By using central differences, the pixel shift is eliminated and the noise sensitivity is somewhat reduced, because the central difference is the average of the forward and backward differences, as can be demonstrated in one dimension as:
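As an illustration, the original square-root formulation of Roberts’ cross can be sketched in a few lines of NumPy; the function name and the test image are illustrative choices, not from the original source:

```python
import numpy as np

def roberts_cross(image):
    """Roberts' cross edge magnitude using forward differences.

    G1 and G2 are the finite differences along the two diagonals of
    each 2x2 pixel block; the square-root formulation combines them
    into a single, orientation-independent gradient magnitude.
    """
    img = image.astype(float)
    g1 = img[:-1, :-1] - img[1:, 1:]    # first diagonal difference
    g2 = img[:-1, 1:] - img[1:, :-1]    # perpendicular diagonal difference
    return np.sqrt(g1**2 + g2**2)

# A vertical step edge: the response appears only along the edge.
step = np.zeros((4, 4))
step[:, 2:] = 10.0
print(roberts_cross(step))
```

Note that the output image is one row and one column smaller than the input, and the response sits between the two pixel columns of the step, illustrating the half-pixel shift discussed above.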
The use of central differences leads directly to the closely related Sobel, Prewitt, compass, and Kirsch operators. Prewitt19 and Sobel24 proposed averaging neighboring values perpendicular to the direction of the central difference and suggested a convolution with the gradient detector kernels GX and GY to obtain convolution results with emphasized edges of horizontal and vertical orientation, respectively:
where a = 1 for the Prewitt operator and a = 2 for the Sobel operator. For edge enhancement, the original image is convolved with GX and GY separately to yield an image IX, in which horizontally oriented edges are enhanced, and an image IY, in which vertically oriented edges are enhanced. The final gradient image G(x,y) is computed through G(x,y) = sqrt(IX² + IY²). To compute the exact edge magnitude, a normalization factor for GX and GY of 6 for the Prewitt operator and 8 for the Sobel operator is needed, but in practical implementations, this factor is rarely applied.
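The edge-enhancement procedure just described can be sketched as follows; the convolution helper is a plain "valid" 3 x 3 convolution, the kernel orientation follows the convention in the text (GX enhances horizontally oriented edges), and the final division illustrates the normalization factor (here assumed to be 8 for Sobel and 6 for Prewitt):

```python
import numpy as np

def conv2d(img, kernel):
    """'Valid' convolution with a 3 x 3 kernel (kernel is flipped)."""
    k = np.flipud(np.fliplr(kernel))
    out = np.zeros((img.shape[0] - 2, img.shape[1] - 2))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + 3, j:j + 3] * k)
    return out

def gradient_magnitude(image, a=2):
    """Prewitt (a=1) or Sobel (a=2) gradient magnitude.

    GX enhances horizontally oriented edges (discontinuities in the
    vertical direction); GY is its transpose.
    """
    gx = np.array([[-1, -a, -1],
                   [ 0,  0,  0],
                   [ 1,  a,  1]], dtype=float)
    ix = conv2d(image.astype(float), gx)
    iy = conv2d(image.astype(float), gx.T)
    return np.sqrt(ix**2 + iy**2)

# A horizontal step of 6 intensity units: the raw Sobel response is 24;
# dividing by the factor 8 recovers the exact central-difference
# magnitude of 3 (6 units over a 2-pixel distance).
step = np.zeros((5, 5))
step[3:, :] = 6.0
print(gradient_magnitude(step) / 8)
```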
Each Sobel kernel can be interpreted as a rough approximation of the first derivative of a Gaussian function in one direction. GX is a derivative kernel that detects discontinuities in the vertical direction (i.e., horizontal edges), and GY is a derivative kernel that detects discontinuities in the horizontal direction. Analogous to the Gaussian kernel [Equations (2.11) and (2.12)], larger kernel sizes for edge detection are possible. For example, the partial derivative of the two-dimensional Gaussian function g(x,y) with respect to x can be used to obtain a 7 x 7 kernel that approximates the values of Equation (2.21) with σ = 2:
Such a kernel combines the derivative operation with a smoothing operation and therefore reduces noise in the final edge image. The corresponding partial derivative with respect to y (to detect horizontal edges) is obtained by rotating the matrix in Equation (2.22) by 90°. The combination of smoothing and derivative operations can again be explained by the linearity of the convolution operation. Analogous to Equation (2.13), the derivative operator can be moved in front of the convolution operation,
which indicates that the convolution of an image I with the partial first derivative of a Gaussian function is identical to computing the first derivative of the same image I convolved (smoothed) with the original Gaussian function. Extending this consideration to the second derivative leads to the Laplacian-of-Gaussian operator.
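This identity can be checked numerically in one dimension: smoothing a step signal with a sampled Gaussian and then taking discrete derivatives gives, up to discretization error, the same result as convolving once with the sampled derivative of the Gaussian. A small NumPy sketch (the value σ = 2 and the kernel support of ±10 samples are arbitrary choices):

```python
import numpy as np

x = np.arange(-10, 11)
sigma = 2.0
g = np.exp(-x**2 / (2 * sigma**2))
g /= g.sum()                        # sampled Gaussian, unit sum
dg = -x / sigma**2 * g              # sampled analytic derivative of g

signal = np.zeros(100)
signal[50:] = 1.0                   # a step edge

# Route 1: smooth first, then take discrete central differences.
route1 = np.gradient(np.convolve(signal, g, mode='same'))
# Route 2: convolve once with the derivative-of-Gaussian kernel.
route2 = np.convolve(signal, dg, mode='same')

# The two routes agree up to discretization error.
print(np.max(np.abs(route1[2:-2] - route2[2:-2])))
```

Both routes produce a single smooth response peak centered on the step, which is exactly the noise-suppressing edge response described in the text.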
The Kirsch operator14 and the related compass operator make use of the same principle as the Sobel operator, but instead of combining two perpendicular directions, the Kirsch and compass operators have eight kernels, K1 through K8, to define all eight possible edge directions in a 3 x 3 neighborhood. The compass operator uses the same kernel structure as the Sobel operator. The first four kernels are:
Each of the kernels has its elements rotated 45° from the preceding one, and the next four kernels have the opposite sign: K5 = -K1, K6 = -K2, K7 = -K3, and K8 = -K4. In a similar manner, the Kirsch operator14 makes use of a different definition of kernels K1 through K4:
The idea behind the Kirsch and compass operators is to avoid the computationally expensive square-root computation by accepting a larger number of convolutions, which is faster overall when integer arithmetic is used. For each pixel, eight convolution values are computed (only four by actual convolution and four by sign change), and the maximum of the eight values is taken as the new image value.
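A sketch of this scheme in NumPy, assuming Sobel-structured compass kernels; the ring-rotation helper and the use of correlation rather than flipped convolution are implementation choices (for this kernel family, a 180° rotation only changes the sign, so the maximum absolute response is unaffected):

```python
import numpy as np

# First compass kernel (Sobel structure); its seven companions are
# obtained by rotating the outer ring in 45-degree steps, with
# K5..K8 = -K1..-K4.
K1 = np.array([[-1, -2, -1],
               [ 0,  0,  0],
               [ 1,  2,  1]])

def rotate45(k):
    """Shift the eight outer elements of a 3x3 kernel one position
    clockwise (a 45-degree rotation of the kernel ring)."""
    ring = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    vals = [k[p] for p in ring]
    vals = vals[-1:] + vals[:-1]
    out = k.copy()
    for p, v in zip(ring, vals):
        out[p] = v
    return out

def compass(image):
    """Compass edge magnitude: the maximum over all eight kernel
    responses, computed as max |response| over K1..K4 because
    K5..K8 are sign changes of K1..K4."""
    kernels = [K1]
    for _ in range(3):
        kernels.append(rotate45(kernels[-1]))
    img = image.astype(float)
    out = np.zeros((img.shape[0] - 2, img.shape[1] - 2))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = img[i:i + 3, j:j + 3]
            out[i, j] = max(abs(np.sum(patch * k)) for k in kernels)
    return out

# A horizontal step edge of height 5: the horizontal kernel wins
# with |response| = 5 * (1 + 2 + 1) = 20.
step = np.zeros((5, 5))
step[3:, :] = 5.0
print(compass(step))
```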
A comprehensive approach to edge detection was developed by Canny,2 who combined Gaussian smoothing, finite-difference edge detection, and nonmaximum suppression into his edge detection algorithm. Nonmaximum suppression removes pixels from the set of edge points when they fall below a certain, selectable threshold value, when the gradient magnitudes in perpendicular directions are similar, or when a neighboring pixel has the same edge direction but a higher edge magnitude. As a consequence, the edges are thinned. Finally, Canny proposed the use of hysteresis thresholding, which makes use of local connectedness (see Section 2.4) and removes isolated sets of low and intermediate edge magnitude. This comprehensive approach is comparatively costly in terms of computational effort, and Canny suggested several efficiency improvements for the convolution, difference, and nonmaximum suppression stages.3 In addition, Deriche developed a recursive filter implementation6 that lends itself to fast integer-based implementations and to implementations in hardware filters or digital signal processors.
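The nonmaximum-suppression step can be sketched in isolation (thresholding and hysteresis are omitted here; the four-direction quantization of the gradient angle is a common simplification, not necessarily Canny's original formulation):

```python
import numpy as np

def nonmax_suppress(mag, ang):
    """Keep a pixel only if its gradient magnitude is at least as
    large as both of its neighbors along the gradient direction."""
    out = np.zeros_like(mag)
    # Quantize the gradient angle (radians) to one of four neighbor
    # axes: 0 deg -> left/right, 45 deg and 135 deg -> diagonals,
    # 90 deg -> up/down.
    q = np.round(ang / (np.pi / 4)).astype(int) % 4
    offsets = {0: (0, 1), 1: (-1, 1), 2: (-1, 0), 3: (-1, -1)}
    for i in range(1, mag.shape[0] - 1):
        for j in range(1, mag.shape[1] - 1):
            di, dj = offsets[q[i, j]]
            if (mag[i, j] >= mag[i + di, j + dj]
                    and mag[i, j] >= mag[i - di, j - dj]):
                out[i, j] = mag[i, j]
    return out

# A 3-pixel-wide magnitude ridge with a horizontal gradient collapses
# to a single-pixel line after suppression.
mag = np.tile([0.0, 1.0, 3.0, 1.0, 0.0], (5, 1))
ang = np.zeros_like(mag)          # gradient points along the rows
print(nonmax_suppress(mag, ang))
```

This illustrates the thinning effect described above: only the local ridge maximum survives, while its weaker flanking pixels are set to zero.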
Another comprehensive approach for edge detection was proposed by Frei and Chen.7 By convolving the image with nine different kernels that form an orthogonal system, each pixel is assigned a nine-dimensional feature vector. The nine convolution kernels are:
It can be seen that kernels F1 and F2 are typical edge detector kernels for horizontal and vertical edges. Kernels F3 and F4 are designed to amplify ripples, and together with F1 and F2 they span the edge space. Kernels F5 and F6 are sensitive to lines of single-pixel width and diagonal or horizontal/vertical orientation. Finally, kernels F7 and F8 are approximations of discrete Laplacian functions. The four-dimensional subspace spanned by convolution kernels F5 through F8 is called the line space. Let B = [b0, b1,..., b8] be the feature vector for one pixel of the image, where the components b0 through b8 have been generated by convolution of the image with kernels F0 through F8. The vector can be projected into edge space or line space:
where θE is the projection angle into edge space and θL is the projection angle into line space. A low projection angle θE indicates a high probability that the associated pixel is part of an edge. Conversely, a projection angle θE close to 90° indicates a low probability of the pixel being part of an edge. For lines, θL carries similar information. Projection onto additional subspaces is possible in a similar manner. For example, Frei and Chen demonstrated the effects of the projection of "ripple space" and "Laplacian space,"7 but these projections are used infrequently.
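The projection can be sketched as follows, assuming, as in the text, that the edge space is spanned by the components from F1 through F4 and the line space by those from F5 through F8; the feature-vector values are made up for illustration:

```python
import numpy as np

def projection_angle(b, subspace):
    """Angle (degrees) between feature vector b and its projection
    onto the subspace spanned by the given component indices."""
    b = np.asarray(b, dtype=float)
    sub = np.sqrt(np.sum(b[list(subspace)] ** 2))
    total = np.sqrt(np.sum(b ** 2))
    return np.degrees(np.arccos(sub / total))

edge_space = (1, 2, 3, 4)   # components b1..b4 (kernels F1..F4)
line_space = (5, 6, 7, 8)   # components b5..b8 (kernels F5..F8)

# A feature vector dominated by its edge components: the edge-space
# projection angle is small, the line-space angle is large.
b = [0.1, 5.0, 4.0, 0.5, 0.3, 0.1, 0.2, 0.1, 0.2]
print(projection_angle(b, edge_space), projection_angle(b, line_space))
```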
In addition to the edge magnitude, direction-dependent filter masks can provide the edge direction. In the case of the Kirsch and compass kernels, the kernel that produces the highest convolved pixel value determines the edge direction: If convolution with
FIGURE 2.5 Eight possible edge orientations that can be detected with the compass operator [Equation (2.24)]. White indicates a higher image value than black, and the edge direction is defined so that the higher value is to the right of the direction arrow.
K1 in Equation (2.24) produces the highest pixel value, the edge is horizontal; for K2 it is diagonal (southwest to northeast), for K3 it is vertical, and for K4 it is diagonal northwest to southeast. K5 to K8 act in a similar manner, but the side of the higher image value is mirrored. Figure 2.5 illustrates the eight possible directions that can be detected with the compass operator in Equation (2.24). The Sobel and Prewitt operators provide edge directions as well. In this case, the arctangent function can be used:
where atan2 is the four-quadrant arctangent function as implemented in many programming languages, and IX and IY are the intermediate convolution results of the image with GX and GY as defined in Equation (2.20). When using Equation (2.28), the edge direction is continuous, as opposed to the eight discrete directions obtained from the Kirsch and compass operators.
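A short illustration of the four-quadrant behavior, assuming the argument convention atan2(IY, IX) (the exact argument order in Equation (2.28) may differ):

```python
import math

# Edge direction from the intermediate convolution results IX and IY
# at a single pixel. atan2 resolves all four quadrants, which a plain
# arctan(IY/IX) cannot (it collapses opposite directions).
for ix, iy in [(8.0, 0.0), (0.0, 8.0), (-8.0, 0.0), (5.0, -5.0)]:
    theta = math.degrees(math.atan2(iy, ix))
    print(ix, iy, round(theta, 1))
```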
Images that underwent edge enhancement usually need additional processing steps to extract clean edges. Any edge enhancement operation is highly noise-sensitive, and image noise may produce many pixels of high magnitude that are not part of an actual edge. These processing steps may include histogram analysis and thresholding to retain only the most prominent edges, morphological operators (such as elimination of isolated pixels or medial-axis thinning), or feature detection such as straight-line fitting as proposed by Roberts21 or the Hough transform (topic 7).
In the context of image enhancement, the median filter needs to be introduced. The median filter is a powerful nonlinear noise-reduction filter for noise that is not Gaussian. Convolution with a Gaussian kernel (smoothing operation) is most suitable for reducing additive noise with a Gaussian distribution and, more importantly, zero mean value. Other types of noise, particularly salt-and-pepper noise (also called shot noise, i.e., each pixel has a certain probability of being either black or white), cannot be filtered well with Gaussian filters because the extreme values of salt-and-pepper noise get "smeared" over the neighborhood. A median filter is ideally suited to removing such runaway values. A median filter acts on an n x n neighborhood but is not convolution-based. Rather, all image values in the neighborhood are sorted in ascending order, and the central pixel is replaced by the median value of the neighborhood pixels. Intuitively, the median filter replaces the central pixel by a value that is more typical of the neighborhood (namely, the median value) and thus eliminates pixels with runaway intensities. Since a median filter has a very high probability of changing the value of the central pixel, attenuated versions exist in which the value of the center pixel is repeated k times in the sorted list for median determination.15 Such a center-weighted median filter has a higher probability of leaving the pixel value unchanged. A more in-depth analysis of the median filter, and of variations that adapt to local image properties to better preserve the image values, is given in Section 5.1. The effect of the conventional median filter and the center-weighted median filter on salt-and-pepper noise is compared with Gaussian blurring in Figure 2.6.
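Both the plain and the center-weighted variants can be sketched in a few lines of NumPy; the edge-replicating border handling and the function name are implementation choices:

```python
import numpy as np

def median_filter(image, n=3, center_weight=1):
    """Median filter on an n x n neighborhood.

    With center_weight k > 1, the central pixel is repeated k times
    in the sorted list, which lowers the chance that its value
    changes (a center-weighted median filter).
    """
    r = n // 2
    padded = np.pad(image, r, mode='edge')   # replicate border pixels
    out = np.empty_like(image)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            window = padded[i:i + n, j:j + n].ravel().tolist()
            window += [image[i, j]] * (center_weight - 1)
            out[i, j] = np.median(window)
    return out

# Salt noise: a single white outlier in a flat region is removed by
# the plain median but preserved by a heavily center-weighted median.
img = np.full((5, 5), 10.0)
img[2, 2] = 255.0
print(median_filter(img)[2, 2])                  # outlier removed
print(median_filter(img, center_weight=9)[2, 2]) # outlier preserved
```

This mirrors the trade-off shown in Figure 2.6: a larger center weight preserves fine texture at the cost of weaker noise removal.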
The effects of various convolution-based filters are shown in Figures 2.7 and 2.8. In Figure 2.7, various lowpass and highpass operations are demonstrated, and in Figure 2.8, edge detection by the Sobel and compass operators with color-coded edge direction, edge detection with the LoG operator, and edge and line detection with the Frei-Chen operator are shown.
FIGURE 2.6 Effect of median filtering on salt-and-pepper noise. (A) An image corrupted with 5% salt-and-pepper noise, where each pixel has a 2.5% probability of containing either a black or a white noise pixel. Gaussian blurring (B) is unsuitable for noise reduction because the extreme values are merely blurred over the neighborhood. The median filter almost completely eliminates the salt-and-pepper noise (C), but it also attenuates texture in homogeneous regions (compare to Figure 2.7A). A center-weighted median filter is less effective in removing a strong noise component but has better texture-preserving properties.
FIGURE 2.7 Demonstration of some convolution filters. The original image A is a CT slice of the chest and is 500 x 350 pixels in size. Image B was convolved with a Gaussian kernel with σ = 2.1 in a 13 x 13 neighborhood. Images C and D show the effect of sharpening: In image C the weaker sharpening mask of Equation (2.17) was used, and in image D, Equation (2.16) was applied. Image E is the result of a convolution with the Laplacian mask in Equation (2.15). Finally, image F demonstrates the effect of the DoG operator, where the original image, convolved with a Gaussian kernel with σ = 4, was subtracted from the original image convolved with a Gaussian kernel with σ = 1.6. The result closely resembles the LoG operator (Figure 2.8C).