Image Enhancement Via Spatial Filtering (Image Processing) Part 2

Fast Convolution in the Frequency Domain

Two-dimensional convolution is an expensive operation, and some image processing techniques call for repeatedly passing an image through a filter, as in the unsharp masking example of Figure 4-6. Even with efficient code running on architectures explicitly designed for fast digital filtering, such as DSPs, the price of operating purely in the time domain eventually becomes prohibitive as the image and kernel sizes increase.

Figure 4-6. Unsharp mask on moon image (courtesy of the National Space Science Data Center). (a) Original image. (b) Processed image, using a 20×20 Gaussian low-pass filter (σ = 1), with α = 1.5 and β = 0.5.

At this point, it is advantageous to migrate the process over to the "frequency domain," which necessitates incorporating the Fourier Transform and its digital equivalent, the Discrete Fourier Transform (DFT). The DFT is calculated highly efficiently not through direct evaluation of its mathematical formulation, but rather through a well-known algorithm called the Fast Fourier Transform (FFT). The details behind the Fourier Transform, the DFT, and how the FFT performs its magic are beyond the scope of this topic. For our purposes, it is sufficient to say that 2D convolution can be performed in the frequency domain by exploiting one of the central tenets of signal processing theory, the convolution theorem, which states that convolution in the time domain is equivalent to multiplication in the frequency domain.


Figure 4-7 shows how to perform 2D convolution in the frequency domain. Both the image and kernel are transformed using the 2D FFT, which in general produces two complex-valued matrices containing the discrete Fourier coefficients of the image and the filter kernel. Because the kernel and image matrices are usually not of the same size, prior to applying the 2D FFT both are expanded through an operation known as zero-padding, where zeros are appended to each matrix (see Figure 4-7b). The coefficients are multiplied together, and then the inverse transform is performed, sending the result back to the time domain. The result of the inverse transform is the convolved image, or equivalently, the result of passing the original image through the filter defined by the kernel.

Figure 4-7. 2D linear convolution in the frequency domain. (a) I is an image matrix of size M×M, H is the N×N filter kernel matrix. (b) I and H are zero-padded such that they are of size (M+N−1)×(M+N−1), and the 2D FFT is performed on both augmented matrices. (c) FFT matrices are multiplied together in an element-wise fashion (not matrix multiplication). (d) The result of the multiplication is transformed back to the time domain via the 2D inverse FFT.

It should be emphasized that 2D convolution in the frequency domain is not necessarily faster than direct application of the 2D convolution equation in the time domain. Typically, performing the operation in the manner shown in Figure 4-7 is faster when large kernels are used with large images, the exact sizes depending on the speed of the FFT routine and the particular architecture on which the code is running. Eventually, the cost of transforming two matrices to the frequency domain via the 2D FFT and one matrix back to the time domain via the 2D inverse FFT is outweighed by the fact that in between these two transforms only an element-wise multiplication occurs, as opposed to sliding a mask across an image and computing a sum of products at each step. However, this means of performing filtering in the frequency domain works only for linear filters. Non-linear filters are precluded from taking advantage of this scheme, as the convolution theorem does not apply to them. A general rule of thumb is that linear convolution in the frequency domain should be considered if:

N1 N2 > log2(M1 M2)

where the filter kernel is of size N1×N2 and the image is of size M1×M2 (the exact crossover point depends on the FFT implementation and the target architecture).

Implementation Issues

Algorithm 4-1 is the pseudo-code for linear filtering of images in the time domain, that is, via direct application of the 2D convolution equation.

Algorithm 4-1: 2D Convolution

INPUT: M×N image I, NH×NH kernel H, where NH is odd
OUTPUT: filtered image J

h = (NH − 1)/2	(half-width of the kernel)
set every pixel of J to 0	(edge pixels remain zero)
for r = h … M−h−1
	for c = h … N−h−1
		sum = 0
		for i = −h … h
			for j = −h … h
				sum = sum + H[i+h][j+h] I[r−i][c−j]
		J[r][c] = sum

Algorithm 4-1 brings up a practical consideration that any image filtering implementation must take into account, namely what to do about edge pixels. Figure 4-8 illustrates the situation for a 5×5 kernel situated at the top-left corner of an image. The neighborhood for this pixel, and for all others centered on a pixel in the first or last two rows or columns, has to be handled differently than the interior region of the image.

There are a few different ways to handle the edge pixels. One is to pad the image on all sides with zeros; another is to extend the image by replicating the first and last rows or columns as many times as needed (as dictated by the half-width of the kernel). Both of these solutions have drawbacks, especially the first, as the overall intensity levels in the outer regions of the processed image will be lower than in the interior.

Figure 4-8. Edge-effect issues with 2D convolution. (a) With the neighborhood centered on the upper-left pixel, the mask extends past the boundaries of the image by the half-width of the kernel (in this case 2). (b) By the time the neighborhood has shifted down to the 3rd row and 3rd column, the entire mask tiles completely over a valid portion of the image.

Most of the time the edge pixels are simply ignored and set to zero, and this is the solution specified in Algorithm 4-1 as well as in the C code that follows in this topic. When performing 2D convolution in the frequency domain, edge artifacts will be present because the FFT assumes a periodic signal: in essence, the image is extended so that it wraps around at all four of its edges (i.e., the right-most column is contiguous with the left-most column, the bottom-most row is contiguous with the top-most row, and so on).
