Interpolation Methods For Image Registration (Biomedical Image Analysis)

The transformation matrix M used to register two images generally requires access to image elements at noninteger coordinates. To transform an image, it is convenient to invert the translation matrix:

tmp20320_thumb

The reason to invert the matrix M is the more straightforward implementation of the transformation. In a triple loop over all integer coordinates x, y, and z of the transformed image (i.e., each point P/), the coordinate P; in the nontransformed image is computed through Equation (11.45) with M.-1 In the general case, the resulting point Pi contains noninteger elements.

Determination of the quality of registration by gray-level correlation metrics. Shown are the correlation (solid line) and the inertia (dashed line) of the co-occurrence matrix of the MR and CT images in Figure 11.6. The CT image was scaled up by 33% beforehand to match the size of the MR image. All three rigid-body transformations [horizontal translation (A), vertical translation (B), and rotation (C)] show a clear optimum when the images are in registration. Correlation shows a global maximum, and inertia shows a corresponding global minimum. Multidimensional search strategies can be employed to find the optimum points automatically.


FIGURE 11.8 Determination of the quality of registration by gray-level correlation metrics. Shown are the correlation (solid line) and the inertia (dashed line) of the co-occurrence matrix of the MR and CT images in Figure 11.6. The CT image was scaled up by 33% beforehand to match the size of the MR image. All three rigid-body transformations [horizontal translation (A), vertical translation (B), and rotation (C)] show a clear optimum when the images are in registration. Correlation shows a global maximum, and inertia shows a corresponding global minimum. Multidimensional search strategies can be employed to find the optimum points automatically.

An image can be assumed to be composed of pixel-sized tiles or cubes of constant gray value. Under this assumption, the image value at any point in space is known: namely, the image value of its nearest integer neighbor, a scheme called nearest-neighbor interpolation. Although it is computationally efficient, nearest-neighbor interpolation results are prone to noise and blocking artifacts. Particularly with repeated interpolation, the nearest-neighbor principle shows poor results. For purposes of interpolation, the image values are therefore assumed to be known only on integer coordinates, and some interpolation scheme provides estimates of the image value on noninteger coordinates. Linear interpolation provides a smooth transition of the gray value from one vertex to the next. Linear interpolation is arguably the most frequently used interpolation scheme in pixel-based image editing programs and many image processing software packages. Linear interpolation assumes that the image value changes linearly from one pixel to the next. In one dimension, the interpolated function value fintp (t) at a noninteger position of t can be obtained by

tmp20322_thumb

where L-J indicates the integer operation, and a = t — Lt J is the decimal part of t. In two dimensions, four neighbors are needed, and with the definition of a = x — Lx J and b = y — LyJ, the interpolation scheme to obtain the interpolated image value ^ntp (x,y) on noninteger coordinates x and y is a two-step process called bilinear interpolation:

tmp20323_thumb

In three dimensions, the process is identical, except that eight neighboring image values on integer coordinates are used (vertices of a cube around the noninteger point), and the interpolation uses four interpolations in x, two interpolations in y, and one interpolation in z (trilinear interpolation).* Whereas linear interpolation is computationally very efficient, the image value planes that are spanned between adjoining integer pixels are not differentiable over adjoining squares. Furthermore, linear interpolation acts like a lowpass filter.31 More sophisticated interpolation methods are recommended.

Cubic interpolation provides a differentiable hypersurface over adjoining squares or cubes. For cubic interpolation, a third-order polynomial is fitted into four consecutive data points in one dimension. In analogy to the weight factor a in Equation (11.46), four weight factors a0, a1, a2, and a3 are needed to obtain the interpolated value from the four closest neighbors. The weight factors in one dimension or in the where £ is the fractional part of the noninteger coordinate x. A common choice of the parameter a is a = 0.5. In one dimension (the x dimension), the interpolated value Ix,k is x-direction can be computed through

tmp20324_thumb

tmp20325_thumb

In one dimension, k is irrelevant. In two dimensions, Equation (11.49) needs to be applied four times for 1 < k < 4 and yi = |_yj — 1 to y4 = LyJ + 2. Next, a second set of weight factors a1 through a4 is computed through Equation (11.48) with £ = y — LyJ and the final interpolated value is obtained with

tmp20326_thumb

In three dimensions, the interpolated point is in the center region of a cube with 4 x 4 x 4 vertices. The first step involves the computation of 16 vertices interpolated in the x-direction, from which four vertices are interpolated in the y-direction, and the final point determined by interpolation in the z-direction. It can be seen that the computational effort for cubic interpolation is markedly higher than for linear interpolation.

Following Shannon’s sampling theory, a signal f(t) can be reconstructed from sampled values fk with a sinc interpolation function:

tmp20327_thumb

where sinc(t) = (sin t)/1. The sinc function has several desirable properties. It is unity at the origin and vanishes for all integer arguments. Extending Equation (11.51) to three dimensions, we obtain

tmp20328_thumb

where x, y, and z may be noninteger coordinates and i, j, and k are integer numbers that define the sampling coordinates. One difficulty associated with the sinc interpolation function is the infinite support. Theoretically, the summation takes place over an infinite volume, although the image occupies only a finite volume. In practice, a reasonable cutoff is chosen. However, this cutoff can still extend beyond the image boundaries, in which case suitable boundary conditions need to be applied, such as the mirror boundary condition, where, for example, I(x,y,z) = I(—x,y,z) when x < 0. Although sinc-based interpolation methods have been proposed for signal and image interpolation, particularly in combination with the Fourier transform,46 a sinc interpolator has a tendency to produce ringing, that is, high-frequency local oscillations.

It can be seen from Equation (11.51) that interpolation is actually a convolution operation. In fact, the interpolation of a discrete set fk with an interpolation function 9(t) can be expressed as a convolution operation defined by

tmp20329_thumb

If the interpolation function has finite support, the summation reduces to the range of the support of 9, thus reducing computational complexity. Notably, even the nearest-neighbor, linear, and cubic interpolation schemes can be expressed in terms of Equation (11.53) when the following interpolation functions are used:

tmp20330_thumb

In the cubic interpolation function, the parameter a < 0 determines the amount of negative "overshoot" near ±1 and can be used to approximate the sinc function. By inserting 9cub (t) from Equation (11.54) into Equation (11.53), the cubic interpolation in Equation (11.48) emerges. Whereas a = —0.5 is most commonly used, a cubic interpolation function with a = —0.75 closely approximates a windowed sinc function with the Hamming window and K = 5:

tmp20331_thumb

where a = 0.54 defines the Hamming window and K is the support. The windowed sinc function has been proposed as an alternative to Equation (11.51) to reduce ringing artifacts. The interpolation functions in Equations (11.54) and (11.55) are illustrated in

Some interpolation functions ^(t) as presented in Equations (11.54) and (11.55): (A) the nearest-neighbor function (gray) and the linear interpolation function (black); (B) the cubic interpolation function for the commonly used cases a = —0.5 (black) and a = —0.75 (gray); and (C) the sinc function (gray) together with the windowed sinc function [Equation (11.55)] for K = 5 (black). The sinc function is the only function with infinite support.

FIGURE 11.9 Some interpolation functions ^(t) as presented in Equations (11.54) and (11.55): (A) the nearest-neighbor function (gray) and the linear interpolation function (black); (B) the cubic interpolation function for the commonly used cases a = —0.5 (black) and a = —0.75 (gray); and (C) the sinc function (gray) together with the windowed sinc function [Equation (11.55)] for K = 5 (black). The sinc function is the only function with infinite support.

Figure 11.9. From the figure it can be seen that the lower-order interpolation functions have undesirable frequency responses. The nearest-neighbor function has a sinc-type Fourier transform (conversely, the sinc interpolation function has an ideal boxlike frequency response). A very detailed analysis of a number of interpolation functions was performed by Thevenaz et al.42 and the authors provided an efficient implementation in C for download on the Web at http://bigwww.epfl.ch/algorithms.html.

An alternative way to implement some transformations (i.e., the translation and rotation transformations)16 is to use the properties of the Fourier transform (topic 3). The three-dimensional discrete Fourier transform F of an image I(x,y,z) and its inverse transform F—1 are defined by

tmp20333_thumb

Although Equation (11.56) describes the discrete Fourier transform, all considerations in this section also apply to the fast Fourier transform. For this reason, implementations of rigid-body transforms in Fourier space are particularly efficient. A transformation of the type P’ = MP applied to all XYZ pixels of an image (thus the transformation of the original image I into an image I’) can be formulated as a filter F in the Fourier domain:

tmp20334_thumb

The filter function F for a translation bytmp20335_thumbcan be determined from Equation (11.56) by substituting the translated coordinates. This is derived as follows (note that the normalization 1/XYZ was omitted to simplify the equation):

tmp20336_thumb

The resulting filter function is a frequency-dependent phase shift. A translation can therefore be implemented by a Fourier transform of the original image I, followed by a multiplication of each voxel at frequency (u,v,w) in Fourier space by the complex filter function

tmp20337_thumb

followed by an inverse Fourier transform. Even with noninteger translation vectors, no interpolation is needed. The Fourier transform of an image rotated by a rotation matrix R is identical to the Fourier transform of the original image, rotated by R in the frequency domain. For this reason, implementation of a rotation filter in the Fourier domain is identical to the implementation in the spatial domain. However, the rotation operation requires interpolation in frequency space. To avoid this interpolation, the three-dimensional rotation matrix R can be represented by the product of four shear matrices13 as introduced in

tmp20338_thumb

Since a shear operation adds a nonconstant term to the exponential expression in Equation (11.56), a straightforward filter function as in Equation (11.38) is not feasible. Instead, the shear operation corresponds to one-dimensional coordinate shifts. Each one-dimensional row can be processed independently from all others,and the shear operation can be implemented as a sequence of one-dimensional Fourier transforms in all three spatial directions, followed by application of the coordinate-dependent shift filter for each transformed line. A similarly elegant solution for the scaling operation does not exist. The scaling theorem stipulates that a dilation by a factor a corresponds to a contraction along the frequency axis by a factor of a. However, the scaling theorem is valid only for the continuous Fourier transform.

Comparison of the performance of various interpolation schemes. The original image (A) is a variation of the Shepp-Logan head phantom with enhanced contrast and some Gaussian noise added to a small region. Underneath is the result of repeated transformations with nearest-neighbor interpolation. Images B and C are the result of the same transformation sequence with linear and cubic interpolation, respectively. Underneath are the differences to the original image A. Image D is the result of the transformations with the interpolation proposed by Thevenaz et al.42 and a fifth-order spline. The difference image illustrates that this interpolation scheme causes the smallest deviations from the original image.

FIGURE 11.10 Comparison of the performance of various interpolation schemes. The original image (A) is a variation of the Shepp-Logan head phantom with enhanced contrast and some Gaussian noise added to a small region. Underneath is the result of repeated transformations with nearest-neighbor interpolation. Images B and C are the result of the same transformation sequence with linear and cubic interpolation, respectively. Underneath are the differences to the original image A. Image D is the result of the transformations with the interpolation proposed by Thevenaz et al.42 and a fifth-order spline. The difference image illustrates that this interpolation scheme causes the smallest deviations from the original image.

The importance of selecting a suitable interpolation method is demonstrated in Figure 11.10. In this example, a variation of the Shepp-Logan head phantom with enhanced contrast and added Gaussian noise in the right "ventricle" was subjected to multiple transformations: scaling by 0.83 followed by scaling by 1.7, followed by 15 rotations by 24°, followed by rescaling to its original size. Figure 11.10A shows the original phantom before interpolation, and the result of the transformations using nearest-neighbor interpolation underneath. Figure 11.10B and C show the result of the transformations with the linear and cubic interpolation, respectively, and the difference to the original image is shown underneath. Figure 11.10D was computed by using the interpolation scheme proposed by Thevenaz et al.42 with a fifth-order spline. Nearest-neighbor interpolation leads to a jagged appearance, while both linear and cubic interpolation schemes cause a significant amount of blurring. The least loss of detail can be seen in the spline interpolation scheme in Figure 11.10D. However, some ringing and overshoot beyond the original image value range was observed. Relative to the linear interpolation scheme, nearest-neighbor interpolation is 45% faster, cubic interpolation takes 44% more time, and spline interpolation takes 144% more time. The root-mean-squared coefficient of variation (root-mean-squared distance normalized by intensity average) between the transformed image and the original image is 15.7% for the nearest-neighbor interpolation, 10.2% for linear interpolation, 6.6% for cubic interpolation, and 5.1% for spline interpolation. Cubic interpolation is often considered an acceptable balance between computational effort and interpolation quality, but with more powerful computers, a shift toward more sophisticated interpolation schemes can be expected.

Next post:

Previous post: