Digital Imaging Enhancement

Introduction

In the overview article on photography and digital imaging, Lena Klasen stressed the important role that images and imagery have played in forensic science. In recent years the importance of imagery has been underlined by the rapid increase in the number of closed-circuit television security and surveillance systems deployed, not only in high-risk installations such as banks, retail outlets and key government establishments, but also in city and town centers. In whatever field of endeavor, the key to the successful use of imagery is the ability of trained imagery analysts to extract useful data from it. The initial extraction of data is, of course, through the eyes, often aided by enhancement devices such as spectacles and optical lenses. However, it is the human brain that provides the object recognition and the analysis of what part the object plays in the scene. The analyzed data are then compiled with other data to provide useful ‘imagery intelligence’.
The brain’s capability to interpret images can be enhanced by training to provide a better understanding of the way in which energy is reflected from an object, recorded by the imaging system and subsequently released as a picture for the eye to observe. This process is more complicated than it at first seems, and without imagery analysis training observers often come to the wrong conclusions. However, even experienced imagery analysts sometimes need electronic help. There are two main factors involved here. First, our eyes (remarkable as they are) are far less sensitive than the optical and electronic devices that record and replay the imagery; second, a large part of the imagery available to forensic imagery analysts is of relatively poor quality. The fact is that there is information in the imagery that the eyes cannot ‘see’, and it is thus necessary to enhance the picture to the point where these data come within the accommodation of the human eye. Because this topic covers forensic work, this article first briefly covers some of the legal constraints surrounding digital imagery in general and digital enhancement in particular.


Imagery Enhancement: Legal Constraints

Digital and optical enhancement

Although initially seen as a dubious new science, photography has been accepted in the courts for many years. Photography is understood by most people in the court, and a photograph or negative is easily seen to be first evidence. Video too has gained rapid acceptance and is, wrongly, seen simply as moving photographs, albeit of poorer resolution. However, digital imagery, and in particular digital enhancement, is viewed with suspicion by many courts of law and has not yet been fully accepted. The effects of both optical and digital enhancement, frequently demonstrated in the cinema and on television, are impressive, convincing and exciting, but they are effects that are often outright falsehoods. Thus it is essential that automatic safeguards are in place before enhancements are accepted for forensic purposes.
Nothing must be extracted or, more importantly, introduced into the original imagery record, be it optical or digital. The original is the first evidence and must remain intact and unsullied. Any subsequent enhancements can be compared with this original to test their authenticity as evidence and their probative value. Moreover, all specific enhancements produced for the courts must be processed on equipment that has an audit trail, so that the results can be replicated by other forensic imagery analysts.
Many of the enhancements that can be produced by digital means could be done using analog optical equipment but optical methods are both time consuming and costly. Digital enhancement therefore predominates, and will continue to do so.

Data compression

There is a vast amount of data in a single picture. Each picture element within a digital image is known as a pixel, and even in a simple scene there can be many millions of pixels. The exact number will depend upon the resolution of the recording apparatus and the quality of the electronic peripherals that support it. Because of this vast amount of data, there is a need to compress the data at many stages of the process, from the scene being imaged and recorded through to the final picture to be analyzed.
Data compression is described in detail elsewhere. Among other things, its use has the enhancement advantage of reducing unwanted noise. There are many data compression systems and algorithms, and hybrid versions appear almost daily, but the fundamental principle that must be observed in forensic work is that the system used must not be predictive. To provide industry standards, a group of photographic experts set up a committee to lay down a basic standard and to judge the merits of various algorithms. This compression standard was called JPEG (Joint Photographic Experts Group image compression format). The algorithms for JPEG are effectively ‘lossless’ because all of the data in each field are sampled. Thus the system is nonpredictive.
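The lossless/predictive distinction can be illustrated with a toy scheme. Run-length encoding, sketched below in Python (purely illustrative; it is not part of the JPEG standard), is lossless and nonpredictive: every pixel value is stored, nothing is guessed, and decoding reconstructs the input exactly.

```python
def rle_encode(pixels):
    """Lossless run-length encoding: store each run as [value, count].
    Nothing is predicted, and no information is discarded."""
    runs = []
    for p in pixels:
        if runs and runs[-1][0] == p:
            runs[-1][1] += 1
        else:
            runs.append([p, 1])
    return runs

def rle_decode(runs):
    """Reconstruct the original pixel row exactly."""
    return [value for value, count in runs for _ in range(count)]
```

A predictive coder, by contrast, would transmit only some samples and guess the rest — the behavior that disqualifies it for forensic use.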
The JPEG system, with its large data sampling rate, proved to be cumbersome and unwieldy for moving video pictures. Thus a system called MPEG (Moving Picture Experts Group) was devised. MPEG samples the data in each incoming picture less frequently and recreates intermediate frames (pictures) by a prediction process. Basic MPEG is thus not suitable for forensic use. However, subsequent developments (called variously MPEG-2 and M-JPEG) have built-in safeguards and many are now suitable for forensic work. The key is that they must not be predictive in their nature. In simple (and possibly oversimple) terms, the system must not guess (predict) and then insert the predicted data into the reconstructed frame.
Even though a compression system is nonpredictive it does not make it automatically suitable for forensic work. The effectiveness, in terms of the ability to compress and subsequently reproduce a picture that is as near to the original as possible, is inter alia dependent upon additional factors. The refresh rate of ‘I’ frame data from an original uncompressed frame must in any event be high, in order to prevent data drift and more and more errors occurring.
There are other elements involved in compression, but these can mostly be considered subsets of the base algorithms, designed to increase their efficiency. Wavelets are but one example; there are many others. Wavelet theory was devised before the complexities of digital Fourier transforms were fully developed, and wavelets have been a successful part of compression for many years; but it is important to stress again that these and any other subsets are only suitable for forensic imagery work if the overall system is fundamentally nonpredictive.

Authenticity of digital images

Superimposed upon any decompression or recording system is the need to guarantee the authenticity of the imagery. Strict rules are laid down in most justice systems on the ownership and handling of imagery data. The history of any digital imagery evidence must be available to the court and authenticated by each person in the procedural chain.
Digital imagery, unlike analog imagery, does not have a clear and obvious original. With a conventional film camera the negative is the original, and the untouched negative would thus be the first evidence in court. However, with imagery that is recorded digitally, the stream of data is passed to a processor, and it is a computer chip that initially recovers the data and stores them to a hard disk. In essence, therefore, it is the hard disk that is the original. Nevertheless, a great advantage of digital storage and enhancement is that all copies are replicas and are thus indistinguishable from the original. Analog systems, on the other hand, produce copies with consequent data loss, albeit normally quite low and not noticeable to the naked eye.
In regard to digital systems, the alternative to handing the court a negative is to have special authentication for the original. Two examples of how this can be done are watermarking and encryption.
Watermarking An electronic watermark can be imposed within the frame data. Thus picture ownership can be authenticated. The main problem is that, since the watermark will always persist even after enhancement and other manipulation, it would be a target for ‘hackers’.
Encryption Modern digital systems have sophisticated encoding that makes interception of the signal difficult and effectively prevents anyone from reproducing an unscrambled, and thus meaningful, picture. Without this base picture, anyone with ‘Machiavellian’ intentions cannot manipulate the imagery; tampering with the picture is thus impossible without access to the encryption keys. It is, however, important to ensure that encryption is integrated early enough in the process to preclude signal alteration before the initial recording is completed.
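One practical safeguard of this kind is to log a cryptographic digest of the original data at the moment of capture; any later alteration, however small, changes the digest and is therefore detectable. A minimal sketch using Python's standard hashlib (illustrative only; the four-byte ‘recording’ is hypothetical, and operational systems would use certified signing hardware and key management):

```python
import hashlib

def digest(image_bytes):
    """SHA-256 fingerprint of the raw image data: altering even a
    single bit yields a completely different digest."""
    return hashlib.sha256(image_bytes).hexdigest()

# Hypothetical four-pixel 'recording' and a copy with one value changed.
original = bytes([12, 200, 34, 56])
recorded = digest(original)          # logged at the time of capture
tampered = bytes([12, 200, 34, 57])
```

Comparing the digest of any later copy with the logged value shows whether it is a true replica of the original.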

Imagery Enhancement Techniques

As established above, enhancement is necessary to reveal to the imagery analyst data that would, in their raw recorded state, be invisible or imperceptible to the human eye. The results of any enhancement are dependent upon the quality of the original imagery. The nature and quality of the image will largely dictate the type of enhancement tools used. A typical enhancement menu on a modern workstation will have facilities to enlarge, sharpen, contrast-stretch, blur, smooth and detect edges. It is not possible to cover all of the filters in the space available: a representative few are discussed below.

Enlargement

There are several methods of digital enlargement. Four examples are pixel expansion, near or natural neighbor, bilinear interpolation and cubic convolution.

Pixel enlargement

This is the simplest method of enlargement. The area of the picture accommodated by each single pixel is made larger. The problem with pixel enlargement is that the picture becomes a series of squares, which disguises the true nature of the scene after relatively few pixel expansions. When dealing with images with a great density of pixels giving very high resolution, the pixels can be expanded further, but even then the breakthrough of squares limits the degree of enlargement. An example of pixel expansion is shown in Figs 1 and 2.
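Pixel expansion itself is trivial to express in code. The sketch below (illustrative Python, treating a grayscale image as a list of rows of pixel values) simply replicates each pixel, which is exactly what produces the visible squares:

```python
def pixel_expand(image, factor):
    """Enlarge a grayscale image by replicating each pixel
    factor x factor times. The blocks of identical values this
    produces appear as the 'pixel breakthrough' squares."""
    out = []
    for row in image:
        big_row = []
        for px in row:
            big_row.extend([px] * factor)   # widen the row
        for _ in range(factor):             # then repeat it vertically
            out.append(list(big_row))
    return out
```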
Figure 1 is an aerial photograph of a collection of farm buildings. The area is typical of a rural scene of crime. At this scale it is difficult for the eye to see individual buildings. Optical enlargers could be used but, as the picture has been scanned, digital enlargement is also an option. Figure 2 is an example of pixel expansion. The buildings are larger but the interpretability of the picture is marred by the presence of the pixel squares. This is called pixel breakthrough. The greater the expansion, the more distorted the features in the picture become. This fact is used in some television broadcasts when the face of an individual must not be recognized for legal reasons and is deliberately subjected to pixel breakthrough. It is quite an effective method of concealment but not very safe, because an enterprising technician could manipulate the pixels and recover the facial details. Thus, more sophisticated methods of blanking facial data are generally used.
Figure 1 Aerial photograph of a collection of farm buildings.

Near neighbor or natural neighbor

Some but not all of the pixel distortion can be lessened by using near neighbor interpolation. This is sometimes called natural neighbor. The interpolation uses the pixel closest to the pixel location of interest. This smoothes the most dramatic impact of pixel expansion, and in a sense averages out the distortion. However, even using near neighbor algorithms, pixel breakthrough distortion is still a problem when the expansion factor is large.
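A sketch of near neighbor enlargement in the same illustrative style: each output pixel is copied from the nearest source pixel, so no new values are invented, but blocks of identical pixels still appear when the expansion factor is large.

```python
def nearest_neighbor_resize(image, new_h, new_w):
    """Resample by copying, for each output pixel, the nearest
    source pixel (integer coordinate mapping)."""
    h, w = len(image), len(image[0])
    return [[image[min(h - 1, y * h // new_h)][min(w - 1, x * w // new_w)]
             for x in range(new_w)]
            for y in range(new_h)]
```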
Figure 2 Pixel expansion of farm buildings shown in Fig. 1.

Bilinear interpolation

Bilinear interpolation algorithms are based on a weighted average of the four pixels surrounding the pixel location of interest. With this system of enlargement the pixel breakthrough is all but completely eliminated.
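The four-pixel weighted average can be sketched as follows (illustrative Python; `y` and `x` are fractional coordinates in the source image, and the weights are each pixel's proximity to the sample point):

```python
def bilinear_sample(image, y, x):
    """Weighted average of the four pixels surrounding the
    fractional location (y, x)."""
    h, w = len(image), len(image[0])
    y0, x0 = int(y), int(x)
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    fy, fx = y - y0, x - x0          # fractional offsets
    top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
    bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
    return top * (1 - fy) + bottom * fy
```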

Cubic convolution

Cubic convolution is a more complex interpolation algorithm than bilinear interpolation. It too is based on a weighted average of the pixels surrounding the pixel location of interest, typically over a 4 x 4 neighborhood. With cubic convolution, pixel breakthrough is eliminated and the system maintains the integrity of the object with respect to its background, irrespective of the degree of enlargement. The limitations are therefore the same as those of optical enlargement lens systems, i.e. the degree of enlargement is limited only by the resolution of the picture.
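The article does not specify which cubic kernel is used; a widely used formulation is the Keys kernel with parameter a = -0.5, sketched below. Each interpolated value is a weighted sum of the surrounding pixels, with weights drawn from this piecewise cubic function of the distance to the sample point:

```python
def cubic_kernel(x, a=-0.5):
    """Keys cubic convolution weight for a sample at distance x
    from the pixel of interest (a = -0.5 is the common choice).
    Weight is 1 at distance 0 and falls to 0 beyond distance 2."""
    x = abs(x)
    if x <= 1:
        return (a + 2) * x ** 3 - (a + 3) * x ** 2 + 1
    if x < 2:
        return a * x ** 3 - 5 * a * x ** 2 + 8 * a * x - 4 * a
    return 0.0
```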
Figure 3 is an enlargement of the same farm buildings seen in Figs 1 and 2. This time the enlargement has been achieved using cubic convolution and there is a clear absence of pixel breakthrough.
Figure 4 is a scanned oblique photograph typical of the sort of picture taken from a police helicopter. It shows the same farm. Figure 5 is an enlargement of the farm tractor and trailer located in the center of Fig. 4. Cubic convolution has also been used for this enlargement and once again there is a clear absence of pixel breakthrough.
Figure 6 is yet further enlargement of the tractor tire area and it still shows no pixel breakthrough, although the resolution limits of the picture become clear.
Figure 3 Enlargement of farm buildings (Fig. 1) using cubic convolution.
Figure 4 Scanned oblique photograph of farm buildings shown in Fig. 1.

Sharpening

There are several filters that have the role of sharpening the image. These include edge detection and edge sharpening.

Edge detection

There are numerous filters that enable edge detection, which is essential for edge sharpening and other enhancement processes. Among the main filters is the series of Laplacian edge detectors. These filters produce sharp edge definition and can be used to enhance edges with both positive and negative brightness slopes. All types of Laplacian filter work on a weighting system for the values of the pixels surrounding the pixel of interest. In a 3 x 3 filter the system examines the eight pixels around the central pixel. The weights are chosen so that they sum to zero, which means a uniform region produces no response. On or near the edges of the image, the values of edge pixels are replicated to provide enough data. Another edge detector is the Sobel filter. This too is an omnidirectional spatial edge enhancement filter and also uses a 3 x 3 matrix to calculate the ‘Sobel gradient’.
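The weighting scheme can be sketched directly (illustrative Python; the 8-connected Laplacian shown is one common choice of kernel, and edge pixels are replicated at the borders as described):

```python
# 8-connected Laplacian kernel: the weights sum to zero, so a
# uniform region yields zero response and only edges remain.
LAPLACIAN = [[-1, -1, -1],
             [-1,  8, -1],
             [-1, -1, -1]]

def convolve3x3(image, kernel):
    """Apply a 3 x 3 kernel, replicating edge pixels at the borders."""
    h, w = len(image), len(image[0])

    def clamp(v, hi):
        return max(0, min(v, hi))

    out = []
    for y in range(h):
        row = []
        for x in range(w):
            acc = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    acc += (kernel[dy + 1][dx + 1]
                            * image[clamp(y + dy, h - 1)][clamp(x + dx, w - 1)])
            row.append(acc)
        out.append(row)
    return out
```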
Figure 5 Enlargement, using cubic convolution, of tractor and trailer shown in Fig. 4.
Figure 6 Further enlargement of tractor tires shown in Fig. 5.
Figure 7 is a well-defined image of a fingerprint. Resolution and coverage of this sort would create no problems for the fingerprint expert. The introduction of a Sobel edge filter produces even clearer edges and gives the impression of a three-dimensional surface (Fig. 8).

Edge sharpening

These filters use a subtractive smoothing methodology. An averaging spatial filter is applied, which retains the low-frequency data but suppresses high-frequency edges and lines. The averaged image is subtracted from the original image to leave only the edges and linear features. Once the edges are isolated in this way, the difference image is added to the original. This method provides clearer edges and linear features but has the disadvantage that any system noise is also enhanced. An example of edge detection and sharpening can be seen in Fig. 9, which shows two pictures of the same farm buildings. The one on the left is the enlargement by cubic convolution. The one on the right has undergone edge detection and edge sharpening. This does not make the picture esthetically more attractive but improves its interpretability.
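Pixel by pixel, the subtract-and-add procedure reduces to sharpened = original + (original - smoothed), i.e. 2 x original - smoothed. A sketch, assuming the smoothed image has already been produced by an averaging filter:

```python
def edge_sharpen(original, smoothed):
    """Add the difference image (original - smoothed) back to the
    original: 2*original - smoothed. Edges are boosted, but so is
    any system noise present in the original."""
    return [[2 * o - s for o, s in zip(orow, srow)]
            for orow, srow in zip(original, smoothed)]
```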
Figure 7 Fingerprint image.
Figure 8 Fingerprint (shown in Fig. 7) reproduced using edge filter.

Contrast Stretching

It is frequently necessary to stretch the contrast of a picture. The most important reason for this is that a picture often contains more data than the eye can accommodate. A full gray-scale monochromatic photograph will have up to 256 gray levels; however, the human eye can only ‘see’ about 26 of these. Contrast stretching will enhance subtle differences between the object and its background by bringing the contrast difference into the accommodation range of the human eye. Contrast stretching can also aid the eye when analyzing color imagery. Human perception is vastly improved when color is introduced: it is said that the eyes can distinguish many millions of color tones. It is thus often advantageous to portray a monochromatic scene in color. This is usually false color, but if the 256 gray levels in a black and white photograph are changed to color then there is very little difficulty in seeing the subtle gray-level differences.
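A simple linear contrast stretch can be sketched as mapping the darkest and brightest recorded values onto the full 0-255 range, so that small gray-level differences are spread far enough apart for the eye to see (illustrative Python):

```python
def contrast_stretch(image, out_max=255):
    """Linearly remap the image's own min..max gray range onto
    0..out_max, spreading subtle differences across the full scale."""
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    if hi == lo:  # flat image: nothing to stretch
        return [[0] * len(row) for row in image]
    return [[round((p - lo) * out_max / (hi - lo)) for p in row]
            for row in image]
```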
Figure 9 Farm buildings shown in Fig. 1: enlargement by cubic convolution (left); result of edge detection and edge sharpening (right).
In Fig. 10 the shapes of, for example, the materials in the trailer are difficult to see. However, a simple false color treatment, as seen in Fig. 11, reveals the shapes more clearly. In general when more colors are introduced the shapes will become clearer (within the resolution limits of the original imagery).
Contrast variations can be controlled in many ways, but for most PC-based systems the mouse is the most accommodating device. Variations can be made either throughout the picture or in specific parts. In Fig. 12 part of a fingerprint cannot be seen clearly, while the other part is usable. By applying contrast stretch filtration and mouse manipulation to the selected area, the hitherto unclear fingerprint is revealed (Fig. 13). Enlargement further assists the eye.

Contrast filters

There is a broad range of filters that will provide contrast stretching. Many of these use histogram operations, which reshape the brightness response curve. This has the effect of altering the distribution of contrast within the spectrum of dark to bright pixels.
Figure 10 The materials in the trailer are difficult to see.
Figure 11 Color treatment reveals the shapes in Fig. 10.
Figure 12 Part of this fingerprint is usable but part cannot be clearly seen.
Histogram operations include the histogram stretch. This operation expands the response curve in a linear manner, so contrast is spread evenly throughout the picture or area of interest. The brightness ranges no longer compete with one another, although images containing both large dark areas and large bright areas are not strongly affected.
Another histogram operation provides histogram equalization. This operation modifies the response curve nonlinearly so that the total pixel dynamic range is used in a balanced and uniform manner. This emphasizes contrast in the most heavily populated gray ranges, as it is there that the contrast is increased the most.
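Histogram equalization can be sketched via the cumulative histogram: each gray level is remapped in proportion to how many pixels lie at or below it (illustrative Python, using a simple floor-based variant of the standard remapping):

```python
def histogram_equalize(image, levels=256):
    """Remap gray levels through the cumulative histogram so the
    full dynamic range is used in a balanced manner."""
    hist = [0] * levels
    for row in image:
        for p in row:
            hist[p] += 1
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    n = running  # total number of pixels
    return [[int(cdf[p] * (levels - 1) / n) for p in row] for row in image]
```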
Histogram operations also allow the operator to change from a positive image to a negative image. In complex and difficult images the eye can sometimes appreciate contrast better in the negative domain, and vice versa.
Color contrast can be modified by filters that act upon the tonal transfer curve. There is normally a set of curves for each family of colors, and each curve allocates a shade of gray to pixels of a particular intensity. Once the tonal transfer function is applied, the images are displayed with a specific color contrast.
Figure 13 Contrast stretch filtration, manipulation and enlargement enhances the image shown in Fig. 12.

Blur

Sometimes it is advantageous to reduce the amount of detail in order to better appreciate the overall picture. In general, forensic imagery analysts take a complex scene and simplify it so that only particular information is seen. In this they are rather like thematic mapmakers, who put on to a map only that which is helpful to a particular subject or theme, in order to make the data more digestible. For example, if a person were provided with a photograph or series of photographs of New York State and asked to navigate in a car from New York City to Albany on minor roads, he or she would find the task very daunting, because there is too much detail in the photographs to make them suitable for the job. However, if provided with a road map carrying only the basic data required, the task becomes easy. Thus it is that in some cases data on a photograph have to be minimized. Blur filters can be used to produce the desired effect.
A blur filter is a form of edge smoothing. Its use reduces the prominence of high contrast spots and edges. It can therefore reduce clutter, which would otherwise cause distraction, and will also help to reduce system noise.
A common blur provider is the Gaussian filter. It uses the Gaussian function G(i, j) = exp(−((i − u)^2 + (j − v)^2)/(2σ^2)), where (i, j) is a pixel in the filter ‘window’, (u, v) is the geometric center of the window and σ^2 is set to 4. The system is set so that the sum of all the weights is 1.
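The window of weights can be generated directly from the Gaussian function and then normalized so the weights sum to 1 (illustrative Python; a 5 x 5 window is assumed):

```python
import math

def gaussian_weights(size=5, sigma_sq=4.0):
    """Build a size x size window of Gaussian weights centred on
    the window, normalized so that the weights sum to 1."""
    c = size // 2  # geometric center (u, v) of the window
    w = [[math.exp(-((i - c) ** 2 + (j - c) ** 2) / (2 * sigma_sq))
          for j in range(size)]
         for i in range(size)]
    total = sum(sum(row) for row in w)
    return [[v / total for v in row] for row in w]
```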

Smoothing

Some smoothing is achieved with blur filters, but it can also be achieved using a median filter, in which each pixel of interest is replaced by the median value of its neighbors. The filter computes the median, usually within a 3 x 3 window of pixels surrounding the pixel of interest. Applying this middle value has the effect of smoothing the image while preserving the key edge data.
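A median filter in the same illustrative style, using a 3 x 3 window with edge replication at the borders:

```python
import statistics

def median_filter(image):
    """Replace each pixel by the median of its 3 x 3 neighborhood,
    replicating edge pixels at the borders. Isolated noise spikes
    are removed while edges are largely preserved."""
    h, w = len(image), len(image[0])

    def clamp(v, hi):
        return max(0, min(v, hi))

    return [[statistics.median(image[clamp(y + dy, h - 1)][clamp(x + dx, w - 1)]
                               for dy in (-1, 0, 1) for dx in (-1, 0, 1))
             for x in range(w)]
            for y in range(h)]
```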

Noise Reduction

By definition, noise is unwanted data. Noise reduction filters can be applied at any stage of the enhancement. This is particularly important as many enhancement techniques enhance the noise as well as the required signal. Noise reduction can be achieved in a number of ways.
Morphological functions contract or expand the edges and borders of uniform light or dark regions of a picture. The contraction function replaces each pixel of interest with its darkest neighbor. This causes light objects to contract minimally and considerably reduces light-colored signal noise. The expansion function replaces each pixel of interest with its brightest neighbor, causing dark objects to shrink minimally and dark-colored signal noise to be dramatically reduced.
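Both functions can be sketched as picking the minimum (contract) or maximum (expand) of each pixel's 3 x 3 neighborhood (illustrative Python, with edge replication as elsewhere):

```python
def morph(image, mode):
    """Morphological noise reduction: 'contract' replaces each pixel
    with its darkest 3 x 3 neighbor (removing light speckle), while
    'expand' uses the brightest neighbor (removing dark speckle)."""
    pick = min if mode == 'contract' else max
    h, w = len(image), len(image[0])

    def clamp(v, hi):
        return max(0, min(v, hi))

    return [[pick(image[clamp(y + dy, h - 1)][clamp(x + dx, w - 1)]
                  for dy in (-1, 0, 1) for dx in (-1, 0, 1))
             for x in range(w)]
            for y in range(h)]
```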
An average filter, sometimes called a mean filter, is also used to smooth pictorial data to reduce noise. It carries out spatial filtration on each pixel, again typically using a 3 x 3 window. The filter calculates the sum of the pixel values within the window and divides this total by the number of pixels in the window. The values of edge pixels near the borders of the image are replicated to provide enough data for filtration in the edge regions.
In specialized imagery, such as active microwave systems, gamma filters are used to remove high frequency noise, which is called radar ‘speckle’ (while preserving high frequency edges), or, in the case of infrared, special versions are used to reduce system saturation. Such filters can be used to reduce small-scale system saturation on passive infrared imagery. In essence, the gamma filter carries out spatial filtration on each individual pixel, using the gray level values in a square matrix surrounding each pixel. The matrix can be from a 3 x 3 format to an 11 x 11 format.

Warp, Rotation and Roaming

Using a technique called warp, an oblique image can be portrayed in a different perspective from the original perspective that was recorded by the camera. It does this while maintaining the original scale and aspect ratios. This is a complex operation which rotates the geographical coordinates of all points of the picture. It can be useful in determining the relationship of an object to its background or in superimposing one image upon another when the angular field of view of the second image is slightly different from that of the first.
Rotation algorithms allow the picture to be turned on its central axis. Thus, for example, an oblique photograph, taken at an angle to the horizontal and where the vertical objects (e.g. the side of a building) are portrayed at a slant, can be rectified.
Roaming allows the operator to roam around different parts of a large image and to follow a line of interest from one point to another.

Virtual reality modeling

The power of modern computers has enabled many complex operations, which in turn have provided an interesting form of digital imagery enhancement. On very poor forensic imagery, if the dimensions of specific objects in the scene have been derived from accurate on-site measurement, then a three-dimensional computer-generated model can be produced. This computer model can be superimposed upon the photograph to show more clearly what the photograph itself portrays. Moving objects, such as persons within the scene, can themselves be modeled, and the whole scene can be rotated into any perspective required. The operation is complex and costly, and there are those who are concerned about its safety as an evidential tool. However, provided safeguards such as watermarking of the model, encryption of the data and an audit trail of the process are put in place, such fears should be allayed.

Stereoscopy

Stereoscopic imagery is an important form of enhancement. Apart from holographic images, imagery is a two-dimensional representation of a three-dimensional world. Imagery analysts, along with most humans, are used to viewing life in three dimensions. It follows, therefore, that if imagery can show the third dimension (depth), its contents are more easily identified and analyzed.
A three-dimensional image is obtained because we have two eyes, each looking at the same scene from a slightly different angle. The brain is able to fuse the two images and create a three-dimensional scene.
Most imagery taken for forensic work appears as a single image, either in photographic or in video form. But if a scene of crime operator were to take two pictures of the same scene or object, one after the other from slightly different viewpoints (in azimuth not range), then three-dimensional viewing is possible. The processed images are viewed through special glasses called stereoscopes. These allow the left eye to look only at the left picture and the right eye to look only at the right picture. The brain then carries out its normal fusion function and the three-dimensional scene is apparent.
Some work has been done using video to produce a stereoscopic effect. In this case the camera is stationary but the subject moves. Providing the movement is in the horizontal (azimuth) plane, then a three-dimensional effect is possible. Neighboring stills are used and observed through stereoscopes.
Unfortunately most closed-circuit television systems in static situations, e.g. in banks or city centers, are time-lapse systems. That is to say that the output of several cameras is recorded on to one tape and the tape is slowed to last for 12 or 24 hours. Therefore, often too few of the 50 fields per second produced by the camera are recorded and the movement of an object (for example a robber) between recorded frames is too great for the three-dimensional effect to be possible.
Stereoscopic imagery can be taken from a digital camera system or, if analog, can be scanned and digitized. Once in digital form, each frame can be placed electronically at the correct (customized) interocular distance for viewing.
An on-screen three-dimensional system uses two digital stereoscopic frames. These can be polarized: one horizontally and the other vertically. Each eye lens of viewing spectacles is polarized in correlation and hence, when the screen image (images) is viewed, the three-dimensional scene is apparent on the monitor.

Conclusion

Digital imagery enhancement is a vital tool in forensic imagery analysis. However, safeguards are important if the dramatic impact of imagery on crime investigation and the prosecution of criminals is not to be abused. Watermarking, encryption and audit trail are key to the elimination of public concern about digital imagery and digital imagery enhancement.
There are many filters to enable a wide variety of electronic enhancement, and more are being developed. Nevertheless, the key that unlocks the data in an image is the ability of the human brain to provide analysis based on observation and common sense. This ability can be enhanced by learning and experience.
Pictures can be misinterpreted and thus mislead. Trained and experienced imagery analysts are required if serious errors are to be avoided in the field of forensic imagery analysis. The eyes of even trained imagery analysts are vastly inferior in performance and sensitivity to the electronic sensors that image and record a scene. Digital enhancement of imagery to bring the imagery data within the range of human eye performance is therefore often a prerequisite to successful imagery analysis – where the brain functions in its full analytical mode.
