Image Processing Reference
In-Depth Information
might include object and texture classification or size distribution estimation. In
many cases, the 2D image is a single projection of the 3D world via unspecified mod-
els with unknown parameters. The additional problems of perspective, shadow, and
occlusion lead to further ambiguities that can only be resolved with the application
of experiential knowledge.
Visual perception is a complex task; it is not tolerant of the linear approxima-
tions that arise from frequency decomposition and the projection of signals into or-
thogonal subspaces. As a result of the perceptual importance of edges, the essential
components of images tend to occupy a wide range of the frequency domain. The
corrupting noise processes may well overlap the signal in such a way as to make lin-
ear separation impossible. It is also difficult to quantify image quality through sim-
ple measures such as mean-absolute error (MAE) and mean-square error (MSE).
For example, an image may be restored in such a way that it contains only a tiny
variation in MAE from some ideal original, but if the higher frequency components
are lost or there is significant phase distortion, it may look very poor to a human ob-
server. On the other hand, large variations in brightness and contrast (leading to
large error measures) may be tolerable provided that the edges are distinct.
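The weakness of MAE as a proxy for perceived quality can be illustrated numerically. A minimal sketch, assuming NumPy: a step edge that is blurred (edge destroyed) can score a smaller MAE against the ideal than the same edge with a uniform brightness shift (edge preserved):

```python
import numpy as np

# A simple 1-D "image": a sharp step edge, values in [0, 255].
ideal = np.zeros(64)
ideal[32:] = 255.0

# Distortion 1: heavy smoothing -- the edge is destroyed.
kernel = np.ones(9) / 9.0
blurred = np.convolve(ideal, kernel, mode="same")

# Distortion 2: a uniform brightness shift -- the edge survives.
shifted = ideal + 40.0

def mae(a, b):
    return np.mean(np.abs(a - b))

# The blur errs only near the edge (and the border), the shift errs everywhere,
# so MAE ranks the perceptually worse blur as the "better" restoration.
print(mae(ideal, blurred))
print(mae(ideal, shifted))
```

Here the blurred signal's MAE is well under the shifted signal's MAE of 40, even though a human observer would judge the blur far more objectionable.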
Despite these points, linear image processing techniques have thrived because
of their mathematical elegance and their ability to describe continuous signals.
Also, the process of sampling such that continuous signals are represented only by
their values at discrete points may be completely described by linear mathematics.
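That sampling and ideal (sinc) reconstruction obey linear mathematics can be checked directly; a minimal sketch, assuming NumPy, showing that reconstructing a weighted sum of sample sequences equals the weighted sum of the individual reconstructions:

```python
import numpy as np

fs = 8.0           # sampling rate (samples per second)
n = np.arange(64)  # sample indices
t = 3.37           # an off-grid instant at which to reconstruct

def reconstruct(samples, t):
    # Ideal band-limited interpolation: sum of samples weighted by sinc.
    return np.sum(samples * np.sinc(fs * t - n))

x1 = np.cos(2 * np.pi * 1.0 * n / fs)   # 1 Hz tone, sampled
x2 = np.sin(2 * np.pi * 2.5 * n / fs)   # 2.5 Hz tone, sampled
a = 0.7

# Superposition: reconstruct(a*x1 + x2) == a*reconstruct(x1) + reconstruct(x2).
lhs = reconstruct(a * x1 + x2, t)
rhs = a * reconstruct(x1, t) + reconstruct(x2, t)
print(abs(lhs - rhs))  # agrees to machine precision
```

The agreement is exact (up to rounding) because the interpolation is a weighted sum, i.e. a linear map on the samples.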
Nevertheless, there are strong arguments for seeking solutions to image processing
problems in terms of logical mappings. Consider a linear “image-to-image”
processing task, which might include restoration, noise reduction, enhancement, or
shape recognition.
We begin with a signal that is sampled in three dimensions (two spatial and one
intensity). Let us assume that the image is 256 × 256 × 8 bits. Whatever processing
is to be carried out, the result will eventually be mapped back into the same discrete
signal space. The bits within the finite strings of the input image are interpreted as
part of an unsigned binary number in order to be given an arithmetic meaning. In
most linear operations, such as filtering, the unsigned integers will be converted to
real or complex numbers containing a mantissa and an exponent. In order to com-
pute the various linear multiply-accumulate transformations, these numbers are
then mapped into electronic circuits and viewed as finite-length binary strings. The
circuits operate at their most basic level by employing digital electronics to carry
out Boolean algebra on the binary strings to produce different binary strings.
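The claim that arithmetic on binary strings reduces to Boolean algebra can be made concrete with a ripple-carry adder; a minimal sketch (the function name is hypothetical) that adds two 8-bit unsigned values using only AND, OR, and XOR on individual bits:

```python
def add8(a, b):
    """Add two 8-bit unsigned integers using only bitwise Boolean operations."""
    result, carry = 0, 0
    for i in range(8):
        x = (a >> i) & 1                              # bit i of a
        y = (b >> i) & 1                              # bit i of b
        s = x ^ y ^ carry                             # sum bit (XOR)
        carry = (x & y) | (x & carry) | (y & carry)   # majority = carry out
        result |= s << i
    return result & 0xFF  # wrap around, as an 8-bit register would

print(add8(200, 100))  # (200 + 100) mod 256 = 44
```

Each stage is a full adder expressed purely as Boolean operations, which is exactly how the digital circuitry realizes the "arithmetic" step.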
The resulting binary values are then mapped back to real or complex numbers
that are eventually clipped and quantized into the 256 × 256 × 8 bit signal space that
forms the output image.
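The round trip just described can be sketched end to end. A minimal sketch, assuming NumPy: an 8-bit image is promoted to real numbers, passed through a linear multiply-accumulate operation (here a separable 3 × 3 mean filter), then clipped and quantized back into the 256 × 256 × 8 bit signal space:

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(256, 256), dtype=np.uint8)  # 256 x 256 x 8 bits

# 1. Interpret the bit strings as unsigned integers and promote to real numbers.
x = image.astype(np.float64)

# 2. A linear multiply-accumulate operation: a 3 x 3 mean filter,
#    implemented as two separable 1-D convolution passes.
k = np.array([1.0, 1.0, 1.0]) / 3.0
y = np.apply_along_axis(lambda row: np.convolve(row, k, mode="same"), 1, x)
y = np.apply_along_axis(lambda col: np.convolve(col, k, mode="same"), 0, y)

# 3. Clip and quantize back into the 256 x 256 x 8-bit signal space.
output = np.clip(np.rint(y), 0, 255).astype(np.uint8)

print(output.shape, output.dtype)  # (256, 256) uint8
```

Every floating-point multiply and add inside step 2 is itself executed as Boolean logic on finite binary strings, which is the point the text is making.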
So even though we may have carried out a fundamentally linear operation such
as a Fourier or wavelet transform, it has been implemented as a series of logical oper-
ations. We have mapped the signal in terms of binary strings through digital logic to a
resulting set of binary strings. However, we have in effect imposed linearity con-
straints such that at every stage of processing the following two statements are true:
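The two statements are the superposition properties that define linearity: additivity (the response to a sum of inputs is the sum of the individual responses) and homogeneity (scaling the input scales the output by the same factor). Both can be checked numerically for any linear stage; a minimal sketch, assuming NumPy, using a small convolution as the stage:

```python
import numpy as np

rng = np.random.default_rng(1)
h = np.array([0.25, 0.5, 0.25])  # impulse response of a linear stage

def stage(x):
    return np.convolve(x, h, mode="same")

f = rng.standard_normal(32)
g = rng.standard_normal(32)
c = 2.5

# Additivity: stage(f + g) == stage(f) + stage(g)
print(np.allclose(stage(f + g), stage(f) + stage(g)))  # True
# Homogeneity: stage(c * f) == c * stage(f)
print(np.allclose(stage(c * f), c * stage(f)))         # True
```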