Image Representation and Manipulation - Computer Graphics: Principles and Practice

17.2 What Is an Image?

We'll start with a definition, which we'll later refine somewhat: An image is a rectangular array of values, called pixel values, all of which have the same type. These pixel values may be real numbers representing levels of gray (a grayscale image), or they may be triples of numbers representing mixtures of red, green, and blue (an RGB image),¹ or they may contain, at each pixel, other information in addition to color or grayscale data; a rich example is so-called z-data, which records at each pixel the distance from the viewpoint from which the image was captured or produced to the surface visible at that pixel.
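As a minimal sketch of this definition, the three kinds of image mentioned above can each be held as a list of rows whose pixels all share one type. The helper names `make_image` and `is_rectangular` are hypothetical and chosen only for illustration; nothing here reflects a particular library or file format.

```python
# Illustrative sketch only: images stored as lists of rows.
# All names are hypothetical, not taken from the text.

def make_image(rows, cols, fill):
    """Return a rows-by-cols array in which every pixel holds `fill`."""
    return [[fill for _ in range(cols)] for _ in range(rows)]

def is_rectangular(image):
    """True if the image is non-empty and every row has the same length."""
    return len(image) > 0 and all(len(row) == len(image[0]) for row in image)

gray = make_image(2, 3, 0.5)            # grayscale: one real number per pixel
rgb = make_image(2, 3, (255, 128, 0))   # RGB: a (red, green, blue) triple per pixel
depth = make_image(2, 3, 4.0)           # z-data: one distance per pixel

rgb[0][1] = (0, 0, 255)                 # overwrite a single pixel with pure blue
```

All three arrays have the same 2 × 3 shape; only the type of value stored at each pixel differs.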

A rectangular array of numbers can be interpreted in many ways. For instance, it's possible to display a z-data or depth image in grayscale, in which case the parts of the image that are near the viewer are displayed in lighter shades of gray than the parts that are far away. A priori, the numbers in the array have no particular significance. In practice, however, when we take a digital photograph we'd like to know whether the pixels store red-green-blue triples or green-blue-red triples, since any confusion could cause very peculiar pictures to be displayed or printed. Thus, image data is typically stored in certain standard file formats, where the meaning of the data associated with each pixel is standardized. Some formats, notably TIFF (Tagged Image File Format), allow you to associate a description with each datum. For instance, the description of a TIFF file might be “Each pixel has five values associated with it: a red, green, and blue value, each represented by an integer ranging from 0 to 255, a z-value represented by an IEEE floating-point number, and an object identifier represented by a 16-bit unsigned integer.” With this in mind, we begin our discussion of images with the mundane and practical issue of how conventional file formats store and represent rectangular arrays of data.
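The two ideas in this paragraph can be sketched in a few lines. The first function maps a depth image to gray levels so that near surfaces come out lighter; the linear mapping and the `near` and `far` limits are assumptions, since the text does not specify how depths are scaled. The second part packs one pixel with the five values from the example description into a fixed byte layout using Python's `struct` module. This layout is hypothetical (it assumes a 32-bit float for the z-value and little-endian byte order) and is not how TIFF actually encodes its data.

```python
import struct

def depth_to_gray(depth, near, far):
    """Map depths to gray levels in [0, 1]; nearer surfaces come out lighter."""
    span = far - near
    gray = []
    for row in depth:
        gray_row = []
        for z in row:
            t = (z - near) / span        # 0 at the near limit, 1 at the far limit
            t = min(max(t, 0.0), 1.0)    # clamp depths outside [near, far]
            gray_row.append(1.0 - t)     # invert so that near maps to white
        gray.append(gray_row)
    return gray

# Hypothetical per-pixel record: three 8-bit color values, one 32-bit IEEE
# float z-value (an assumed width), and one 16-bit unsigned object identifier.
# "<" selects little-endian with no padding: 3 + 4 + 2 = 9 bytes per pixel.
PIXEL_FORMAT = "<BBBfH"

def pack_pixel(r, g, b, z, object_id):
    """Encode one pixel as a 9-byte record."""
    return struct.pack(PIXEL_FORMAT, r, g, b, z, object_id)

def unpack_pixel(data):
    """Decode a 9-byte record back into (r, g, b, z, object_id)."""
    return struct.unpack(PIXEL_FORMAT, data)

record = pack_pixel(255, 128, 0, 2.5, 7)
r, g, b, z, object_id = unpack_pixel(record)

# Reading the same three color values in green-blue-red order
# produces a different color from the one that was stored.
misread = (g, b, r)
```

The bytes in `record` carry no meaning of their own; only the agreed format string tells a reader which bytes are color, which are depth, and which are the identifier.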

How these rectangular arrays of values actually represent light intensities (or other physical phenomena) and how well they do so is also important. Following our discussion of image file formats, we move on to discuss the content of images.

17.2.1 The Information Stored in an Image

When we have a typical image file format, storing an n × k array of grayscale values or RGB triples, it's natural to think about operations like adding together two images, pixel by pixel (or averaging them, pixel by pixel), to create effects like a cross-fade. Doing such things requires a notion of addition of images and of multiplication by constants, which we take from the operations on each pixel (i.e., to add two images, we add corresponding pixel values). For grayscale images, what we have, in effect, is a correspondence between the set of n × k images and the elements of R^{nk}, given by enumerating the pixel values in some fixed order. Thus, the set of all n × k grayscale images forms a subset of an nk-dimensional space.
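The pixel-by-pixel operations described above can be sketched as follows for grayscale images stored as lists of rows. The function names are hypothetical, and row-major order is just one possible choice for the "fixed order" used to enumerate the pixel values.

```python
def _check_same_shape(a, b):
    """Raise ValueError unless the two images have identical dimensions."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("images must have the same dimensions")

def add_images(a, b):
    """Add two images by adding corresponding pixel values."""
    _check_same_shape(a, b)
    return [[pa + pb for pa, pb in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale_image(c, a):
    """Multiply every pixel value of image a by the constant c."""
    return [[c * p for p in row] for row in a]

def cross_fade(a, b, t):
    """Blend two images: t = 0 gives a, t = 1 gives b, t = 0.5 their average."""
    return add_images(scale_image(1.0 - t, a), scale_image(t, b))

def flatten(a):
    """Enumerate the pixels in row-major order, giving a list of n * k numbers."""
    return [p for row in a for p in row]

a = [[0.0, 1.0], [0.5, 0.25]]
b = [[1.0, 1.0], [0.5, 0.75]]
halfway = cross_fade(a, b, 0.5)
```

Flattening commutes with these operations: `flatten(add_images(a, b))` equals the element-by-element sum of `flatten(a)` and `flatten(b)`, which is why the image operations can be identified with vector operations on R^{nk}.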

Inline Exercise 17.1: Each element of the standard basis for R^{nk} consists of nk − 1 zeroes and a single one. What does the corresponding image look like? Can you see how you could represent every image as a sum of scalar multiples of such “basis images”?

1. The precise meanings of the red, green, and blue values may be quite vague; we'll discuss this thoroughly later.
