Figure 6.1. A mystery picture
Much of what happens during vision does not seem to depend on what we know
about the world, and so by the definition given previously, does not involve thinking.
Our eyes and brains have evolved to be able to very quickly group together regions of
pixels in an image that have similar features (like brightness, color, or texture) and to
locate discontinuities in these features that are potential edges or contours of objects.
But not all of vision is like this. Consider the picture in figure 6.1, courtesy of
Antonio Torralba. Part of it has been blacked out and only an out-of-focus blob at the
bottom right is visible. Even if we stare at this blob long and hard, we cannot readily
identify what it is supposed to be. However, in the right-hand picture of figure 6.2, it
is quite easily identifiable as a car.
Why do we see the blob as a car, and not as an eagle or a biscuit or a person's face?
The rest of the picture by itself does not say “car” any more than it says “biscuit.”
But despite the fact that the picture is out of focus, there is enough information in it
for us to identify houses, sky, and road. And once we have made that identification,
we can use what we know to constrain the possible interpretations of the blob: cars are
physical objects of this size and shape on a road beside houses. This is thinking!
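As a minimal sketch of this kind of reasoning, the identified context can be treated as a filter over candidate interpretations of the blob. This is an illustration only, not the method developed in the chapter: the table `KNOWLEDGE`, the function `interpret`, and the qualitative size and location facts are all hypothetical and invented for the example.

```python
# Hypothetical background knowledge (illustrative assumptions, not from the text):
# for each candidate label, a qualitative size and the places it is typically found.
KNOWLEDGE = {
    "car":     {"size": "medium", "found_on": {"road"}},
    "eagle":   {"size": "small",  "found_on": {"sky", "tree"}},
    "biscuit": {"size": "tiny",   "found_on": {"table"}},
    "face":    {"size": "small",  "found_on": {"person"}},
}


def interpret(blob_size, blob_on, context):
    """Return the labels consistent with the blob's qualitative size,
    the thing it sits on, and the parts of the scene already identified."""
    candidates = []
    for label, facts in KNOWLEDGE.items():
        if facts["size"] != blob_size:
            continue  # wrong size for this kind of object
        if blob_on not in facts["found_on"]:
            continue  # this kind of object is not usually found there
        if blob_on not in context:
            continue  # the supporting element was never identified in the scene
        candidates.append(label)
    return candidates


# Houses, sky, and road have been identified in the out-of-focus picture.
context = {"houses", "sky", "road"}
print(interpret("medium", "road", context))  # ['car']
```

With an empty context the same call returns no candidates, which mirrors the point above: the blob by itself does not determine its interpretation.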
This chapter is concerned with using what is known in this way to interpret images.
It does not deal with arrays of pixels directly but only with a list of image components,
such as the regions or edges that appear in an image. It does not work from precise
numeric values for these regions or edges but only from qualitative descriptions of
their properties, such as the relative sizes of the regions or which edges meet at
which vertices. Then, based on knowledge about how an actual scene can appear in