One of the most important contributions of this model
is that it lets us understand the functional implications
of the observed neural response properties, and how
these properties contribute to a sensible computational
algorithm for object recognition. Thus, we can see how the
binding problem in object recognition can be averted
by developing representations with increasingly com-
plex featural encodings and increasing levels of spatial
invariance. Further, we can see that by developing com-
plex but still distributed (i.e., subobject level) featural
encodings of objects, the system can generalize the in-
variance transformation to novel objects.
One major unresolved issue has to do with the nature
of the complex representations that enable generaliza-
tion in the model. What should these representations
look like for actual objects, and can computational mod-
els such as this one provide some insight into this issue?
Obviously, the objects used in the current model are too
simple to tell us much about real objects. To address this
issue, the model would have to be made significantly
more complex, with a better approximation to actual vi-
sual feature encoding (e.g., more like the V1 receptive
field model), and it should be trained on a large range of
actual objects. This would require considerably faster
computational machinery and a large amount of mem-
ory to implement, and is thus unlikely to happen in the
near future.
As we mentioned earlier, Biederman (1987) has
made a proposal about a set of object components
(called geons) that could in theory correspond to the
kinds of distributed featural representations that our
model developed in its V4 layer. Geons are relatively
simple geometrical shapes based on particularly infor-
mative features of objects that are likely to provide use-
ful disambiguating information over a wide range of
different viewpoints (so-called non-accidental proper-
ties; Lowe, 1987). Although we obviously find the gen-
eral idea of object features important, we are not con-
vinced that the brain uses geons. We are not aware of
any neural recording data that supports the geon model.
Furthermore, the available behavioral support mostly
just suggests that features like corners are more infor-
mative than the middle portions of contours (Bieder-
man & Cooper, 1991). This does not specifically sup-
port the geon model, as corners (and junctions more
generally) are likely to be important for just about any
model of object recognition. We also suspect that the
representations developed by neural learning mecha-
nisms would be considerably more complex and diffi-
cult to describe than geons, given the complex, high-
dimensional space of object features. Nevertheless, we
are optimistic that future models will be able to speak
to these issues more directly.
One objection that might be raised against our model
is that it builds the location invariance solution into the
network architecture by virtue of the spatially localized
receptive fields. The concern might be that this archi-
tectural solution would not generalize to other forms of
invariance (e.g., size or rotation). However, by demon-
strating the ability of the model to do size invariant ob-
ject recognition, we have shown that the architecture is
not doing all the work. Although the scale and featural
simplicity of this model preclude the exploration of
rotational invariance (with only horizontal and vertical
input features, the only possible rotations are by 90
degrees, and such rotations turn one object into another), we
do think that the same basic principles could produce at
least the somewhat limited amounts of rotational invari-
ance observed in neural recording studies. As we have
stated, the network achieves invariance by representing
conjunctions of features over limited ranges of transfor-
mation. Thus, V2 neurons could also encode conjunc-
tions of features over small angles of rotation, and V4
neurons could build on this to produce more complex
representations that are invariant over larger angles of
rotation, and so on.
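The conjunction-and-pooling principle described here can be illustrated with a toy sketch: a layer of detectors responds to feature conjunctions within small local windows (as V2 units might), and a subsequent layer pools each detector over all positions (as V4/IT units might), so the resulting code is the same wherever the object appears. The architecture and names below are illustrative assumptions for exposition, not the actual model described in this chapter.

```python
import numpy as np

# Illustrative sketch of conjunction-then-pooling (not the book's model):
# a "V2-like" layer detects feature conjunctions in small windows, and a
# "V4/IT-like" layer takes the max over positions, discarding location.

rng = np.random.default_rng(0)

def conjunction_layer(image, weights, window=2):
    """Rectified responses of conjunction detectors at every window position."""
    h, w = image.shape
    out = np.zeros((h - window + 1, w - window + 1, weights.shape[0]))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = image[i:i + window, j:j + window].ravel()
            out[i, j] = np.maximum(weights @ patch, 0.0)
    return out

def pool_layer(responses):
    """Max-pool each detector over all positions: location is discarded,
    feature identity is kept."""
    return responses.max(axis=(0, 1))

W = rng.normal(size=(8, 4))       # 8 random conjunction detectors on 2x2 windows

obj = np.array([[1.0, 0.0],       # a small "object" (a diagonal pattern)
                [0.0, 1.0]])
img_a = np.zeros((6, 6)); img_a[1:3, 1:3] = obj   # object near top-left
img_b = np.zeros((6, 6)); img_b[3:5, 3:5] = obj   # same object, shifted

code_a = pool_layer(conjunction_layer(img_a, W))
code_b = pool_layer(conjunction_layer(img_b, W))
print(np.allclose(code_a, code_b))   # True: the pooled code is shift-invariant
```

Because the pooling operates on each feature detector separately, any pattern built out of these conjunctions, including a novel one, inherits the same invariance; this is the sense in which a distributed, sub-object featural code generalizes the invariance transformation to new objects.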
Finally, we should emphasize the importance
of using both error-driven and Hebbian learning in this
model. Neither purely error-driven nor purely Hebbian
versions of this network were capable of learning suc-
cessfully (the purely error-driven version did somewhat
better than the purely Hebbian one, which essentially did not learn
at all). This further validates the analyses from chap-
ter 6 regarding the importance of Hebbian learning in
deep, multilayered networks such as this one. Error-
driven learning is essential for the network to form rep-
resentations that discriminate the different objects; oth-
erwise it gets confused by the extreme amount of feat-
ural overlap among the objects. Although it is possible
that in the much higher dimensional space of real ob-