Biomedical Engineering Reference
In-Depth Information
process will work. Thus, segmentation might work even better if we had 625 primary visual
lexicons (a 25 × 25 array). The use of more "complex" features, based upon more localized
"simple" features (see Appendix), and the precedence principle is one design approach to
achieving some of the benefits of more, smaller lexicons without actually building them.
Multiple natural questions arise at this point. First, how well does this design actually work in
practice? In other words, how thoroughly does this segmentation process null lexicons coding other
objects and how reliably are the lexicons that code the attended object retained? The short answer is
that I don't know. The only evidence I have is based upon experiments done in my lab with a very
simple image environment (images of capital Latin alphabetical characters moving about, with
slowly randomly changing spatial and angular velocities, on a plane). In this case, a segmentation
scheme of this basic type worked very well.
In reality, probably not all fixation point objects will segment cleanly. Sometimes irrelevant
lexicons will not be nulled, and relevant lexicons will be. However, because of the nature of
the development process for the secondary and tertiary visual lexicon symbol sets, which is described
next, such errors will not matter, as long as these lapses occur randomly and as long as the general
quality of the segmentation is fairly good. We will proceed with the assumption that these
conditions are satisfied.
The goal for the secondary lexicon symbols is twofold. First, each such symbol should be
somewhat pose insensitive (i.e., if it responds strongly to an object at one pose it will respond
strongly to the same object at nearby poses). Also, each secondary lexicon symbol should represent
a larger "chunk" of an object than any primary symbol. Such symbols are said to be more holistic
than primary lexicon symbols. Tertiary layer lexicon symbols are to be even more holistic than
secondary layer symbols.
For secondary and tertiary layer development, sequences of camera images containing the same
(operationally relevant) visual object are used. At the beginning of each sequence, we assume that
the gaze director has selected a fixation point on the object. In the subsequent frames of the
sequence, we check that, in each frame, one point near the initial fixation point is also given a
high score by the gaze director. If this is true for a significant sequence of frames (say, 10 to 20 or
more), then these nearby points on the subsequent frames are designated as the fixation points for
those frames and this sequence of eyeball images is added to our training set for layers two and
three. It is assumed that this set of sequences provides good statistical coverage of the set of all
operationally relevant objects, and that each object is seen in many different, operationally
characteristic poses in the sequences. It is also assumed that the poses of the fixated object in
each sequence are dynamically changing. (Note: This dynamic variation in pose is needed for
training, but is not a requirement for operational use, where objects can be stationary and yet can
still usually be described with a single look.)
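
The sequence-selection rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used: the per-frame score map, the `radius` and `threshold` parameters, and the function names are all hypothetical; only the idea of requiring a high-scoring point near the initial fixation point in every frame, for roughly 10 to 20 frames or more, comes from the text.

```python
from typing import List, Optional, Tuple

Point = Tuple[int, int]        # (row, col) position in a frame
ScoreMap = List[List[float]]   # hypothetical gaze-director score per position


def best_nearby_point(scores: ScoreMap, center: Point, radius: int,
                      threshold: float) -> Optional[Point]:
    """Return the highest-scoring point within `radius` of `center` whose
    score is at least `threshold`, or None if no such point exists."""
    rows, cols = len(scores), len(scores[0])
    r0, c0 = center
    best, best_score = None, threshold
    for r in range(max(0, r0 - radius), min(rows, r0 + radius + 1)):
        for c in range(max(0, c0 - radius), min(cols, c0 + radius + 1)):
            if scores[r][c] >= best_score:
                best, best_score = (r, c), scores[r][c]
    return best


def track_fixation(score_maps: List[ScoreMap], start: Point,
                   radius: int = 2, threshold: float = 0.5,
                   min_frames: int = 10) -> Optional[List[Point]]:
    """Designate one fixation point per frame, each near the initial
    fixation point `start`. Returns the list of fixation points if the
    run lasts at least `min_frames` frames, otherwise None."""
    fixations = [start]                 # frame 0: chosen by the gaze director
    for scores in score_maps[1:]:
        point = best_nearby_point(scores, start, radius, threshold)
        if point is None:               # no high-scoring point nearby: stop
            break
        fixations.append(point)
    return fixations if len(fixations) >= min_frames else None
```

A sequence for which `track_fixation` returns a list would be added to the training set for layers two and three; one for which it returns `None` would be discarded.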
As shown in Figure 3.9, the secondary layer lexicons receive knowledge links from primary
layer lexicons. These links are arranged so that a secondary lexicon symbol can only receive a
link from symbols lying on primary lexicons surrounding the position of the secondary lexicon in
the second layer lexicon array. (Like the primary lexicon array, the secondary array is envisioned
as representing, with a regular "tiling," the eyeball image content of the attended object, but
with each secondary lexicon representing a larger "chunk" of this object than a primary layer
lexicon, since the secondary layer has fewer lexicons than the primary layer.) These knowledge
links connect every symbol belonging to each primary lexicon within the "field of view" of a
secondary lexicon to every symbol of that secondary lexicon. For each such forward knowledge
link, a link between the same two symbols in the reverse (secondary to primary) direction is also
created. All of these links start out with zero strength.
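
The wiring just described can be sketched as follows. This is a minimal illustration, not the author's implementation: the dictionary layout, the lexicon identifiers, and the name `build_links` are assumptions made for the example. It creates a forward link from every symbol of each primary lexicon in a secondary lexicon's field of view to every symbol of that secondary lexicon, plus the matching reverse link, all with zero initial strength.

```python
from itertools import product


def build_links(primary_symbols, secondary_symbols, field_of_view):
    """Create zero-strength knowledge links between two lexicon layers.

    primary_symbols:   dict, primary lexicon id -> list of its symbols
    secondary_symbols: dict, secondary lexicon id -> list of its symbols
    field_of_view:     dict, secondary lexicon id -> list of the primary
                       lexicon ids surrounding its position (hypothetical)

    Returns (forward, reverse): dicts mapping (source, target) -> strength,
    where each endpoint is a (lexicon id, symbol) pair.
    """
    forward, reverse = {}, {}
    for sec_id, prim_ids in field_of_view.items():
        for prim_id in prim_ids:
            pairs = product(primary_symbols[prim_id], secondary_symbols[sec_id])
            for prim_sym, sec_sym in pairs:
                src = (prim_id, prim_sym)
                dst = (sec_id, sec_sym)
                forward[(src, dst)] = 0   # primary -> secondary link
                reverse[(dst, src)] = 0   # secondary -> primary link
    return forward, reverse
```

For instance, a secondary lexicon with 2 symbols whose field of view covers two primary lexicons with 2 and 3 symbols would receive (2 + 3) × 2 = 10 forward links and send 10 reverse links.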
As mentioned in the Appendix, not all knowledge bases need to have graded p(c | l) strengths.
For many purposes in cognition, it is sufficient for knowledge links to simply be present (essentially
with strength 1) or absent (with effective "strength" 0). These inter-visual-level knowledge links
are of this binary character.