Mechanization of Cognition - Biomimetics: Biologically Inspired Technologies

As each S vector arrives at the architecture of Figure 3.7, it is sent to the proper lexicon in

sequence. For simplicity, let us assume that the first S vector associated with the initial sound

content of the next word is sent to the first primary sound lexicon (if it goes to the "wrong" lexicon

or is missed altogether, it does not matter much, as will be explained below). The first

primary sound lexicon has an expectation, and the only symbols in this expectation are those

that represent sounds that a speaker of this type would issue (we each have hundreds of "canonical

models" of speakers having different accents and vocal apparatuses, and most of us add to this store

throughout life) when speaking early parts of one of the words we are expecting. Again note that,

because of the orthogonalized nature of the S vector and the pure-signal nature of the primary

feature symbols, each of the symbols in this expectation will typically represent sounds having only

a tiny number of S vector components that are nonzero. Each symbol in a primary sound lexicon is

expressed as a unit vector having this small number of components with coefficients near 1, and

all other components at zero. The lexicon takes the inner product of each symbol's vector

expression with S and this is then used as that symbol's initial input excitation (this is how symbols

get excited by sensory input signals, in contrast to how symbols get excited by knowledge

links from other symbols, which was discussed in Section 3.1). We have now completed the

transition from acoustic space to symbol space.
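
The excitation step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the symbol names, the vector dimension, and the sample S vector are hypothetical, and normalizing each symbol's vector to unit length with equal weights is one possible reading of "unit vector" here, not a detail the text specifies.

```python
import math

def unit_vector(dim, active):
    """Return a unit-length vector with equal weight on the `active` indices.

    Equal weighting is an assumption made for this sketch.
    """
    weight = 1.0 / math.sqrt(len(active))
    return [weight if i in active else 0.0 for i in range(dim)]

def input_excitations(lexicon, s_vector):
    """Inner product of each symbol's vector expression with the S vector."""
    return {
        symbol: sum(c * s for c, s in zip(vector, s_vector))
        for symbol, vector in lexicon.items()
    }

# Hypothetical primary sound lexicon with two symbols, each having only a
# tiny number of nonzero components.
lexicon = {
    "ah": unit_vector(8, {1, 4}),
    "ee": unit_vector(8, {2, 6}),
}

# Hypothetical S vector: strong in components 1 and 4, weak in component 6.
s_vector = [0.0, 3.0, 0.0, 0.0, 3.0, 0.0, 0.5, 0.0]

excitation = input_excitations(lexicon, s_vector)
```

In this toy case the symbol whose nonzero components line up with the strong S components ("ah") receives a much larger initial excitation than the other symbol.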

Notice that the issue of signal level of the attended source has not been discussed. As described

in Section 3.3.1, each S vector component has its amplitude expressed on a logarithmic scale

(based on "sound power amplitudes" ranging across many orders of magnitude). Thus, on this

scale, the inner product of S with a particular symbol's unit vector will still (because of the linear

nature of the inner product) be substantial, even if the attended source sounds are tens of dB below

those of some individual interferers. With this design, attending to weak, but distinct, sources

is generally possible. These are, of course, the characteristics we as humans experience in our own

hearing. Further, in auditory neuroscience, such logarithmic coding of sound feature response

signals (in particular, those from the brainstem auditory nuclei to the medial geniculate nucleus,

which are the auditory signals analogous to the components of S) is well established (Oertel

et al., 2002).
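
A small numerical sketch may help show why the logarithmic scale matters. The power values, the dB reference floor, and the component layout below are hypothetical, and the sketch assumes the attended source and the interferer occupy different S vector components.

```python
import math

def to_db(power, floor=1.0):
    """Express a power on a logarithmic (dB) scale relative to `floor`.

    The floor value is an assumption made for this sketch.
    """
    return 10.0 * math.log10(max(power, floor) / floor)

def inner(u, v):
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

# Hypothetical component powers: one loud interferer (component 0) and a
# weak attended source (components 2 and 5), 30 dB below the interferer.
powers = [1e6, 1.0, 1e3, 1.0, 1.0, 1e3]
s_linear = powers
s_log = [to_db(p) for p in powers]

# Unit vector for a symbol representing the attended source's sound.
w = 1.0 / math.sqrt(2.0)
symbol = [0.0, 0.0, w, 0.0, 0.0, w]

# Symbol excitation relative to the largest single component of S.
relative_linear = inner(symbol, s_linear) / max(s_linear)
relative_log = inner(symbol, s_log) / max(s_log)
```

With linear amplitudes the symbol's excitation is a fraction of a percent of the interferer's component. On the logarithmic scale it is of the same order, so the weak source still excites its symbol substantially.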

During the entire time of the word detection processes, all of the lexicons of the Figure 3.7

architecture are operated in a consensus building mode. Thus, as soon as the S-input excitations are

established on the expectation element symbols of the first primary sound lexicon, only those

symbols which received these excitations remain in the expectation (the consensus building is run

faster on the primary sound lexicons, somewhat slower on the sound phrase lexicons, and even

slower on the next-word acoustic lexicon). This process of expectation refinement that occurs

during consensus building is termed honing.
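
One simplified way to picture honing is as a filter over the current expectation. The sketch below is an assumption-laden illustration, not the consensus building procedure itself: the `hone` function, the fixed threshold, and the sample values are all hypothetical, and the actual process runs over time rather than in a single pass.

```python
def hone(expectation, excitation, threshold=1.0):
    """Keep only expected symbols whose input excitation exceeds `threshold`.

    Symbols outside the expectation are ignored even if they are excited.
    The fixed threshold is an assumption made for this sketch.
    """
    return {sym for sym in expectation if excitation.get(sym, 0.0) > threshold}

# Hypothetical expectation and S-input excitations for one primary sound lexicon.
expectation = {"ah", "ee", "oo"}
excitation = {"ah": 4.2, "ee": 0.35, "zz": 5.0}

honed = hone(expectation, excitation)
```

Here only "ah" survives: "ee" is too weakly excited, "oo" received no excitation, and "zz" was never part of the expectation.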

After acoustic input has arrived at each subsequent primary sound lexicon (the pace of

the switching is set by a separate part of the auditory system, not discussed further

here, that synchronizes the pace of S vector formation, which is not always exactly every 10 ms,

to the recent pace of speech production of the attended speaker), that lexicon's expectation

is thereby honed and this revised expectation is then automatically transferred to all of the

sound phrase regions that are not on its right (during consensus building, all of the involved

knowledge bases remain operational). This has the effect of honing some of the sound phrase

lexicon expectations, which then are transferred to the next-word acoustic lexicon, honing its

expectation.
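
The upward transfer just described can be sketched as a vote over knowledge links. Everything named here is hypothetical: the `links` table stands in for the knowledge bases, the symbols are invented, and counting one vote per primary sound lexicon with a `min_support` cutoff is a simplification chosen for this sketch rather than the mechanism the text defines.

```python
from collections import Counter

def transfer_up(honed_sound_expectations, links, phrase_expectation, min_support):
    """Hone a sound phrase expectation from honed primary sound expectations.

    `links` maps a primary sound symbol to the phrase symbols it is linked to
    (a hypothetical stand-in for the knowledge bases).
    """
    support = Counter()
    for sound_expectation in honed_sound_expectations:
        supported = set()
        for sound in sound_expectation:
            supported.update(links.get(sound, ()))
        # Each primary sound lexicon contributes at most one vote per phrase symbol.
        for phrase in supported & phrase_expectation:
            support[phrase] += 1
    return {p for p in phrase_expectation if support[p] >= min_support}

# Hypothetical knowledge links from sound symbols to phrase symbols.
links = {"k": {"ka", "ki"}, "a": {"ka", "ta"}, "t": {"ta"}}

# Honed expectations of two successive primary sound lexicons.
honed_sounds = [{"k"}, {"a"}]
phrase_expectation = {"ka", "ki", "ta"}

honed_phrases = transfer_up(honed_sounds, links, phrase_expectation, min_support=2)
```

Only the phrase symbol supported by both primary sound lexicons ("ka") remains in the honed phrase expectation.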

This process works in reverse also. As higher-level lexicon expectations are honed, these are

transferred to lower levels, thereby refining those lower-level expectations. Note that if occasional

erroneous symbols are transferred up to the sound phrase lexicons, or even from the phrase lexicons

to the next-word acoustic lexicon, this will not have much effect. That is because the process of

consensus building effectively "integrates" the impact of all of the incoming transfers on the

symbols of the original expectation. Only when a phrase region has honed its symbol list down to
