Biomedical Engineering Reference
6.9.5 Spectral Maxima Strategies
Some other strategies have also produced good results. These differ from the CIS
method in that the channel outputs are scanned and only the m channels with the largest
envelope signals are delivered to the corresponding subset of the n available electrodes.
They are known as m-of-n, advanced combination encoder (ACE), or SPEAK strategies. This
peak-picking process is designed to reduce the density of stimulation while still
representing the important aspects
of the acoustic signal. In addition, it is believed that this strategy reduces background noise
while maintaining speech levels and therefore improves the overall signal-to-noise ratio
of the perceived signal.
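The peak-picking step described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the algorithm of any particular device: the function name select_maxima, the parameter num_selected, and the example envelope values are all hypothetical.

```python
def select_maxima(envelopes, num_selected):
    """Return the indices of the channels with the largest envelopes.

    envelopes    -- one envelope amplitude per analysis channel (hypothetical units)
    num_selected -- how many channels to stimulate in this scan
    The result is in channel order, so it can be mapped directly onto electrodes.
    """
    if not 0 <= num_selected <= len(envelopes):
        raise ValueError("num_selected must be between 0 and the channel count")
    # Rank channel indices by envelope amplitude, largest first.
    ranked = sorted(range(len(envelopes)),
                    key=lambda ch: envelopes[ch],
                    reverse=True)
    # Keep the largest ones and restore channel order.
    return sorted(ranked[:num_selected])


# Illustrative scan of six channels, keeping the three largest.
scan = [0.10, 0.90, 0.30, 0.70, 0.05, 0.60]
print(select_maxima(scan, 3))
```

Channels that are not selected receive no stimulation in that scan, which is how this kind of strategy lowers the stimulation density.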
In the Nucleus-24 device, the m-of-n strategy selects the 10 to 12 largest amplitudes
out of a total of 20 channels. In the ACE (and SPEAK) strategies, a threshold determines
how many channels are stimulated. This is typically between 5 and 10 depending on
the spectral content of the input signal. Tests have shown that even selecting as few as
three channels still achieves a 90% correct level of speech understanding for all stimulus
material including sentences, vowels, and consonants. In contrast, the classical CIS process
required eight channels to achieve the same level of understanding for consonants and four
channels for normal speech (Møller, 2006).
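A threshold-driven selection, in which the number of stimulated channels varies with the spectral content of the input, might look like the following sketch. The threshold value, the max_channels cap, and the function name are assumptions made for illustration; the text does not specify how ACE or SPEAK implement this step.

```python
def select_above_threshold(envelopes, threshold, max_channels=10):
    """Return indices of channels whose envelope reaches the threshold.

    The count of selected channels depends on the input spectrum.
    max_channels is a hypothetical upper limit; if more channels qualify,
    only the largest ones are kept. The result is in channel order.
    """
    candidates = [ch for ch, env in enumerate(envelopes) if env >= threshold]
    # If too many channels qualify, keep the ones with the largest envelopes.
    candidates.sort(key=lambda ch: envelopes[ch], reverse=True)
    return sorted(candidates[:max_channels])


# Illustrative scan: four channels reach a threshold of 0.4.
scan = [0.2, 0.8, 0.5, 0.1, 0.9, 0.4]
print(select_above_threshold(scan, threshold=0.4))
print(select_above_threshold(scan, threshold=0.4, max_channels=2))
```

With a spectrally sparse input fewer channels reach the threshold, so fewer electrodes are stimulated in that scan.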
6.9.6 Strategies to Enhance Vocal Pitch
The complexity of the sensations evoked by even the most localized stimulation of the
basilar membrane has surprised and intrigued many investigators. The place-pitch theory
in its simplest form predicts that local activation of an area should elicit a fairly pure
tone corresponding to the local resonant frequency of the basilar membrane. Changes
in frequency of the stimulus should result only in changes in perceived amplitude as
they affect the average rate of firing of the local neurons. In practice, however, spectrally
complex sensations such as buzzes and clangs arise, and researchers are interested in using
this information to improve understanding.
The strategies described in the previous section are designed to convey speech
information but fail to convey vocal pitch (F0) information effectively. This means that speakers
of tonal languages such as Cantonese and Mandarin find the processing inadequate.
Pitch information can be conveyed by both temporal and spectral (place) cues.
Temporal cues are present in the envelope modulations of the filtered waveforms, and
pitch can also be elicited by varying the stimulation rate of the pulse train output on a
single electrode. However, once the rate exceeds about 300 Hz, patients are unable to use
this information directly because it exceeds the maximum firing rate of the stimulated neurons.
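To illustrate how F0 can appear as a temporal cue in a channel envelope, the sketch below builds an amplitude-modulated tone, extracts its envelope, and estimates the modulation frequency. The rectify-and-smooth envelope detector, the autocorrelation estimator, and all names and parameter values are assumptions chosen for illustration; they are not taken from any implant processor.

```python
import math


def am_tone(f0, carrier_hz, fs, duration_s, depth=0.8):
    """Carrier tone whose amplitude is modulated at f0 (a stand-in for one filter output)."""
    count = int(fs * duration_s)
    return [(1.0 + depth * math.cos(2.0 * math.pi * f0 * i / fs))
            * math.sin(2.0 * math.pi * carrier_hz * i / fs)
            for i in range(count)]


def envelope(signal, fs, cutoff_hz=400.0):
    """Full-wave rectify, then smooth with a one-pole low-pass filter."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / fs)
    smoothed, state = [], 0.0
    for sample in signal:
        state += alpha * (abs(sample) - state)
        smoothed.append(state)
    return smoothed


def estimate_f0(env, fs, f_min=80.0, f_max=400.0):
    """Pick the autocorrelation peak of the mean-removed envelope in the F0 range."""
    mean = sum(env) / len(env)
    centred = [value - mean for value in env]
    best_lag, best_value = None, float("-inf")
    for lag in range(int(fs / f_max), int(fs / f_min) + 1):
        value = sum(centred[i] * centred[i + lag]
                    for i in range(len(centred) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return fs / best_lag


fs = 16000
tone = am_tone(f0=160.0, carrier_hz=2000.0, fs=fs, duration_s=0.1)
print(estimate_f0(envelope(tone, fs), fs))  # expected to be close to 160 Hz
```

The carrier frequency stands for the place of stimulation, while the envelope periodicity carries the voice pitch.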
As part of a project to improve music appreciation among cochlear implant users,
research was undertaken to determine the relationship between rate and place for pitch
perception (Fearn, Carter et al., 1999). Their results, reproduced in Figure 6-40, showed
that pitch was strongly dependent on both rate and place at low frequencies, but that at
rates above a few hundred pps the stimulation rate had very little effect and pitch perception
was determined solely by the distance of the electrode from the round window.
Little progress has been made in improving music perception since then, so researchers
at the Bionic Ear Institute in Melbourne have started working with musicians to compose
pieces specifically for cochlear implant users (Hagan, 2010).
It has long been known that auditory nerve activity transmitted to the brain contains
detailed information about the exact phase of the motion of the basilar membrane. For