response to phonemes. This is what accounts for humans' ability to hear their own
language pronounced in a previously-unencountered accent, and still to understand
it. Underlying the symbolic representation of phonemes as shown in Figs. 7.2 and 7.3
is a conceptual space akin to the vowel space of Fairbanks and Grubb [12]. Regions
in this space correspond with the sonic components of language as learned by the
system, so a word represented as a sequence of phonemes, as in the figures, is in
bijection with a trajectory through this space. In humans, physical constraints on
the vocal apparatus restrict how closely pronunciation can follow such a theoretical
trajectory.
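The correspondence between a phoneme sequence and a trajectory can be sketched as follows. This is a minimal illustration under stated assumptions, not the representation used by IDyOT: the two-dimensional space, the region centres in `REGION_CENTRES`, the ASCII phoneme labels, and the helper names `ideal_trajectory` and `mean_deviation` are all hypothetical, and the coordinates are invented toy values rather than measurements.

```python
import math

# Hypothetical region centres in a toy 2-D conceptual space.
# The coordinates are invented for illustration; they are not acoustic data.
REGION_CENTRES = {
    "b": (0.0, 1.0),
    "i": (1.0, 0.0),
    "tS": (2.0, 1.0),  # ASCII stand-in for the final affricate in "beach"
}


def ideal_trajectory(phonemes):
    """Map a phoneme sequence to the sequence of its region centres."""
    return [REGION_CENTRES[p] for p in phonemes]


def mean_deviation(ideal, produced):
    """Average Euclidean distance between matching points of two trajectories."""
    if len(ideal) != len(produced):
        raise ValueError("trajectories must have the same number of points")
    if not ideal:
        raise ValueError("trajectories must not be empty")
    return sum(math.dist(a, b) for a, b in zip(ideal, produced)) / len(ideal)


ideal = ideal_trajectory(["b", "i", "tS"])
# A produced utterance that misses every target by the same small offset,
# standing in for the physical limits of the vocal apparatus.
produced = [(x + 0.3, y + 0.4) for x, y in ideal]
deviation = mean_deviation(ideal, produced)
```

As long as the centres are distinct, each phoneme sequence maps to exactly one list of points and can be recovered from it. The deviation measures how far a produced utterance strays from that theoretical path.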
An important aspect of human speech understanding is its robustness to
mispronunciation and to accents. IDyOT addresses this ability in its matching stage, where
predictions are matched against perceived input. The match is statistical, based on the
statistical distribution over the possible symbols, but also geometrical, based on the
conceptual space. This approach is important not just because it affords robustness
in understanding natural language, given inter-individual variations, but because it
also allows flexibility of understanding, in a human-like way, based on statistical
priming.
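One way to picture a match that is both statistical and geometrical is to weight each symbol's predicted probability by how close the perceived input lies to that symbol's region. The sketch below is an assumption made for illustration, not the matching rule defined by IDyOT: the Gaussian closeness term, the tolerance `sigma`, the one-dimensional "voicing" axis, and the function name `match_scores` are all hypothetical.

```python
import math


def match_scores(prior, centroids, observed, sigma=1.0):
    """Combine a symbolic prediction with geometric closeness.

    prior:     dict symbol -> predicted probability (the statistical part)
    centroids: dict symbol -> coordinates of a hypothetical region centre
    observed:  coordinates of the perceived input
    sigma:     assumed tolerance; larger values forgive larger deviations
    """
    scores = {}
    for symbol, p in prior.items():
        d2 = sum((o - c) ** 2 for o, c in zip(observed, centroids[symbol]))
        # Gaussian closeness: an illustrative choice, not taken from the source.
        scores[symbol] = p * math.exp(-d2 / (2.0 * sigma ** 2))
    total = sum(scores.values())
    if total == 0.0:
        raise ValueError("no symbol received a non-zero score")
    return {s: v / total for s, v in scores.items()}


# Toy one-dimensional voicing axis: 0.0 = voiceless, 1.0 = fully voiced.
centroids = {"p": (0.0,), "b": (1.0,)}
observed = (0.5,)  # deliberately ambiguous input, midway between the two

# The same input scored under two different primings (predicted distributions).
primed_for_b = match_scores({"p": 0.2, "b": 0.8}, centroids, observed, sigma=0.5)
primed_for_p = match_scores({"p": 0.8, "b": 0.2}, centroids, observed, sigma=0.5)
```

Because the ambiguous input is equidistant from both centres, the geometry is neutral and the outcome follows the prediction, which is the priming effect in miniature. An input nearer one centre shifts the result towards that symbol even under an even prior.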
A well-known trap set by linguists, at the phonetic level, is the sentence
It's easy to wreck a nice beach.
which is readily mistaken, given an appropriate context, for the sentence
It's easy to recognise speech.
Delivered with a somewhat loose pronunciation and coupled with appropriate
verbal or visual priming, this serves as a useful demonstration to students of how much
prior information is used in understanding language. In IPA, these two sentences are
respectively denoted
In fact, in common parlance, the /g/ is usually soft, and often omitted altogether;
so then the difference comes down only to the amount of voice in /z s/, which are
elided to /zs/, against /s/, and /p/ against /b/ respectively. In Fig. 7.4, we illustrate
this process.
For the purposes of the current paper, this demonstration makes two points: first,
the multi-layered approach of IDyOT affords human-like behaviours as emergent
properties, without extra mechanism; and second, the conceptual layer, which affords
the flexibility of approximate matching in a principled way, is as important as the
symbolic one in driving the system. One might ask, therefore, why not conflate the
two and do all the inference in a continuous probability space? From our perspective,
the answer is methodological: we believe that the neural representation that we are
modelling is indeed on one high-dimensional level. However, current technology is
not adequate to model that representation, so working at multiple levels, in parallel,
gives us a way of working towards a solution, and, crucially, identifying where in the
structured representation and its associated operations any shortcomings are located.