begin (for language) with the level of phonemes. Our claim is that phonemes are
remembered as a compact categorical representation whose symbols correspond
to regions in a multi-dimensional continuous space [17], in accordance with our prin-
ciple of information-theoretic efficiency. Conversely, the detail of a particular speech
signal is not remembered, once it has passed beyond echoic memory.
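One way to picture "symbols corresponding to regions in a continuous space" is nearest-prototype categorisation: each category label owns the region of space closest to a prototype point, and only the label is remembered. The sketch below is illustrative only; the prototype vectors and the categorisation rule are assumptions for the example, not IDyOT's actual representation.

```python
import math

def categorise(point, prototypes):
    """Map a continuous point to the label of its nearest prototype.

    `prototypes` is a hypothetical map from category labels to vectors;
    each label thus owns a region (a Voronoi cell) of the space. Only the
    returned label, not the point itself, need be stored in memory.
    """
    def dist(p, q):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

    return min(prototypes, key=lambda label: dist(point, prototypes[label]))
```

Under this scheme, two acoustically different tokens of the same phoneme collapse onto one symbol, which is what makes the representation compact.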
The aim of the approach, then, is to perform human-like learning, beginning (in
language) from the phonemes, and passing through the Global Workspace cycle
described above. Initially, while memory is empty, the model is unable to pre-
dict: everything has high information content, and so individual symbols, and then
pairs, begin to appear in memory. Earlier work with IDyOM, IDyOT's ancestor,
demonstrated that such chunking can successfully be performed statistically over a
fixed corpus, with a descriptive model of segmentation [54]. In the more dynamic
IDyOT, although the initial chunks are somewhat chaotic, they are held together
in larger sequences, because each chunk is given its own label, and the labels form
their own sequence in memory. Thus, when a symbol is encountered for the second
time, IDyOT is able to predict not only from the phoneme level, but also from the
sequences of chunks that it has constructed, and so hone its predictions. Because of
the positive feedback induced by the IDyOT processing loop, we expect the memory
to stabilise,8 and, once it has stabilised, the erroneous details that it inferred early in
the learning process, and therefore in the absence of a model, will fade into statistical
obscurity. Thus, we claim, IDyOT constitutes an explanatory model of the chunking
behaviour that IDyOM's segmentation described.
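The flavour of this information-theoretic chunking can be sketched as follows. This is not IDyOM or IDyOT (whose models are variable-order and multi-featured); it is a minimal stand-in that builds a smoothed bigram model over a symbol sequence and places chunk boundaries at local peaks of information content, where the next symbol is least predictable.

```python
import math
from collections import defaultdict

def segment(symbols):
    """Chunk a symbol sequence at local peaks of information content.

    Illustrative sketch: a bigram model with add-one smoothing is built
    over the whole sequence, then the information content of each symbol,
    -log2 P(next | prev), is computed; a new chunk starts wherever that
    value peaks relative to its neighbours.
    """
    bigrams = defaultdict(lambda: defaultdict(int))
    for a, b in zip(symbols, symbols[1:]):
        bigrams[a][b] += 1
    alphabet = set(symbols)

    def info_content(a, b):
        total = sum(bigrams[a].values()) + len(alphabet)
        return -math.log2((bigrams[a][b] + 1) / total)

    ics = [info_content(a, b) for a, b in zip(symbols, symbols[1:])]

    chunks, current = [], [symbols[0]]
    for i, sym in enumerate(symbols[1:]):
        prev_ic = ics[i - 1] if i > 0 else 0.0
        # A local peak marks unpredictability: start a new chunk here.
        if ics[i] > prev_ic and (i + 1 >= len(ics) or ics[i] > ics[i + 1]):
            chunks.append(current)
            current = []
        current.append(sym)
    chunks.append(current)
    return chunks
```

Because the model here is fixed after one pass over the data, this corresponds to the descriptive, corpus-based setting of IDyOM rather than IDyOT's incremental loop, in which the model and the chunks co-evolve.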
Theoretically, this layering of models can proceed up to a level beyond that of
sentences, but it is not restricted to the sentential forms of Chomskian linguistics: the
grouping is motivated by statistical structure, and not by semantic connection. The
approach gives rise to predictive behaviour very like that of Cohort Theory [28], in
which words are identified incrementally as their phonemes appear. IDyOT assembles a
network of probabilistic predictions, as shown in Fig. 7.2a.
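The incremental narrowing that Cohort Theory describes can be sketched directly. The lexicon below is a hypothetical toy, not a claim about any real phoneme inventory; the point is only that each incoming phoneme shrinks the set of word candidates consistent with the prefix heard so far.

```python
def cohort_states(lexicon, phonemes):
    """Track the shrinking cohort of word candidates, Cohort-Theory style.

    `lexicon` maps words to their phoneme sequences (a toy assumption for
    this sketch). After each incoming phoneme, the cohort is every word
    whose phoneme sequence begins with the prefix heard so far.
    """
    states = []
    for n in range(1, len(phonemes) + 1):
        prefix = tuple(phonemes[:n])
        states.append({w for w, ps in lexicon.items()
                       if tuple(ps[:n]) == prefix})
    return states
```

A word is identified at the first point where its cohort collapses to a single member, often well before its final phoneme arrives.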
Many readers will recognise this as (something like) a Bayesian network, which accords
with evidence that such networks can predict aspects of human parsing behaviour [32].
It is important to note that IDyOT's network is in a sense stratified: there are distinct
layers, and each layer predicts only into the layers immediately above and below it,
and only one symbol forwards on its own layer. This allows us, in principle, to control
the computational cost of prediction, though the problem of prediction from Bayesian
networks in general remains NP-hard.9
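A stratified, one-symbol-forward scheme of this kind can be sketched with two simple layers. Everything below is an assumption-laden toy, not IDyOT's actual inference: each stratum is a first-order model over its own alphabet, and the chunk layer's one-step-forward prediction lends weight to the phoneme layer's prediction via a hypothetical `chunk_lexicon` mapping chunk labels to their phoneme spellings.

```python
from collections import defaultdict

class Layer:
    """One stratum: a first-order statistical model over its own alphabet."""
    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))

    def observe(self, prev, nxt):
        self.counts[prev][nxt] += 1

    def predict(self, prev):
        nxts = self.counts[prev]
        total = sum(nxts.values())
        return {s: c / total for s, c in nxts.items()} if total else {}

def stratified_predict(phoneme_layer, chunk_layer, chunk_lexicon,
                       last_phoneme, last_chunk):
    """Combine one-symbol-forward predictions from two strata (a sketch).

    The chunk layer predicts the next chunk label; each label's first
    phoneme (looked up in the hypothetical `chunk_lexicon`) adds weight
    to the phoneme layer's own prediction. The result is renormalised.
    """
    combined = dict(phoneme_layer.predict(last_phoneme))
    for label, p_chunk in chunk_layer.predict(last_chunk).items():
        first = chunk_lexicon[label][0]
        combined[first] = combined.get(first, 0.0) + p_chunk
    total = sum(combined.values()) or 1.0
    return {s: p / total for s, p in combined.items()}
```

Because each layer only ever looks one symbol forwards on its own level and one level up or down, the work per prediction stays bounded, which is the cost-control property noted above.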
However, in general, the stimuli to which IDyOT will respond will not be
sequences of atomic percepts. Speech, our illustration here, includes pitch, stress and
volume information, all of which will be encoded in memory as structured, multi-
dimensional symbols, and used for prediction, as has been demonstrated in IDyOM
[39]. This demands a more powerful model than is common in cognitive models
of language. For music, multi-dimensionality is a sine qua non, and for this reason,
Conklin and Witten [8] proposed an approach based on viewpoints that allows a set of
8 It will stabilise when it has produced an efficient model of the data that it is being exposed to.
9 That is to say, it is believed that it cannot be computed in polynomial time on a conventional (von Neumann) machine.