recognition 26 as part of the artificial auto-channel. The reason for the 20-year
stagnation in this area is a search space too large for the statistical approach
used today. The gigantic size of the search space results from the large number
of possible word forms in a natural language, multiplied by an even larger
number of possible syntactic combinations and by variations of pronunciation
between different speakers.
The best way to reduce this search space is with hypotheses about possible
continuations, computed by a time-linear grammar algorithm, and with expectations
based on experience and general world knowledge. After all, this is
also the method humans use to disambiguate speech in noisy environments. 27
Using this method for artificial speech recognition requires a theory
of how communication with natural language works.
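The pruning idea can be illustrated with a small sketch. The continuation table, lexicon entries, and acoustic scores below are hypothetical toy data, not the DBS implementation: the point is only that a grammar's hypotheses about licensed continuations cut down the recognizer's candidate set.

```python
# Sketch: pruning a speech recognizer's candidates with time-linear
# continuation hypotheses. All entries below are illustrative.

# Candidate word forms proposed by a (hypothetical) acoustic front end,
# with confidence scores.
candidates = {"learns": 0.41, "lands": 0.38, "learned": 0.35, "blue": 0.30}

# Toy continuation table: for the category of the word just parsed,
# the categories that may follow in a time-linear derivation.
continuations = {
    "noun-sg": {"verb-3sg", "verb-past"},  # e.g., "Julia learns/learned ..."
}

# Toy lexicon mapping surfaces to categories.
lexicon = {
    "learns": "verb-3sg",
    "learned": "verb-past",
    "lands": "verb-3sg",
    "blue": "adj",
}

def prune(prev_category, candidates):
    """Keep only candidates whose category is a licensed continuation."""
    allowed = continuations.get(prev_category, set())
    return {w: s for w, s in candidates.items() if lexicon.get(w) in allowed}

hypotheses = prune("noun-sg", candidates)
# "blue" is eliminated: an adjective is not a licensed continuation here,
# so the search space shrinks before any further acoustic scoring.
```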
2.5 Automatic Word Form Recognition
Assuming that the language input to an artificial cognitive agent consists of
word form surfaces provided by the service channel as sequences of letters,
the first step of any rule-based (i.e., nonstatistical) reconstruction of natural
language understanding 28 is building a system of automatic word form
recognition. This is necessary because to the computer a word form surface
like learns is merely a sequence of letters coded in seven-bit ASCII, no
different from the inverted letter sequence snrael, for example.
Automatic word form recognition takes an unanalyzed surface, e.g., a letter
sequence like learns, as input and provides the computer with the information
needed for syntactic-semantic processing. For this, any system of automatic
word form recognition must provide (i) categorization and (ii) lemmatization.
Categorization specifies the grammatical properties, which in the case
of learns would be something like "verb, third person singular, present tense."
Lemmatization specifies the base form, here learn, which is used to look
up the meaning common to all the word forms of the paradigm, i.e., learn,
learns, learned, and learning. 29
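A minimal sketch of these two tasks, using a hypothetical toy lexicon (the category labels and entries are illustrative, not a real morphological analyzer):

```python
# Toy word form lexicon: each surface is mapped to (i) its grammatical
# category (categorization) and (ii) its base form (lemmatization).
WORD_FORMS = {
    "learn":    {"lemma": "learn", "category": "verb, non-3rd-sg, present"},
    "learns":   {"lemma": "learn", "category": "verb, 3rd-sg, present"},
    "learned":  {"lemma": "learn", "category": "verb, past"},
    "learning": {"lemma": "learn", "category": "verb, progressive"},
}

def recognize(surface):
    """Return categorization and lemmatization for a surface, or None."""
    # To the computer, an unlisted surface such as "snrael" is just a
    # letter sequence with no grammatical information attached.
    return WORD_FORMS.get(surface)

print(recognize("learns"))
# {'lemma': 'learn', 'category': 'verb, 3rd-sg, present'}
print(recognize("snrael"))
# None
```

All four paradigm members share the lemma learn, so a single meaning entry in the lexicon can serve every inflected surface.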
26 Contextual cognition, such as nonlanguage vision and audio, may also benefit from the service chan-
nel. By building a context component with a data structure, an algorithm, and a database schema via
direct access, the robot's recognition and action components are provided with the structures to map
into and out of.
27 Jurafsky and Martin (2009, pp. 16-17) sketch an approach which could be construed as being similar
to DBS. However, their system is not agent-oriented, nor surface compositional, nor time-linear.
28 We begin with the hear mode as a means to get content into the computational reconstruction of
central cognition. The availability of such content is a precondition for implementing the speak mode.
29 For further information on the morphological analysis of word forms and different methods of auto-
matic word form recognition, see FoCL'99, Chaps. 13-15.