is a chance that, from the moment when morphological rules are correctly
defined, there is no learning to be done on this aspect again. Beyond the
specific case of online learning of linguistic abilities, most of the systems in
question learn implicitly. For example, if a user shows negative behavior
that expresses displeasure with the machine, the machine should remember the
conditions that triggered this behavior so as to avoid triggering it again. As negative
behavior recurs in similar situations, the machine will reinforce
its caution with regard to those conditions. This is the principle of
reinforcement learning, a type of learning based on experience, and is one of
the challenges currently explored by MMD: learning abilities make it
possible to avoid difficulties such as anticipating and programming
every possible situation. Taking this principle to the extreme, we reach
supervised learning, that is, learning controlled by an operator who
answers the questions asked by the system and allows it to learn
reliably. Such a procedure takes place before the real use of the system, during
the design phase. This is prior learning, which can be carried out through a
dialogue similar to that planned for the user or, more directly, by the operator
intervening in each module in question. The data can then be provided step by
step, as in a dialogue, or all at once, by using a learning corpus, in which case
we speak of batch learning. In either case, the goal is to train the system to refine
its behavior, or simply to optimize its performance (without learning
anything new), before it is made public.
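To make the idea concrete, here is a minimal sketch in Python of a machine that reinforces its caution toward conditions that triggered negative user behavior. It is a deliberately simplified stand-in for reinforcement learning (a running average, with no notion of policy or long-term reward), and the names CautionLearner, update, fit_batch and should_avoid, as well as the learning rate and threshold, are assumptions introduced for illustration only. The same update rule serves both for step-by-step learning and for batch learning from a corpus.

```python
from collections import defaultdict


class CautionLearner:
    """Hypothetical learner: tracks how strongly a condition is associated
    with negative user behavior (0.0 = no caution, 1.0 = maximal caution)."""

    def __init__(self, learning_rate=0.5):
        self.learning_rate = learning_rate  # assumed value, not from the source
        self.caution = defaultdict(float)   # condition -> caution level

    def update(self, condition, negative):
        """Step-by-step learning: one observation from one dialogue turn."""
        target = 1.0 if negative else 0.0
        old = self.caution[condition]
        # Move the caution level part of the way toward the observed outcome.
        self.caution[condition] = old + self.learning_rate * (target - old)

    def fit_batch(self, corpus):
        """Batch learning: the whole learning corpus is provided at once."""
        for condition, negative in corpus:
            self.update(condition, negative)

    def should_avoid(self, condition, threshold=0.7):
        """True once caution toward this condition has been confirmed."""
        return self.caution.get(condition, 0.0) >= threshold


# Step-by-step use, as in a dialogue.
online = CautionLearner()
online.update("interrupts_user", negative=True)
online.update("interrupts_user", negative=True)

# Batch use, from a (toy) learning corpus gathered beforehand.
batch = CautionLearner()
batch.fit_batch([("interrupts_user", True), ("interrupts_user", True)])
```

With these assumed settings, a single negative reaction is not enough to cross the threshold; it is the repetition of the negative behavior in the same condition that confirms the caution.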
From a technical point of view, machine learning can take various forms
depending on the type of data and algorithms involved (see Chapter 6 of
[GAR 11]). A historical distinction separates symbolic learning stemming
from AI, which is limited to symbolic or at least discretized data, and
numerical learning, linked to statistics. In the 1990s, with the rise of the
digital age that we mentioned in section 1.1.2, numerical learning took
precedence over symbolic learning by exploiting computing power and the
large size of corpora. Today, as I. Tellier writes in the preface to
[TEL 09], the work consists of first formulating the problem that requires learning,
either as a categorizing task or as a labeling (or annotating) task, and then
choosing a suitable technique, knowing that support vector machines
(SVMs) perform well for categorizing and that conditional random fields
(CRFs), along with hidden Markov models (HMMs), perform well for
labeling.
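As an illustration of labeling with an HMM, the following Python sketch decodes the most probable sequence of labels for a short utterance using the Viterbi algorithm, the standard decoding procedure for HMMs. The three labels and all the probabilities below are invented for the example; in a real system they would be estimated from an annotated learning corpus.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable state (label) sequence for the observations."""
    # best[t][s] = (probability of the best path ending in s at step t,
    #               previous state on that path)
    first = observations[0]
    best = [{s: (start_p[s] * emit_p[s].get(first, 0.0), None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            column[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s].get(obs, 0.0), p)
                for p in states
            )
        best.append(column)
    # Backtrack from the most probable final state.
    path = [max(states, key=lambda s: best[-1][s][0])]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path


# Toy model: every value here is an assumption made for illustration.
STATES = ["DET", "NOUN", "VERB"]
START_P = {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1}
TRANS_P = {
    "DET": {"DET": 0.05, "NOUN": 0.9, "VERB": 0.05},
    "NOUN": {"DET": 0.1, "NOUN": 0.2, "VERB": 0.7},
    "VERB": {"DET": 0.5, "NOUN": 0.4, "VERB": 0.1},
}
EMIT_P = {
    "DET": {"the": 0.9},
    "NOUN": {"machine": 0.6, "learns": 0.1},
    "VERB": {"learns": 0.7, "machine": 0.05},
}

labels = viterbi(["the", "machine", "learns"], STATES, START_P, TRANS_P, EMIT_P)
print(labels)  # ['DET', 'NOUN', 'VERB']
```

A CRF would be decoded in a similar way, but its scores come from weighted features of the whole input rather than from separate transition and emission probabilities.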
At the MMD level, we find all the approaches mentioned, and in a variety
of modules. Some aspects, such as automatic speech recognition, have long