with their environment) and hypothesizes complete grammars instantaneously
(this assumption is unrealistic).
The Query learning model proposed by D. Angluin also has some controversial
aspects from a linguistic point of view. For example, the learner is able to ask
the teacher if his hypothesis is correct (such a query would never be produced in
a real situation; a child would never ask an adult whether his grammar is the
correct one), and the learner learns the target language exactly (this is not
realistic, since everybody has imperfections in their linguistic competence).
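The two kinds of query mentioned above can be sketched as follows. This is only a minimal illustration, not Angluin's formulation: the `Teacher` class, its method names, the example target language, and the `max_len` bound are all hypothetical, and the equivalence check is approximated by comparing the hypothesis with the target on every string up to a fixed length.

```python
from itertools import product


class Teacher:
    """Hypothetical teacher that knows the target language (illustration only)."""

    def __init__(self, target, alphabet, max_len=6):
        self.target = target      # predicate str -> bool defining the target language
        self.alphabet = alphabet  # symbols the strings are built from
        self.max_len = max_len    # assumption: equivalence is only checked up to this length

    def membership_query(self, word):
        """Answer whether a single string belongs to the target language."""
        return self.target(word)

    def equivalence_query(self, hypothesis):
        """Return (True, None) if no disagreement is found, else (False, counterexample)."""
        for n in range(self.max_len + 1):
            for letters in product(self.alphabet, repeat=n):
                word = "".join(letters)
                if hypothesis(word) != self.target(word):
                    return False, word
        return True, None


# Assumed target language: strings over {a, b} with an even number of a's.
teacher = Teacher(lambda w: w.count("a") % 2 == 0, "ab")
print(teacher.membership_query("aab"))            # True
print(teacher.equivalence_query(lambda w: True))  # (False, 'a')
```

The equivalence query is the linguistically implausible part: it presupposes a teacher who can inspect the learner's whole hypothesis and return a counterexample.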
In the PAC learning model proposed by L. Valiant, the examples provided to
the learner have the same distribution throughout the process; this requirement
is too strong for practical applications.
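For reference, the usual textbook statement of the PAC criterion can be written as below. The notation is an assumption introduced here, not taken from the source: D is the fixed distribution over examples, c the target concept, S a sample of m examples drawn independently from D, h_S the hypothesis returned by the learner, and ε and δ the accuracy and confidence parameters.

```latex
\forall \varepsilon, \delta \in (0,1):\quad
\Pr_{S \sim D^{m}}\bigl[\operatorname{err}_{D}(h_S) \le \varepsilon\bigr] \ge 1 - \delta,
\qquad
\operatorname{err}_{D}(h) = \Pr_{x \sim D}\bigl[h(x) \ne c(x)\bigr]
```

The same D appears both in the sampling of S and in the definition of the error, which is where the requirement of an unchanging distribution enters.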
Therefore, none of these models accounts perfectly for natural language
acquisition. Research in GI has focused on the mathematical aspects of the
formal models proposed, without exploiting their linguistic relevance. A longer
discussion of these models can be found in [5].
2.2 Language Learning Problem
The problem of language learning concerns the acquisition of both the syntax
(i.e., rules for generating and recognizing correct sentences in the language) and
the semantics (i.e., the underlying meaning of each sentence) of a target language
[26]. However, GI studies have focused only on learning the syntax.
Semantics is not only one component of language learning, but also seems to
play an important role in the first stages of children's language acquisition (as
we will see in the next section). Therefore, it is also of great interest to study
this component. Unfortunately, these considerations have not been taken into
account in GI studies; the learning problem has been reduced to syntax learning,
and all semantic information has been omitted.
GI algorithms are based on the availability of different types of information:
positive examples, negative examples, the presence of a teacher able to answer
queries, etc. However, what kind of data is available to children? Ideally, to
better understand the process of natural language acquisition and to simulate it
correctly, we should provide our algorithm with the same kind of examples that
are available to children. However, some of the data used by GI algorithms are
controversial from a linguistic point of view. In the next section, we will discuss
some linguistic studies that try to answer this question.
In order to make the problem of language learning well defined, it is also
necessary to choose an appropriate class of grammars. The classes of regular and
context-free grammars are often used in GI to model the target grammar. These
two classes constitute the two lowest levels of the Chomsky hierarchy. Thus, the
following question arises: do they have enough expressive power to describe
natural languages? From a linguistic point of view, it is of great interest to study
classes of grammars that are able to generate the most relevant constructions
that appear in natural languages. However, this does not seem to be the case for
regular and context-free grammars. We will discuss the limitations of the Chomsky
hierarchy in the next section.
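To make the question of expressive power concrete, the sketch below checks membership in two standard formal languages. Both languages and both function names are illustrative assumptions and do not come from the source: a^n b^n is context-free but not regular, while a^n b^m c^n d^m, a pattern often used to model crossing dependencies, is not context-free.

```python
import re


def is_anbn(word):
    """Membership in a^n b^n (n >= 0): context-free but not regular."""
    n = len(word) // 2
    return word == "a" * n + "b" * n


def is_cross_serial(word):
    """Membership in a^n b^m c^n d^m: the a's pair with the c's, the b's with the d's."""
    match = re.fullmatch(r"(a*)(b*)(c*)(d*)", word)
    if match is None:
        return False
    a, b, c, d = (len(group) for group in match.groups())
    return a == c and b == d


print(is_anbn("aabb"))            # True
print(is_anbn("aab"))             # False
print(is_cross_serial("aabccd"))  # True
print(is_cross_serial("aabcd"))   # False
```

Testing membership is easy in both cases; the point is only that no regular grammar generates the first language and no context-free grammar generates the second.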
 