6.6.2 TURNING CONTEXT PROPLETS INTO LANGUAGE PROPLETS
proplet shell:

sur: α'+x
noun: α
cat: pn
sem: count pl
fnc:
mdr:
prn:

language proplets:

sur: dog+s
noun: dog
cat: pn
sem: count pl
fnc:
mdr:
prn:

sur: book+s
noun: book
cat: pn
sem: count pl
fnc:
mdr:
prn:

sur: child+ren
noun: child
cat: pn
sem: count pl
fnc:
mdr:
prn:

sur: apple+s
noun: apple
cat: pn
sem: count pl
fnc:
mdr:
prn:

Assuming that the context proplets in 6.6.1 have already been acquired, learning the associated language proplets involves only a single value, namely that of the sur attribute, again facilitating learning.
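
The step from a context proplet to a language proplet can be sketched in Python. This is only an illustration under stated assumptions: the Proplet class, its string-valued fields, and the helper to_language_proplet are hypothetical names chosen here, not part of the source. The sketch shows that the only value that changes is sur.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Proplet:
    """Hypothetical flat record with the seven attributes shown in 6.6.2."""
    sur: str = ""
    noun: str = ""
    cat: str = ""
    sem: str = ""
    fnc: str = ""
    mdr: str = ""
    prn: str = ""


def to_language_proplet(context: Proplet, surface: str) -> Proplet:
    """Return a copy of the context proplet in which only sur is filled in."""
    return replace(context, sur=surface)


# A context proplet has an empty sur value (assumption: empty string = no value).
context_dog = Proplet(noun="dog", cat="pn", sem="count pl")

# Learning the language proplet supplies exactly one value, the surface.
language_dog = to_language_proplet(context_dog, "dog+s")
```

All attributes other than sur are carried over unchanged, so the context proplet and the language proplet differ in a single value.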
Once the proplets have been acquired for one language, they may be reused for other languages, provided the lexicalization is similar.²⁵ The following example shows the proplets for the concept dog with English, French, German, and Italian surfaces:
6.6.3 TAKING SUR VALUES FROM DIFFERENT LANGUAGES
proplet shell:

sur: α'
noun: α
cat: sn
sem: count sg
fnc:
mdr:
prn:

language proplets:

sur: dog
noun: dog
cat: sn
sem: count sg
fnc:
mdr:
prn:

sur: chien
noun: dog
cat: sn
sem: count sg
fnc:
mdr:
prn:

sur: Hund
noun: dog
cat: sn
sem: count sg
fnc:
mdr:
prn:

sur: cane
noun: dog
cat: sn
sem: count sg
fnc:
mdr:
prn:

For syntactic-semantic parsing, the French, German, and Italian proplet versions will have to be complemented with the additional cat value m (for the grammatical gender masculine). This language-dependent information may be obtained from the traditional dictionaries for these languages. In addition, corpus-based information, such as domain-dependent frequency, LA-hear predecessors and successors ordered according to frequency (n-grams), semantic relations, etc., may be added to the owner proplets (Sect. 8.5).²⁶
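
The reuse of one proplet for several languages, together with the added gender value, might be sketched as follows. The dictionaries SURFACES and GENDER and the function language_proplets are hypothetical; GENDER merely stands in for a lookup in a traditional dictionary, and placing m after sn in the cat value is an assumption about ordering, not something the text specifies.

```python
# Surfaces for the concept dog, taken from 6.6.3.
SURFACES = {"English": "dog", "French": "chien", "German": "Hund", "Italian": "cane"}

# Hypothetical stand-in for dictionary lookup: (language, surface) -> gender.
GENDER = {("French", "chien"): "m", ("German", "Hund"): "m", ("Italian", "cane"): "m"}


def language_proplets(concept, surfaces, gender):
    """Build one proplet per language; only sur and (where known) cat differ."""
    result = {}
    for language, surface in surfaces.items():
        cat = ["sn"]
        # Add the gender value only if the dictionary supplies one.
        if (language, surface) in gender:
            cat.append(gender[(language, surface)])
        result[language] = {
            "sur": surface,
            "noun": concept,  # the core value stays the same across languages
            "cat": cat,
            "sem": ["count", "sg"],
            "fnc": "",
            "mdr": "",
            "prn": "",
        }
    return result


proplets = language_proplets("dog", SURFACES, GENDER)
```

In this sketch the English proplet keeps the cat value sn alone, while the French, German, and Italian proplets receive the additional value m.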
25 Cf. 3.6.1; other examples of different lexicalizations are (i) German Traumreise (literally dream journey), which has been translated into American English as dream vacation and into French as voyage des rêves, (ii) English horseshoe, which translates into German as Hufeisen (literally hoof iron) and into French as fer à cheval (literally iron for horse), and (iii) French ralenti, which translates into English as slow motion and into German as Zeitlupe.
26 Automatic word form recognition (based on a lexicon and rules) provides a more accurate frequency analysis of a corpus than, for example, part-of-speech tagging (based on statistical transition likelihoods from one word form to the next in a corpus). Unlike automatic word form recognition, part-of-speech tagging does not relate surface values such as learn, learns, learned, learning, and swim, swims, swam, swum, swimming, to their base forms (core values), i.e., learn and swim, respectively. Therefore, the rule-based and the statistical approach lead to substantially different frequency distribution results. For an evaluation of the CLAWS4 tagging analysis of the British National Corpus (BNC), see FoCL'99, Sect. 5.5.
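
The difference between counting surface values and counting base forms can be illustrated with a toy sketch. The table LEXICON, the two functions, and the token list are hypothetical; the table lookup merely stands in for rule-based word form recognition and does not model a statistical tagger or any particular system.

```python
from collections import Counter

# Hypothetical lexicon relating surface values to base forms (core values).
LEXICON = {
    "learn": "learn", "learns": "learn", "learned": "learn", "learning": "learn",
    "swim": "swim", "swims": "swim", "swam": "swim", "swum": "swim",
    "swimming": "swim",
}


def surface_frequencies(tokens):
    """Count each surface value separately, with no relation to a base form."""
    return Counter(tokens)


def base_form_frequencies(tokens, lexicon):
    """Count base forms; unknown surfaces are counted as they stand."""
    return Counter(lexicon.get(token, token) for token in tokens)


tokens = "learns swam learning swims learned swum".split()
by_surface = surface_frequencies(tokens)
by_base = base_form_frequencies(tokens, LEXICON)
```

For this token list, the surface count yields six distinct entries with frequency 1 each, whereas the base form count yields learn and swim with frequency 3 each.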