related to a knowledge task (like diagnosis) is represented by natural language, it
is reasonable to expect that some knowledge roles will map to some semantic roles.
The question is how to find these mappings and, more importantly, how to label
text with these roles.
A knowledge task like diagnosis or monitoring is not equivalent to a semantic
frame. Such tasks are more complex and abstract, and can usually be divided into
several components, which in turn can be regarded as equivalent to semantic frames.
By analyzing the textual episodes of diagnostic evaluations, we noticed that they
typically contain a list of observations, explanations based on evidence, and sug-
gestions to perform some activities. Thus, we consulted FrameNet for frames like
Observation, Change, Evidence, or Activity. Indeed, these frames are all present in
FrameNet. For example, Activity is present in 10 subframes, and different meanings
of Change are captured in 21 frames. The frame Evidence was shown in Figure 4.4,
and besides the two roles of Proposition and Support, it also has roles for Degree,
Depictive, Domain of Relevance, Manner, Means, and Result. When one carefully
reads the definition of the roles Proposition and Support and looks at the examples
(Figure 4.4), one can conclude that Proposition is similar to Cause and Support to
Symptom in a diagnosis task.
The problem is to determine which frames to look for, given that there are
currently more than six hundred frames in FrameNet. The key lies in the lexical units
related to each frame, usually verbs. Starting with the verbs, one gets to the frames
and then to the associated roles. This is also the approach we follow. We initially
look for the most frequent verbs in our corpus, and by consulting several sources
(since the verbs are in German), such as [15], VerbNet (footnote 3), and FrameNet,
we connect every verb with a frame, and try to map between the semantic roles in
a frame and the knowledge roles we are interested in. One could also use the
FrameNet roles directly, but they are linguistically biased, and as such are not
readily understandable by the domain users who will annotate training instances
for learning (a domain user would know directly how to annotate Cause, but finds
Proposition somewhat confusing).
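The verb-to-frame-to-role lookup described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `VERB_TO_FRAME`, `ROLE_MAP`, and `frames_for_frequent_verbs` are hypothetical, the lexicon is assumed to be built by hand from the consulted resources, and the input is assumed to be a list of already lemmatized verbs.

```python
from collections import Counter

# Hypothetical hand-built lexicon: German verb lemma -> frame it evokes.
# Only the two verbs named in the text are listed here.
VERB_TO_FRAME = {
    "zurückführen": "Evidence",
    "hindeuten": "Evidence",
}

# Mapping from frame roles to the knowledge roles shown to domain users,
# following the correspondence stated in the text for a diagnosis task.
ROLE_MAP = {
    "Evidence": {"Proposition": "Cause", "Support": "Symptom"},
}


def frames_for_frequent_verbs(verb_lemmas, top_n=10):
    """Return (verb, frequency, frame, role mapping) for the most frequent
    verbs that have an entry in the lexicon; unknown verbs are skipped."""
    counts = Counter(verb_lemmas)
    result = []
    for verb, freq in counts.most_common(top_n):
        frame = VERB_TO_FRAME.get(verb)
        if frame is not None:
            result.append((verb, freq, frame, ROLE_MAP.get(frame, {})))
    return result
```

Keeping the role mapping separate from the verb lexicon means annotators only ever see the knowledge-role names (Cause, Symptom), while the frame roles remain available for consultation.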
In this work, FrameNet was only used as a lexical resource for consultation, that
is, to find out which frames are evoked by certain lexical units, and what the related
semantic roles are. Since the language of our corpus is German, we cannot make any
statements about how useful the FrameNet frames could be to a learning system
based on English annotated data corresponding to the defined frames.
Finally, it should be discussed why such an approach to annotating text cases
with frames and roles could be beneficial to text mining. For the purpose of this
discussion, consider some facts from the introduced domain corpus. During the eval-
uation of the learning approach, we manually annotated a subcorpus of unique sen-
tences describing one specific measurement (high-voltage isolation current). In the
585 annotated sentences, the frame Evidence was found 152 times, 84 times evoked
by the verb zurückführen (trace back to), 40 times by the verb hindeuten (point to),
and 28 times by 9 other verbs. Analyzing the text annotated with the role Cause in
the sentences with zurückführen, 27 different phrases expressing causes of anomalies
pointed to by the symptoms were found. A few of these expressions appeared
frequently, some occasionally, and others rarely. In Table 4.1, some of these
expressions are shown.
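As a quick consistency check on the figures above, the per-verb counts for the frame Evidence can be tallied and turned into shares. The snippet uses only the numbers reported in the text; the variable names are illustrative.

```python
# Counts reported in the text for the frame Evidence in the annotated subcorpus.
evidence_by_verb = {
    "zurückführen": 84,
    "hindeuten": 40,
    "other verbs (9)": 28,
}

# The three groups should add up to the reported total of 152 occurrences.
total = sum(evidence_by_verb.values())
shares = {verb: count / total for verb, count in evidence_by_verb.items()}

for verb, share in shares.items():
    print(f"{verb}: {share:.1%}")
```

The two most frequent verbs together account for 124 of the 152 occurrences, which illustrates how a small number of lexical units can cover most evocations of a frame.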
3 http://www.cis.upenn.edu/~bsnyder3/cgi-bin/search.cgi