Mining Diagnostic Text Reports by Learning to Annotate Knowledge Roles - Natural Language Processing and Text Mining


...

<t lemma="Spannung-Steuerung" word="Spannungssteuerung" pos="NN" id="sentences._108_28" />
<t lemma="in" word="im" pos="APPRART" id="sentences._108_29" />
<t lemma="Wickel-Kopf-Bereich" word="Wickelkopfbereich" pos="NN" id="sentences._108_30" />

</terminals>

</nt>

...

Fig. 4.11. XML representation of a portion of the parse tree from Figure 4.10.
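As a minimal sketch of how such a representation can be consumed, the following Python fragment reads the terminal elements with the standard library. The enclosing `<terminals>` wrapper and the function name `read_terminals` are assumptions made for illustration, since the figure elides the opening elements; only the three `<t>` entries are taken from the figure.

```python
import xml.etree.ElementTree as ET

# The three <t> elements are copied from the figure. The opening <terminals>
# tag is an assumption: the figure only shows the closing tag.
FRAGMENT = """
<terminals>
  <t lemma="Spannung-Steuerung" word="Spannungssteuerung" pos="NN" id="sentences._108_28" />
  <t lemma="in" word="im" pos="APPRART" id="sentences._108_29" />
  <t lemma="Wickel-Kopf-Bereich" word="Wickelkopfbereich" pos="NN" id="sentences._108_30" />
</terminals>
"""


def read_terminals(xml_text):
    """Return one attribute dictionary per <t> element, in document order."""
    root = ET.fromstring(xml_text)
    return [dict(t.attrib) for t in root.iter("t")]


terminals = read_terminals(FRAGMENT)
for t in terminals:
    print(t["id"], t["word"], t["lemma"], t["pos"])
```

Each dictionary carries the word form, its lemma, and its part-of-speech tag, which are the raw values that word-level features such as the head word or the target POS draw on.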

Phrase type: NN
Grammatical function: NK
Terminal (is the constituent a terminal or non-terminal node?): 1
Path (path from the target verb to the constituent, denoting u (up) and d (down) for the direction): uSdPPd
Grammatical path (like Path, but instead of node labels, branch labels are considered): uHDdMOdNK
Path length (number of branches from target to constituent): 3
Partial path (path to the lowest common ancestor between target and constituent): uPPuS
Relative position (position of the constituent relative to the target): left
Parent phrase type (phrase type of the parent node of the constituent): PP
Target (lemma of the target word): hindeuten
Target POS (part-of-speech of the target): VVFIN
Passive (is the target verb passive or active?): 0
Preposition (the preposition if the constituent is a PP): none
Head word (for rules on head words refer to [5]): Spannung-Steuerung
Left sibling phrase type: ADJA
Left sibling lemma: kontinuierlich
Right sibling phrase type: none
Right sibling lemma: none
Firstword, Firstword POS, Lastword, Lastword POS: Spannung-Steuerung and NN (in this case, the constituent has only one word, thus these features get the same values; for non-terminal constituents like PP or NP, first word and last word will be different)
Frame (the frame evoked by the target verb): Evidence
Role (the class label that the classifier will learn to predict; it will be one of the roles related to the frame or none, for an example refer to Figure 4.12): none
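The grammatical path and path length features can be illustrated with a small sketch. The `Node` class, the function names, and the toy tree below are hypothetical: the tree is a simplified stand-in for the parse tree of Figure 4.10, built only so that the branch labels HD, MO, and NK appear where the feature values above suggest. The sketch walks from the target up to the lowest common ancestor and then down to the constituent, emitting `u` or `d` plus the label of each branch crossed.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # identity-based equality, so parent/child cycles are harmless
class Node:
    label: str                       # phrase type or POS tag of the node
    edge: str = ""                   # label of the branch to the parent
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        child.parent = self
        self.children.append(child)
        return child


def ancestors(node: Node) -> List[Node]:
    """Return the chain [node, parent, ..., root]."""
    chain = [node]
    while node.parent is not None:
        node = node.parent
        chain.append(node)
    return chain


def grammatical_path(target: Node, constituent: Node):
    """Return (path string over branch labels, number of branches)."""
    up = ancestors(target)
    down = ancestors(constituent)
    position = {id(n): j for j, n in enumerate(down)}
    for i, n in enumerate(up):       # first shared node is the lowest common ancestor
        if id(n) in position:
            j = position[id(n)]
            break
    else:
        raise ValueError("nodes do not belong to the same tree")
    steps = ["u" + n.edge for n in up[:i]]
    steps += ["d" + n.edge for n in reversed(down[:j])]
    return "".join(steps), i + j


# Toy tree (assumed structure, not the full tree of Figure 4.10).
s = Node("S")
verb = s.add(Node("VVFIN", edge="HD"))
pp = s.add(Node("PP", edge="MO"))
adj = pp.add(Node("ADJA", edge="NK"))
noun = pp.add(Node("NN", edge="NK"))

path, length = grammatical_path(verb, noun)
print(path, length)
```

On this toy tree the sketch yields the grammatical path and the path length listed above. The node-label variant (Path) follows the same traversal with node labels in place of branch labels.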

If a sentence has several clauses and the verb of each clause evokes a frame, the feature vectors are calculated separately for each evoked frame, and all of the resulting vectors are used in the learning.
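A minimal sketch of this per-frame construction follows. The function `build_instances`, the placeholder `extract` callback, and the second target/frame pair are hypothetical names introduced for illustration; the sketch also assumes that every constituent of the sentence is paired with every evoked frame.

```python
def build_instances(constituents, evoked_frames, extract):
    """Build one feature vector per (evoked frame, constituent) pair.

    constituents:  list of constituent identifiers of one sentence
    evoked_frames: list of (target lemma, frame name) pairs
    extract:       callback returning a feature dict for (target, constituent)
    """
    instances = []
    for target, frame in evoked_frames:
        for constituent in constituents:
            features = dict(extract(target, constituent))
            features["Target"] = target
            features["Frame"] = frame
            instances.append(features)
    return instances


def dummy_extract(target, constituent):
    # Placeholder for the real feature extraction described in the list above.
    return {"Constituent": constituent}


# "verb_2" and "Frame_2" are made-up placeholders for a second clause.
frames = [("hindeuten", "Evidence"), ("verb_2", "Frame_2")]
instances = build_instances(["c1", "c2", "c3"], frames, dummy_extract)
print(len(instances))
```

With two evoked frames and three constituents, the sketch produces six vectors, all of which would go into the training set.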

4.4.6 Annotation

To perform the manual annotation, we used the publicly available Salsa annotation tool [11]. The tool reads the XML representation of a parse tree and displays it as shown in Figure 4.12. The user can add frames and roles and attach them to a desired target verb. In the example of Figure 4.12 (the same sentence as in Figure 4.10), the target verb hindeuten (point to) evokes the frame Evidence, and three of its roles have been assigned to constituents of the tree. Such an assignment can be easily performed using point-and-click. After this
