Fig. 4.15. Parse tree of the sentences in Figure 4.14.
Step e) : The committee of classifiers consists of a maximum entropy (MaxEnt)
classifier from Mallet [19], a Winnow classifier from SNoW [2], and a memory-based
learner (MBL) from TiMBL [6]. For the MBL, we set k=5 as the number of nearest neighbours. The classification is performed as follows: if at least two classifiers agree on a label, that label is accepted. If there is disagreement, the cluster of labels from the five nearest neighbours is examined. If the cluster is not homogeneous (i.e., it contains different labels), the instance is included in the set of instances to be presented to the user for manual labeling.
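A minimal sketch of this decision logic is given below, assuming the three classifiers have already produced their labels. The function committee_decision, its arguments, and the fallback to the MBL label when the neighbour cluster is homogeneous (a case the description above leaves open) are illustrative assumptions, not part of the toolkits cited above.

from collections import Counter

def committee_decision(label_maxent, label_winnow, label_mbl, neighbour_labels):
    """Combine the three classifier outputs as described in step e).

    neighbour_labels holds the labels of the k=5 nearest neighbours found by the MBL.
    Returns (label, needs_manual_labeling).
    """
    votes = Counter([label_maxent, label_winnow, label_mbl])
    label, count = votes.most_common(1)[0]
    if count >= 2:
        # at least two classifiers agree on a label: accept it
        return label, False
    # complete disagreement: examine the cluster of the five nearest neighbours
    if len(set(neighbour_labels)) == 1:
        # homogeneous cluster: assume the MBL label can be accepted
        # (the text does not state this case explicitly)
        return neighbour_labels[0], False
    # heterogeneous cluster: present the instance to the user for manual labeling
    return None, True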
Step f) : If new sentences for manual annotation are selected based only on the output of the committee of classifiers, the risk of selecting outlier sentences is high [29]. Thus, from the set of instances flagged by the committee, we select those belonging to large clusters that have not been manually labeled yet.
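A sketch of this selection step follows, under the assumption that each instance can be mapped to a cluster identifier. The names select_for_annotation, cluster_of, labeled_clusters, and the min_cluster_size threshold are hypothetical and introduced here only for illustration.

from collections import defaultdict

def select_for_annotation(hard_instances, cluster_of, labeled_clusters, min_cluster_size=10):
    """Keep only those hard instances that lie in large clusters without manual labels.

    hard_instances: instances the committee could not label in step e).
    cluster_of: maps an instance to its cluster identifier.
    labeled_clusters: identifiers of clusters that already contain manually labeled data.
    """
    by_cluster = defaultdict(list)
    for instance in hard_instances:
        by_cluster[cluster_of(instance)].append(instance)
    selected = []
    for cluster_id, members in by_cluster.items():
        if cluster_id in labeled_clusters:
            continue  # this cluster is already covered by manual labels
        if len(members) >= min_cluster_size:
            # large clusters are unlikely to consist of outlier sentences
            selected.extend(members)
    return selected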
4.5 Evaluations
To evaluate this active learning approach on the task of annotating text with knowl-
edge roles, we performed a series of experiments that are described in the following.
As explained in Section 4.4.1, we created subcorpora, based on the XML structure of the documents, containing text that belongs to different types of diagnostic tests. After these subcorpora were processed into sentences, only unique sentences were retained for further processing (repetitive, standard sentences do not add any new information and only disturb the learning, so they were discarded). Then, lists of verbs were created and, by consulting the sources mentioned in Section 4.3.3, the verbs were grouped under one of the frames Observation, Evidence, Activity, and Change. Verbs that did not belong to any of these frames were not considered for role labeling.
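The preparation just described can be sketched as follows; the verb lexicon below is a toy stand-in for the lists compiled from the sources in Section 4.3.3, and main_verb is a hypothetical helper that extracts the main verb of a sentence.

# Illustrative corpus preparation: deduplicate sentences and group them by the
# frame of their main verb.
frame_of_verb = {
    "observe": "Observation",
    "show": "Evidence",
    "replace": "Activity",
    "increase": "Change",
}

def prepare_subcorpus(sentences, main_verb):
    unique = list(dict.fromkeys(sentences))        # drop repetitive, standard sentences
    grouped = {}
    for sentence in unique:
        frame = frame_of_verb.get(main_verb(sentence))
        if frame is not None:                      # verbs outside the four frames are ignored
            grouped.setdefault(frame, []).append(sentence)
    return unique, grouped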
4.5.1 Learning Performance on the Benchmark Datasets
With the aim of exploring the corpus to identify roles for the frames, and using our learning framework, we annotated two different subcorpora and then verified them manually to create benchmark datasets for evaluation. Some statistics for the manually annotated subcorpora are summarized in Table 4.4. Then, to evaluate the efficiency of the classification, we performed 10-fold cross-validations on each set, obtaining the results shown in Table 4.5, where recall, precision, and the F_{β=1} measure are the standard metrics of information retrieval.
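As an illustration of the evaluation protocol only, the following sketch runs a 10-fold cross-validation with scikit-learn on synthetic placeholder data, using logistic regression as a stand-in for a MaxEnt classifier; the actual experiments were run with the Mallet, SNoW, and TiMBL classifiers on the annotated subcorpora of Table 4.4.

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression          # stand-in for a MaxEnt classifier
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import precision_recall_fscore_support

# Placeholder data; the real evaluation uses the manually verified role-labeled instances.
X, y = make_classification(n_samples=500, n_features=20, n_informative=10,
                           n_classes=4, random_state=0)

# 10-fold cross-validation, scored with recall, precision, and F (with beta = 1).
predictions = cross_val_predict(LogisticRegression(max_iter=1000), X, y, cv=10)
precision, recall, f1, _ = precision_recall_fscore_support(y, predictions, average="micro")
print(f"P = {precision:.3f}, R = {recall:.3f}, F_beta=1 = {f1:.3f}")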
We analyzed some of the classification errors and found that they were due to parsing anomalies, which had on several occasions forced us to split a role across several constituents.