3.4 Experimental Evaluation
The two relation kernels described above are evaluated on the task of extracting
relations from two corpora with different types of narrative, which are described in
more detail in the following sections. In both cases, we assume that the entities and
their labels are known. All preprocessing steps - sentence segmentation, tokeniza-
tion, POS tagging, and chunking - were performed using the OpenNLP 1 package.
If a sentence contains n entities (n ≥ 2), it is replicated into C(n, 2) = n(n-1)/2 sentences, each containing only two entities. If the two entities are known to be in a relationship, then the replicated sentence is added to the set of corresponding positive sentences; otherwise it is added to the set of negative sentences. During testing, a sentence having n entities (n ≥ 2) is again replicated into C(n, 2) sentences in a similar way.
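The replication step can be sketched by enumerating all unordered entity pairs in a sentence; the function and variable names below are illustrative, not taken from the paper.

```python
from itertools import combinations

def make_instances(entity_spans, gold_pairs):
    """Replicate a sentence with n tagged entities into C(n, 2)
    two-entity instances, labeled positive (+1) when the pair is
    annotated as being in a relationship and negative (-1) otherwise.

    entity_spans: list of (start, end) token offsets, one per entity.
    gold_pairs:   set of frozensets of entity indices known to interact.
    """
    instances = []
    for i, j in combinations(range(len(entity_spans)), 2):
        label = 1 if frozenset((i, j)) in gold_pairs else -1
        instances.append(((i, j), label))
    return instances

# A sentence with 3 entities yields C(3, 2) = 3 instances.
instances = make_instances(
    entity_spans=[(0, 1), (2, 3), (4, 5)],
    gold_pairs={frozenset((0, 1))},
)
# → [((0, 1), 1), ((0, 2), -1), ((1, 2), -1)]
```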
The dependency graph that is input to the shortest path dependency kernel is obtained from two different parsers:
- The CCG parser introduced in [14] 2 outputs a list of functor-argument dependencies, from which head-modifier dependencies are obtained using a straightforward procedure (for more details, see [15]).
- Head-modifier dependencies can be easily extracted from the full parse output of Collins' CFG parser [16], in which every non-terminal node is annotated with head information.
The relation kernels are used in conjunction with SVM learning in order to find a decision hyperplane that best separates the positive examples from the negative examples. We modified the LibSVM 3 package by plugging in the kernels described above. The factor λ in the subsequence kernel is set to 0.75. The performance is measured using precision (the percentage of correctly extracted relations out of the total number of relations extracted), recall (the percentage of correctly extracted relations out of the total number of relations annotated in the corpus), and F-measure (the harmonic mean of precision and recall).
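To illustrate how the decay factor λ enters a subsequence kernel, the following is a direct dynamic-programming sketch of the classic gap-weighted string subsequence kernel of Lodhi et al. (2002), not the exact relation kernel used in these experiments, which operates on word sequences rather than characters.

```python
from functools import lru_cache

def subsequence_kernel(s, t, n, lam=0.75):
    """Gap-weighted subsequence kernel of length n (Lodhi et al. 2002).

    Each common subsequence of length n is weighted by lam raised to the
    total span it covers in s and in t, so gappier matches count less.
    """
    @lru_cache(maxsize=None)
    def k_prime(i, p, q):
        # K'_i over prefixes s[:p], t[:q]; weights partial matches by
        # their distance to the ends of the prefixes.
        if i == 0:
            return 1.0
        if min(p, q) < i:
            return 0.0
        total = lam * k_prime(i, p - 1, q)
        for j in range(1, q + 1):
            if t[j - 1] == s[p - 1]:
                total += k_prime(i - 1, p - 1, j - 1) * lam ** (q - j + 2)
        return total

    total = 0.0
    for p in range(1, len(s) + 1):
        for j in range(1, len(t) + 1):
            if t[j - 1] == s[p - 1]:
                total += k_prime(n - 1, p - 1, j - 1) * lam ** 2
    return total

# "ab" matches itself with no gaps, so each side contributes lam**2.
k = subsequence_kernel("ab", "ab", 2, lam=0.75)  # = 0.75**4
```

With λ < 1, a match spread over a longer span (e.g. "ab" inside "axb") is penalized by an extra factor of λ per skipped position.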
3.4.1 Interaction Extraction from AIMed
We did comparative experiments on the AIMed corpus, which has been previously
used for training the protein interaction extraction systems in [9]. It consists of 225
Medline abstracts, of which 200 are known to describe interactions between human
proteins, while the other 25 do not refer to any interaction. There are 4084 protein
references and around 1000 tagged interactions in this dataset.
The following systems are evaluated on the task of retrieving protein interactions
from AIMed (assuming gold standard proteins):
[Manual]: We report the performance of the rule-based system of [7, 8].
[ELCS]: We report the 10-fold cross-validated results from [9] as a Precision-Recall (PR) graph.
[SSK]: The subsequence kernel is trained and tested on the same splits as ELCS. In order to have a fair comparison with the other two systems, which use only lexical information, we do not use any word classes here.
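The precision-recall points reported for these systems follow the metric definitions given above; a minimal sketch of the computation, with illustrative counts:

```python
def precision_recall_f(extracted_correct, extracted_total, annotated_total):
    """Precision, recall, and their harmonic mean (F-measure)."""
    precision = extracted_correct / extracted_total
    recall = extracted_correct / annotated_total
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

# Illustrative counts: 60 correct out of 80 extracted, 100 annotated.
p, r, f = precision_recall_f(60, 80, 100)  # p = 0.75, r = 0.60, f ≈ 0.667
```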
1 URL: http://opennlp.sourceforge.net
2 URL: http://www.ircs.upenn.edu/~juliahr/Parser/
3 URL: http://www.csie.ntu.edu.tw/~cjlin/libsvm/