A recent approach to extracting relations is described in [17]. The authors use
a generalized version of the tree kernel from [18] to compute a kernel over rela-
tion examples, where a relation example consists of the smallest dependency tree
containing the two entities of the relation. Precision and recall values are reported
for the task of extracting the five top-level relations in the ACE corpus under two
different scenarios:
- [S1] This is the classic setting: one multi-class SVM is learned to discriminate
among the five top-level classes, plus one more class for the no-relation cases.
- [S2] One binary SVM is trained for relation detection, meaning that all positive relation instances are combined into one class. The thresholded output of this binary classifier is used as training data for a second multi-class SVM, trained for relation classification.
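The two scenarios can be sketched as follows. This is a minimal, hedged illustration of the inference-time control flow only; the classifiers are stand-ins passed as plain functions, whereas the actual systems use SVMs trained over kernel features.

```python
# Sketch of the two evaluation scenarios; classifier internals are
# assumptions (real systems use kernel SVMs).

def scenario_one(examples, multiclass_clf):
    # [S1] One multi-class classifier over the five top-level relation
    # types plus a dedicated "no-relation" class.
    return [multiclass_clf(x) for x in examples]

def scenario_two(examples, detector, classifier, threshold=0.0):
    # [S2] Stage 1: a binary detector scores each candidate entity pair,
    # and the score is thresholded. Stage 2: a multi-class classifier
    # assigns a relation type to the pairs that pass detection.
    labels = []
    for x in examples:
        if detector(x) > threshold:
            labels.append(classifier(x))
        else:
            labels.append("no-relation")
    return labels
```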
The subsequence kernel (SSK) is trained under the first scenario, to recognize the same five top-level relation types. While for protein interaction extraction only the lexicalized version of the kernel was used, here we utilize more features, corresponding to the following feature spaces: Σ1 is the word vocabulary, Σ2 is the set of POS tags, Σ3 is the set of generic POS tags, and Σ4 contains the five entity types.
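The four feature spaces amount to generalizing each token to several representations at once. The sketch below illustrates this; the token fields and the choice of the coarse tag prefix as the "generic" POS tag are assumptions for illustration.

```python
# Sketch of the four feature spaces: each token can match a subsequence
# position through its word (Sigma1), POS tag (Sigma2), generic POS tag
# (Sigma3), or entity type (Sigma4). Deriving the generic POS tag as the
# first character of the Penn tag (e.g. NNS -> N) is an assumption.

def token_features(word, pos, entity_type=None):
    feats = {"Sigma1": word, "Sigma2": pos, "Sigma3": pos[0]}
    if entity_type is not None:
        feats["Sigma4"] = entity_type  # one of the five ACE entity types
    return feats
```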
Chunking information is used as follows: all (sparse) subsequences are created exclusively from the chunk heads, where a head is defined as the last word in a chunk. The same criterion is used for computing the length of a subsequence: all words other than head words are ignored. This is based on the observation that, in general, words other than the chunk head do not contribute to establishing a relationship between two entities outside of that chunk. One exception is when both entities in the example sentence are contained in the same chunk, which happens very often due to noun-noun ('U.S. troops') or adjective-noun ('Serbian general') compounds. In these cases, the chunk is allowed to contribute both entity heads.
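The chunk-head convention can be sketched as below. The chunk representation (a list of word lists) and the way entities are identified (by word) are assumptions made for illustration.

```python
# Sketch of the chunk-head convention: each chunk contributes only its
# head (its last word), except when both entities fall inside the same
# chunk (e.g. the compound 'U.S. troops'), in which case both entity
# heads are kept.

def chunk_heads(chunks, entity_words=()):
    heads = []
    for chunk in chunks:
        entities_here = [w for w in chunk if w in entity_words]
        if len(entities_here) >= 2:
            # Both entities share this chunk: contribute both heads.
            heads.extend(entities_here)
        else:
            heads.append(chunk[-1])  # the head is the last word
    return heads
```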
The shortest-path dependency kernel (SPK) is trained under both scenarios. The
dependencies are extracted using either Hockenmaier's CCG parser (SPK-CCG) [14],
or Collins' CFG parser (SPK-CFG) [16].
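At the core of the shortest-path dependency kernel is the path between the two entities in the sentence's dependency graph. A minimal sketch, treating the parser output as a list of (head, dependent) edges and searching the graph undirected with BFS (the edge representation is an assumption):

```python
from collections import deque

def shortest_dep_path(edges, source, target):
    # Build an undirected adjacency map from (head, dependent) pairs,
    # as produced by a dependency-converted CCG or CFG parse.
    adj = {}
    for head, dep in edges:
        adj.setdefault(head, []).append(dep)
        adj.setdefault(dep, []).append(head)
    # Breadth-first search from one entity word to the other; BFS
    # guarantees the first path found is a shortest one.
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in adj.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # the two entities are not connected
```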
Table 3.2 summarizes the performance of the two relation kernels on the ACE
corpus. For comparison, we also show the results presented in [17] for their best
performing kernel K4 (a sum between a bag-of-words kernel and a tree dependency
kernel) under both scenarios.
Table 3.2. Extraction performance on ACE.

(Scenario) Method    Precision   Recall   F-measure
(S1) K4                 70.3      26.3      38.0
(S1) SSK                73.9      35.2      47.7
(S1) SPK-CCG            67.5      37.2      48.0
(S1) SPK-CFG            71.1      39.2      50.5
(S2) K4                 67.1      35.0      45.8
(S2) SPK-CCG            63.7      41.4      50.2
(S2) SPK-CFG            65.5      43.8      52.5