Information Technology Reference
In-Depth Information
An equivalent formulafor K n ( s, t, λ ) isobtainedbychanging theexponent of λ from
Equation3.2 to
j 1 + 2.
Based onall definitions above, K n iscomputedin O ( kn|s||t| ) time, by modi-
fying the recursivecomputationfrom [2]withthe newfactor c ( x, y ), asshown in
Figure 3.1. As in [2], thecomplexity of computing K i ( s, t ) is reducedto O ( |s||t| ) by
first evaluating another auxiliary factor K i ( s, t ). In Figure 3.1, thesequence sx is
the result of appending x to s (with ty definedinasimilar way). To avoid clutter, the
parameter λ is not shown in the argument list of K and K , unless it is instantiated
to a specific constant.
|
s
|
+
|
t
|−
i 1
K
0
( s, t )=1 ,foralls,t
K i ( s, t )=0 ,if min( |s|, |t| ) <i
K i ( s, ∅ )=0 ,foralli,s
K i ( sx, ty )= λK i ( sx, t ) + λ 2 K i− 1 ( s, t ) · c ( x, y )
K i ( sx, t )= λK i ( s, t ) + K i ( sx, t )
K n ( s, t )=0 ,if min( |s|, |t| ) <n
K n ( sx, t )= K n ( s, t ) +
j
λ 2 K n− 1 ( s, t [1 : j − 1]) · c ( x, t [ j ])
Fig. 3.1. Computation of subsequence kernel.
3.2.3 Computing the Relation Kernel
As describedatthe beginningofSection3.2, the input consistsofaset of sentences,
whereeach sentence containsexactlytwoentities (protein names in thecase of
interaction extraction). In Figure 3.2 weshowthesegments that will be usedfor
computing the relationkernel betweentwoexample sentences s and t .In sentence
s , forinstance, x 1 and x 2 are the twoentities, s f is thesentence segment before
x 1 , s b is thesegment between x 1 and x 2 , and s a is thesentence segment after x 2 .
For convenience, we also include the auxiliary segment s b = x 1 s b x 2 , whose span is
computedas l ( s b )= l ( s b ) + 2(in all length computations, weconsider x 1 and x 2 as
contributingone unit only).
The relationkernel computes the number of common patterns betweentwosen-
tences s and t , where theset of patterns is restrictedto the four types introduced
in Section3.2.1. Therefore, the kernel rK ( s, t ) isexpressedas thesum offour sub-
kernels: fbK ( s, t )counting the number of commonfore-between patterns, bK ( s, t )
forbetween patterns, baK ( s, t ) forbetween-after patterns, and mK ( s, t ) formod-
ifier patterns, as in Figure 3.3. Thesymbol 1 is usedthere as a shorthand forthe
indicator function, whichis 1ifthe argument is true, and 0otherwise.
Thefirstthree sub-kernels include in their computationthecountingof common
subsequences between s b and t b .In ordertospeedupthecomputation, all these
Search WWH ::




Custom Search