Databases Reference
In-Depth Information
term in T 1 is also matched by a term in T 2 , and at least one literal is matched
by a term in T 2 and not by a term in T 1 .
We can trivially extend these “more specific than” relations to define “less
specific than” and “equally specific as” relations:
t 1 > s t 2
⇐⇒
t 2 < s t 1
t 1 s t 2
⇐⇒
t 2 s t 1
t 1 = s t 2
⇐⇒
t 1 s t 2 and t 1 s t 2
In this last case, t 1 = s t 2
⇐⇒
µ ( t 1 )= µ ( t 2 ). Note that this is a narrower
definition than requiring that
|
µ ( t 1 )
|
=
|
µ ( t 2 )
|
.
3.2 Ordering of Templates
Above, we have discussed ordering of terms and of template elements. Here we
generalise this to discuss entire templates. We want to be able to compare two
templates, τ 1 and τ 2 , and say which is more specific, i.e. which one matches
fewer fragments. If τ 1 and τ 2 are related through superset generalisation, then
for a given set of documents D , this is testing whether µ ( τ 1 ,D )
µ ( τ 2 ,D )
and therefore whether
|
µ ( τ 1 ,D )
|
>
|
µ ( τ 2 ,D )
|
, or vice versa, or they are equal.
E.g. τ 1 > s τ 2
µ ( τ 2 ,D ). This depends on the corpus D
and is potentially slow to evaluate, especially if D contains a large number of
documents. We would rather have an estimate of the relative specificity which
is independent of D , which will be useful when developing search heuristics.
Suppose τ 1 and τ 2 are almost identical, and differ only in one template
element:
⇐⇒
µ ( τ 1 ,D )
τ 1 = <T 1 ,T 2 ,...,T i ,...,T n >
τ 2 = <T 1 ,T 2 ,...,T i ,...,T n >
T i > s T i .
More generally, if τ 1 and τ 2 contain the same number of sets of terms, then
τ 1 s τ 2
= T i . Then we can say that τ 1 > s τ 2
where T i
⇐⇒
⇐⇒
T 1 ,i s T 2 ,i i =1 ...
|
τ 1 |
,and
and T 1 ,j > s T 2 ,j for some j.
In a slight extension to previous notation, we use T n,i to refer to the i th
element of template τ n .
Also, if two templates are identical except that one is “missing” the first
or last template element of the other, then the shorter of the two is less
specific. I.e. if τ 1 = <T 1 ,T 2 ,...,T n− 1 ,T n > , τ 2 = <T 1 ,T 2 ,...,T n− 1 > and
τ 3 = <T 2 ,...,T n− 1 ,T n > then τ 1 s τ 2and τ 1 s τ 3.
Although these relationships do not provide a complete ordering over all
templates, they do allow us to compare similar templates, and this is su cient
to allow us to create and modify templates, and to develop useful search
heuristics. We use this ordering to develop functions that create and modify
templates in Sect. 4, and to develop methods to search e ciently for good
templates in Sect. 6.
τ 1 > s τ 2
⇐⇒
T 1 ,i s T 2 ,i i =1 ...
|
τ 1 |
Search WWH ::




Custom Search