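[Figure 4: search graph of templates, shown alongside the document sets D+ and DN, which together contain the fragments {the cat sat}, {the mouse ran}, {the lion roared}, {a cow jumped}, {a tiger slept} and {the cat plays}.]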
Fig. 4. Part of a search graph for a three-term template. Each rectangle shows a template, starting with three sets of literals at the top initialised from the fragment “the cat sat”. Various generalisation functions are then applied to it. For example, gen(Γ, 2) means apply the generalise(τ, κ_Γ, 2) function to the upper template, τ, to produce the lower template(s). The graph includes part-of-speech labels “VBD” meaning past-tense verb and “DT” meaning determiner. The numbers in each box below the template represent the estimated numbers of true-positive and false-positive matches for each template, with respect to the positive and neutral document sets D+ and DN shown. Note that not every node or edge is shown.
As a more concrete example, Fig. 4 shows part of a search graph containing
various templates created from the seed fragment “the cat sat”, and evaluated
with respect to the two small corpora shown.
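To make the evaluation step concrete, the following is a minimal sketch of how a wildcard-generalised template might be counted against the two corpora. It is an illustration under stated assumptions, not the method behind Fig. 4: the split of the six fragments between D+ and DN is assumed, the helper names (seed_template, generalise_wildcard, matches, count_matches) are hypothetical, generalise_wildcard is only a stand-in for generalise(τ, κ_Ω, x), and matches are counted exactly on whole fragments rather than estimated.

```python
WILDCARD = "*"

# ASSUMPTION: the figure does not survive well enough to show which fragments
# belong to which set, so this split is chosen purely for illustration.
D_POS = ["the cat sat", "the mouse ran", "the lion roared"]
D_NEU = ["a cow jumped", "a tiger slept", "the cat plays"]


def seed_template(fragment):
    """Build a template with one set of literals per word of the seed fragment."""
    return tuple(frozenset([word]) for word in fragment.split())


def generalise_wildcard(template, x):
    """Hypothetical stand-in for generalise(tau, kappa_Omega, x):
    replace slot x (1-indexed) with a wildcard."""
    slots = list(template)
    slots[x - 1] = WILDCARD
    return tuple(slots)


def matches(template, fragment):
    """True if every word is in the literal set of its slot, or the slot is a wildcard."""
    words = fragment.split()
    if len(words) != len(template):
        return False
    return all(slot == WILDCARD or word in slot
               for slot, word in zip(template, words))


def count_matches(template, positives, neutrals):
    """Return (true positives, false positives) for the template."""
    tp = sum(matches(template, d) for d in positives)
    fp = sum(matches(template, d) for d in neutrals)
    return tp, fp


seed = seed_template("the cat sat")
the_any_sat = generalise_wildcard(seed, 2)  # <the, *, sat>
the_cat_any = generalise_wildcard(seed, 3)  # <the, cat, *>

print(count_matches(the_any_sat, D_POS, D_NEU))  # (1, 0): one true positive, no false positives
print(count_matches(the_cat_any, D_POS, D_NEU))  # result depends on the assumed split
```

Representing each slot as a set of literals follows the caption's description of a template as "three sets of literals"; the wildcard is kept as a separate marker so that it matches any word in that position.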
Consider the right-hand portion of the graph. Near the top are two templates: <the, ∗, sat> and <the, cat, ∗>, both created using the generalise(τ, κ_Ω, x) function. The first matches one true positive and no false positives (shown as 1+ 0N in the figure). The second matches one true