FIGURE 10.19: Log likelihood of validation data against the Lidstone smoothing parameter. [Plot omitted; horizontal axis: Lidstone parameter, 1.E-16 to 1.E-01; vertical axis: log likelihood, -4500 to -2000.]
FIGURE 10.20: Pre-generalization and post-filtering. [Diagram omitted; it shows the taxonomy path entity, living thing, causal agent, person, scientist, with the labels g and a marked on it, above the snippet "…dolphins and whales were studied by Cousteau…".]
question “Which scientist studied dolphins,” we frame a query in our system of the form type=scientist#n#1 NEAR studied dolphins. Here is how we would execute the query.
1. Find the best (defined later) registered generalization g in the taxonomy. In the running example, we may prefer g = living_thing#n#1 over g = entity#n#1 because the former, being less general, is presumably rarer in the corpus (but also see comments below).
2. Perform a proximity search using g and the selectors in the query, which ensures recall but generally lowers precision. Therefore, we must inflate k in the top-k search to some k' > k (more about this later).
3. Use a forward index, described in Section 10.4.2.1, to get the actual instance token i of g in each high-scoring response. In our running example, the qualifying snippet may bring forth two candidate tokens, Cousteau and whales, because both are instances of living_thing#n#1.
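The steps above can be sketched on toy data as follows. This is a minimal illustration, not the system's implementation. Every name and number in it (PARENT, REGISTERED, CORPUS_COUNT, SNIPPETS, the inflate factor, the selector-counting score) is hypothetical. The rule "pick the rarest registered ancestor" is only one reading of "best", which the text defers to later. The final filter, which keeps tokens that are instances of the requested type, is assumed from the caption of Figure 10.20.

```python
# Toy taxonomy, child -> parent. Hypothetical and heavily simplified.
PARENT = {
    "scientist#n#1": "person#n#1",
    "person#n#1": "causal_agent#n#1",
    "causal_agent#n#1": "living_thing#n#1",
    "whale#n#1": "living_thing#n#1",
    "living_thing#n#1": "entity#n#1",
}
# Types assumed to be registered (indexed), with made-up corpus counts.
REGISTERED = {"living_thing#n#1", "entity#n#1"}
CORPUS_COUNT = {"living_thing#n#1": 1000, "entity#n#1": 5000}
# Toy corpus: (text, {token: type}). The list position is the snippet id.
SNIPPETS = [
    ("dolphins and whales were studied by Cousteau",
     {"Cousteau": "scientist#n#1", "whales": "whale#n#1"}),
    ("the trainer fed the dolphins", {"trainer": "person#n#1"}),
]


def ancestors(t):
    """Return t followed by all of its ancestors, most specific first."""
    chain = [t]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def is_a(t, ancestor):
    return ancestor in ancestors(t)


def best_generalization(atype):
    """Step 1 (assumed criterion): the rarest registered ancestor."""
    candidates = [t for t in ancestors(atype) if t in REGISTERED]
    return min(candidates, key=lambda t: CORPUS_COUNT[t])


def proximity_search(g, selectors, k_prime):
    """Step 2 stand-in: rank snippets that contain an instance of g by
    the number of selector words they contain, and keep the top k'."""
    scored = []
    for sid, (text, tokens) in enumerate(SNIPPETS):
        if not any(is_a(typ, g) for typ in tokens.values()):
            continue
        words = text.split()
        score = sum(w in words for w in selectors)
        if score > 0:
            scored.append((score, sid))
    scored.sort(key=lambda p: (-p[0], p[1]))
    return [sid for _, sid in scored[:k_prime]]


def forward_lookup(sid, g):
    """Step 3: recover the actual tokens that are instances of g."""
    _, tokens = SNIPPETS[sid]
    return sorted(tok for tok, typ in tokens.items() if is_a(typ, g))


def answer(atype, selectors, k, inflate=3):
    g = best_generalization(atype)
    hits = proximity_search(g, selectors, k_prime=inflate * k)  # k' > k
    candidates = [(sid, tok) for sid in hits for tok in forward_lookup(sid, g)]
    # Assumed post-filter: keep only tokens whose type is under atype.
    kept = [(sid, tok) for sid, tok in candidates
            if is_a(SNIPPETS[sid][1][tok], atype)]
    return g, candidates, kept[:k]
```

Calling answer("scientist#n#1", ["studied", "dolphins"], k=1) on this toy data selects living_thing#n#1 as g. The forward lookup on the first snippet then yields both Cousteau and whales as candidates, and only Cousteau survives the assumed filter on scientist#n#1.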