categories. In an implementation we may decide to use the set {FELINE, CANINE},
which is a more cumbersome, but more specific, set of terms than {ANIMAL},
depending on our requirements. The decision made regarding
which set to use in a template would depend on the application and on the
details of the heuristic searches, as we discuss in Sect. 6.
Note also that generalise will return an empty set if no generalisation
exists that matches all the required literals. Thus, if the input set contains a
literal that is not contained in any member of κ_x, then generalise(T, κ_x) = {}.
Example: generalise({the}, κ_Γ) = {}, because the literal “the” is not in
any gazetteer.
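The empty-set behaviour can be illustrated with a minimal Python sketch. It rests on assumptions that are not part of the text: a gazetteer category is modelled as a dictionary from gazetteer name to its set of literals, the gazetteer contents are invented for illustration, and only single-gazetteer generalisations are produced (a set such as {FELINE, CANINE} is not generated). The name generalise_element is hypothetical.

```python
from typing import Dict, FrozenSet, Set

# Hypothetical gazetteer category: names and entries are illustrative only.
GAZETTEERS: Dict[str, FrozenSet[str]] = {
    "FELINE": frozenset({"cat", "lion"}),
    "CANINE": frozenset({"dog", "wolf"}),
    "ANIMAL": frozenset({"cat", "lion", "dog", "wolf"}),
}


def generalise_element(
    element: FrozenSet[str], gazetteers: Dict[str, FrozenSet[str]]
) -> Set[FrozenSet[str]]:
    """Simplified element generalisation (an assumption, not the full definition).

    Returns one single-term element per gazetteer that contains every
    literal of `element`. If some literal appears in no gazetteer, no
    gazetteer can contain the whole element, so the result is empty.
    """
    return {
        frozenset({name})
        for name, entries in gazetteers.items()
        if element <= entries  # subset test: all literals must be covered
    }


# "the" is in no gazetteer, so there is nothing to generalise to.
no_match = generalise_element(frozenset({"the"}), GAZETTEERS)

# "cat" is in two gazetteers, so two generalisations are produced.
two_matches = generalise_element(frozenset({"cat"}), GAZETTEERS)
```

Under these assumptions, no_match is the empty set, while two_matches contains the elements {FELINE} and {ANIMAL}.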
4.2 Creating and Modifying Entire Templates
In the previous section, we defined the creation and generalisation of template
elements. Templates are ordered lists of template elements (Definition 13), and
we now apply the above concepts to create and modify templates. Given a
seed phrase, in the form of a tuple of literals (i.e. a fragment), we can easily
define a very specialised template that matches only that fragment. We can
then modify this to increase its generality.
Definition 18. We extend the initialise function to create a specialised
template from a fragment.
initialise(<λ_1, λ_2, ..., λ_n>) = <{λ_1}, {λ_2}, ..., {λ_n}>.
This can also be written as: initialise(f) = f, where f is a text fragment.
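As a minimal sketch of Definition 18, assuming a fragment is represented as a sequence of strings and a template as a tuple of frozensets (both representation choices are assumptions made here, not taken from the text):

```python
from typing import FrozenSet, Sequence, Tuple

# Assumed representation: a template is an ordered tuple of elements,
# and each element is a set of terms.
Template = Tuple[FrozenSet[str], ...]


def initialise(fragment: Sequence[str]) -> Template:
    """Wrap each literal of the fragment in its own singleton element."""
    return tuple(frozenset({literal}) for literal in fragment)


# The resulting template matches only the fragment it was built from.
seed_template = initialise(("the", "cat", "sat"))
```

Tuples and frozensets are used so that templates are immutable and hashable, which lets them be collected into sets in later steps.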
We define a new function that generalises any given template to create
a new set of templates by modifying a single element of the template using
the element generalisation function defined in Sect. 4.1. One template will be
created for each possible generalisation of the specified template element.
Definition 19. Given the template τ = <T_1, T_2, ..., T_i, ..., T_n>, then
generalise(τ, κ, i) = { τ' | T_i' ∈ generalise(T_i, κ) and
τ' = <T_1, T_2, ..., T_i', ..., T_n> }
I.e. we replace the i-th template element with the result of its own
generalisation.
Example: Let τ_1 = <the, cat, sat>. Then generalise(τ_1, κ_Γ, 2) =
{<the, FELINE, sat>, <the, ANIMAL, sat>}. In this case, generalise(τ_1, κ_Γ, 2)
returns two templates because the second literal “cat” belongs to two gazetteers.
In contrast, generalise(τ_1, κ_Γ, 1) = {}, because the first literal “the” does not
belong to any gazetteer in the category κ_Γ.
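The example can be reproduced with a short Python sketch of Definition 19. Several details are assumptions rather than part of the text: templates are tuples of frozensets, the category κ_Γ is a dictionary holding two invented gazetteers, and the element-level helper is a simplified stand-in for the element generalisation of Sect. 4.1 that only proposes single gazetteers. The names generalise_element and generalise_template are hypothetical.

```python
from typing import Dict, FrozenSet, Set, Tuple

Element = FrozenSet[str]
Template = Tuple[Element, ...]

# Hypothetical stand-in for the category κ_Γ; entries are illustrative only.
KAPPA_GAMMA: Dict[str, FrozenSet[str]] = {
    "FELINE": frozenset({"cat", "lion"}),
    "ANIMAL": frozenset({"cat", "lion", "dog"}),
}


def generalise_element(
    element: Element, gazetteers: Dict[str, FrozenSet[str]]
) -> Set[Element]:
    """Simplified element generalisation: one single-term element per
    gazetteer that contains every literal of `element`."""
    return {
        frozenset({name})
        for name, entries in gazetteers.items()
        if element <= entries
    }


def generalise_template(
    template: Template, gazetteers: Dict[str, FrozenSet[str]], i: int
) -> Set[Template]:
    """Return one new template per generalisation of the i-th element.

    `i` is 1-based to match the notation of Definition 19. All other
    elements are copied unchanged.
    """
    if not 1 <= i <= len(template):
        raise IndexError("element index out of range")
    return {
        template[: i - 1] + (new_element,) + template[i:]
        for new_element in generalise_element(template[i - 1], gazetteers)
    }


# τ_1 = <the, cat, sat>, with each literal as a singleton element.
tau_1: Template = tuple(frozenset({w}) for w in ("the", "cat", "sat"))

second = generalise_template(tau_1, KAPPA_GAMMA, 2)  # generalise "cat"
first = generalise_template(tau_1, KAPPA_GAMMA, 1)   # generalise "the"
```

With these illustrative gazetteers, second holds the two templates <the, FELINE, sat> and <the, ANIMAL, sat>, and first is empty because “the” appears in neither gazetteer.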