Table 1. Categorizing stems into projection and/or selection oriented sets

(1) root(ROOT, are), (2) nsubj(are, capital), (3) prep_of(capital, state), (4) nsubj(border, state), (5) rcmod(state, border), (6) advmod(populat, most), (7) amod(state, populat), (8) dobj(border, state)

$\Pi = \{capital, state\}$, $\Sigma = \{are\} \Rightarrow \Sigma = \emptyset$
$\Pi' = \{state, border\}$, $\Sigma' = \{border, state\}$
$\Pi'' = \{most, populat, state\}$, $\Sigma'' = \emptyset$

$\Sigma$ categories according to the following rules. For each grammatical relation rel(gov, dep) in $SDC^{opt}_q$:
1. If it is ROOT, dep is the key to populate so add it to $\Sigma$ and remove the relation from $SDC^{opt}_q$. This stem can be an auxiliary verb, e.g., is, are, has, have and so on. It is of no use for building the arguments of the queries, but it can be used transitively to add other stems².
2. If it starts with nsubj, check whether $gov \in \Sigma$. If not (because there is no ROOT relation), add gov to $\Sigma$. Then add dep to $\Pi$ and remove rel from $SDC^{opt}_q$; otherwise keep it, since it could be a subject related to a subordinate clause (we will need it in the recursive steps).
3. If it starts with prep or it ends with obj, we use it to create conditions (possibly involving nesting):
- Check whether $gov \in \Pi$. If not (because no ROOT or nsubj relations were found so far), add gov to $\Pi$.
- Then add dep to $\Sigma$ if there is no table.column like³ gov.dep. Otherwise, also add dep to $\Pi$ and remove rel from $SDC^{opt}_q$.
4. If it ends with mod, it implies that dep is a modifier of gov, so they should be paired together: if $gov \in \Pi$, add dep to $\Pi$ and remove rel from $SDC^{opt}_q$. This should be done only if dep is not a superlative (i.e., it does not end with -st). The non-removed relations will be taken into account in the recursive step, adding both dep and gov to $\Pi$.
5. If none of the above rules can be applied, iterate the algorithm recursively, building $\Pi'$ and $\Sigma'$, $\Pi''$ and $\Sigma''$ and so on, adding dep to $\Sigma$ if $gov \in \Sigma$, until $SDC^{opt}_q$ is empty.
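
As an illustration only, the first four rules can be sketched as a single non-recursive pass in Python. This is a reading of the rules above under stated assumptions, not the authors' implementation: the names `categorize_pass`, `has_match` and `drop_short_stems` are hypothetical, `has_match(gov, dep)` stands in for the metadata lookup of footnote 3, a relation handled by rule 3 is assumed to be removed in both branches, and in rule 2 gov is added to the selection set only when the list contains no ROOT relation at all.

```python
from typing import Callable, List, Set, Tuple

Relation = Tuple[str, str, str]  # (name, gov, dep), e.g. ("nsubj", "are", "capital")


def categorize_pass(
    relations: List[Relation],
    has_match: Callable[[str, str], bool],
) -> Tuple[Set[str], Set[str], List[Relation]]:
    """One pass over the relations applying rules 1-4 (no recursion).

    Returns (projection set, selection set, relations kept for later passes).
    """
    pi: Set[str] = set()      # projection-oriented stems
    sigma: Set[str] = set()   # selection-oriented stems
    kept: List[Relation] = []
    has_root = any(name.lower() == "root" for name, _, _ in relations)

    for name, gov, dep in relations:
        if name.lower() == "root":                      # rule 1
            sigma.add(dep)
        elif name.startswith("nsubj"):                  # rule 2
            if gov not in sigma and not has_root:       # assumption: only without ROOT
                sigma.add(gov)
            if gov in sigma:
                pi.add(dep)
            else:
                kept.append((name, gov, dep))           # subject of a subordinate
        elif name.startswith("prep") or name.endswith("obj"):  # rule 3
            pi.add(gov)                                 # no effect if already present
            if has_match(gov, dep):
                pi.add(dep)
            else:
                sigma.add(dep)
        elif name.endswith("mod"):                      # rule 4
            if gov in pi and not dep.endswith("st"):    # skip superlatives
                pi.add(dep)
            else:
                kept.append((name, gov, dep))
        else:
            kept.append((name, gov, dep))               # left for the recursive step
    return pi, sigma, kept


def drop_short_stems(stems: Set[str]) -> Set[str]:
    """Step 6 from footnote 2: discard stems of 3 or fewer characters."""
    return {s for s in stems if len(s) > 3}


# Relations (1)-(4) of Table 1; the lambda pretends the metadata contains
# a match for the pair (capital, state).
relations = [
    ("root", "ROOT", "are"),
    ("nsubj", "are", "capital"),
    ("prep_of", "capital", "state"),
    ("nsubj", "border", "state"),
]
pi, sigma, kept = categorize_pass(
    relations, lambda gov, dep: (gov, dep) == ("capital", "state")
)
print(sorted(pi))               # ['capital', 'state']
print(sorted(sigma))            # ['are']
print(kept)                     # [('nsubj', 'border', 'state')]
print(drop_short_stems(sigma))  # set()
```

On these four relations the sketch yields the first pair of sets shown in Table 1, including the emptying of the selection set once the short stem "are" is dropped. It makes no attempt to reproduce the recursive iterations of rule 5.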
To show how these steps are used to build the projection and/or selection oriented sets from which we generate S and W, let us consider the list of optimized dependencies $SDC^{opt}_{q_1}$ in Table 1.

² Stems of 3 or fewer characters would introduce too much noise when retrieving matching strings, so they will be eliminated in an additional step 6. Useful words like in, of, not, or, and are embedded in relation abbreviations when collapsing dependencies.
³ We query the metadata looking for something similar to gov as a table and to dep as a column, i.e., we search for table names using $\pi_{table\_name}(\sigma_{table\_name = dep \,\wedge\, column\_name = gov}(IS.Columns))$. For brevity we use the symbol $s_1 = s_2$ for $s_2$ substring of $s_1$, i.e., $s_1$ LIKE "%$s_2$%".
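
The substring lookup described in footnote 3 can be tried out with a small self-contained sketch. Everything concrete here is an assumption made for illustration: SQLite has no information schema, so a hand-made table `is_columns` stands in for IS.Columns, its rows are invented, and the helper name `matching_tables` is hypothetical. Which dependency argument is passed as the table stem and which as the column stem is left to the caller.

```python
import sqlite3


def matching_tables(conn: sqlite3.Connection, table_stem: str, column_stem: str) -> list:
    """Return table names containing table_stem that own a column containing column_stem."""
    rows = conn.execute(
        "SELECT DISTINCT table_name FROM is_columns "
        "WHERE table_name LIKE '%' || ? || '%' "
        "AND column_name LIKE '%' || ? || '%' "
        "ORDER BY table_name",
        (table_stem, column_stem),
    ).fetchall()
    return [row[0] for row in rows]


# Toy catalogue standing in for IS.Columns (rows invented for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE is_columns (table_name TEXT, column_name TEXT)")
conn.executemany(
    "INSERT INTO is_columns VALUES (?, ?)",
    [
        ("state", "state_name"),
        ("state", "capital"),
        ("state", "population"),
        ("border_info", "state_name"),
        ("border_info", "border"),
        ("city", "city_name"),
    ],
)

print(matching_tables(conn, "state", "capital"))    # ['state']
print(matching_tables(conn, "border", "state"))     # ['border_info']
print(matching_tables(conn, "populat", "capital"))  # []
```

Because the comparison is a LIKE with wildcards on both sides, the stem "populat" would match a column named "population", which is why stems rather than full words are looked up. An empty result corresponds to the case in rule 3 where no table.column pair is found.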