Table 1. Categorizing stems into projection and/or selection oriented sets
(1) root(ROOT, are),
(2) nsubj(are, capital),
(3) prep_of(capital, state),
(4) nsubj(border, state),
(5) rcmod(state, border),
(6) advmod(populat, most),
(7) amod(state, populat),
(8) dobj(border, state)
Π = {capital, state}
Σ = {are} ⇒ Σ = ∅
Π′ = {state, border}
Σ′ = {border, state}
Π″ = {most, populat, state}
Σ″ = ∅
Σ categories according to the following rules. For each grammatical relation
rel(gov, dep) in SDC_q^opt:
1. If it is ROOT, dep is the key to populate W, so add it to Σ and remove the
relation from SDC_q^opt. This stem can be an auxiliary verb, e.g., is, are, has,
have and so on. It is useless for building the arguments of the queries, but it
could be used transitively to add other stems².
2. If it starts with nsubj, check if gov ∈ Σ. If not (because there isn't any ROOT
relation), add gov to Σ. Then add dep to Π and remove rel from SDC_q^opt;
otherwise keep it, since it could be a subject related to a subordinate (we
will need it in the recursive steps).
3. If it starts with prep or ends with obj, we use it to create conditions
(possibly involving nesting):
- Check if gov ∈ Π. If not (because no ROOT or nsubj relations were
found so far), add gov to Π.
- Then add dep to Σ if there is not any table.column like³ gov.dep.
Otherwise, also add dep to Π and remove rel from SDC_q^opt.
4. If it ends with mod, it implies that dep is a modifier of gov, so they should
be paired together: if gov ∈ Σ add dep to Σ, and if gov ∈ Π add dep to
Π and remove rel from SDC_q^opt. This should be done only if dep is not a
superlative (i.e. doesn't end with -st). The non-removed relations will be
taken into account in the recursive step, adding both dep and gov to Π.
5. If none of the above rules can be applied, iterate the algorithm recursively,
building Π′ and Σ′, Π″ and Σ″ and so on, until SDC_q^opt is empty.
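The rules above leave several details open, so the following Python sketch is only one possible reading of them, not the authors' implementation. It assumes that relations are (rel, gov, dep) triples, that `has_column` is a hypothetical predicate standing in for the metadata lookup of rule 3, that rule 2 adds gov to Σ only while Σ is still empty, that rule 3 removes the relation in both of its cases, that a modifier relation reaching a recursive step adds both stems to Π, and that the loop stops once no remaining relation matches any rule. The names `categorize`, `has_column` and `known` are illustrative.

```python
from typing import Callable, List, Set, Tuple

Relation = Tuple[str, str, str]      # (rel, gov, dep)
Level = Tuple[Set[str], Set[str]]    # (Pi, Sigma) built at one recursion level


def categorize(relations: List[Relation],
               has_column: Callable[[str, str], bool]) -> List[Level]:
    """Return [(Pi, Sigma), (Pi', Sigma'), ...] under the stated assumptions."""
    levels: List[Level] = []
    pending = list(relations)
    while pending:
        pi: Set[str] = set()
        sigma: Set[str] = set()
        kept: List[Relation] = []
        nested = bool(levels)        # True in the recursive steps
        for rel, gov, dep in pending:
            if rel == "root":                                    # rule 1
                sigma.add(dep)
            elif rel.startswith("nsubj"):                        # rule 2
                if not sigma:        # no ROOT relation at this level
                    sigma.add(gov)
                if gov in sigma:
                    pi.add(dep)
                else:                # subject of a subordinate: defer it
                    kept.append((rel, gov, dep))
            elif rel.startswith("prep") or rel.endswith("obj"):  # rule 3
                pi.add(gov)
                (pi if has_column(gov, dep) else sigma).add(dep)
            elif rel.endswith("mod"):                            # rule 4
                if nested:           # deferred modifier: pair both stems
                    pi.update((gov, dep))
                elif dep.endswith("st") or gov not in (pi | sigma):
                    kept.append((rel, gov, dep))
                else:
                    if gov in sigma:
                        sigma.add(dep)
                    if gov in pi:
                        pi.add(dep)
            else:
                kept.append((rel, gov, dep))   # no rule applies
        if nested and len(kept) == len(pending):
            break                    # leftovers match no rule: stop
        levels.append((pi, sigma))
        pending = kept               # rule 5: recurse on what is left
    return levels


# First three dependencies of Table 1; `known` is a made-up lookup result.
known = {("capital", "state")}
relations = [("root", "ROOT", "are"),
             ("nsubj", "are", "capital"),
             ("prep_of", "capital", "state")]
levels = categorize(relations, lambda gov, dep: (gov, dep) in known)
for pi, sigma in levels:
    print(sorted(pi), sorted(sigma))   # ['capital', 'state'] ['are']
```

On these three dependencies the sketch yields Π = {capital, state} and Σ = {are}, the first pair of sets listed in Table 1. The remaining dependencies are left out on purpose, because their treatment depends on the open details listed above.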
In order to show how these steps are used to build the projection and/or
selection oriented sets from which we generate S and W, let us consider the
list of optimized dependencies SDC_q1^opt in Table 1.
² Stems of 3 or fewer characters would introduce too much noise in retrieving matching
strings, so they will be eliminated in an additional step 6. Useful words like in, of,
not, or, and are embedded in relation abbreviations when collapsing dependencies.
³ We query the metadata, seeking something similar to gov as a table and
to dep as a column, i.e. we search for table names using
π_table_name(σ_table_name=dep ∧ column_name=gov(IS.Columns)). For brevity we use the symbol
s1 = s2 for s2 substring of s1, i.e. s1 LIKE '%s2%'.
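As a rough illustration of this lookup, the sketch below runs the relational expression as it is written in the footnote (table name matched against dep, column name against gov, both as substrings) on an in-memory SQLite table. The table `is_columns` is only a stand-in for IS.Columns, and its rows, like the function name `matching_tables`, are assumptions made for the example.

```python
import sqlite3


def matching_tables(conn: sqlite3.Connection, gov: str, dep: str) -> list:
    """Tables whose name contains dep and that own a column containing gov."""
    rows = conn.execute(
        "SELECT DISTINCT table_name FROM is_columns "
        "WHERE table_name LIKE '%' || ? || '%' "
        "AND column_name LIKE '%' || ? || '%'",
        (dep, gov),
    ).fetchall()
    return [name for (name,) in rows]


# Hypothetical metadata rows standing in for IS.Columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE is_columns (table_name TEXT, column_name TEXT)")
conn.executemany(
    "INSERT INTO is_columns VALUES (?, ?)",
    [("state", "state_name"), ("state", "capital"),
     ("state", "population"), ("river", "length")],
)

print(matching_tables(conn, "capital", "state"))   # ['state']
print(matching_tables(conn, "populat", "state"))   # ['state']
print(matching_tables(conn, "border", "state"))    # []
```

The second call shows why substring matching suits stems: the stem populat still finds the column population.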
 