Automatic Generation and Reranking of SQL-Derived Answers to NL Questions - Trustworthy Eternal Systems via Evolving, Software Data and Knowledge

Information Technology Reference

In-Depth Information

For the remaining stems Σ R = Σ −Σ L we search for a match in the database:

∀col ∈ IS .Columns, ∀tab ∈ IS .Tables , generate

W R = x|π count ( ∗ ) ( σ col = x ( Geoquery.tab )) > =0 .

Then, in order to build the WHERE clause set,

∀e L ∈W L , ∀e R ∈W R we

first generate basic expressions expr = e L OP R and combine them by means

of conjunctions and negations (see Section 2.2), keeping only those expressions

expr such that the execution of π count ( ∗ ) ( σ expr ( table )) does not lead to an error

for at least a table in the database.

To understand how it works, let us introduce a new example question q 2 :

“ what are the capitals of states bordering New York? ”. The SDC opt

q 2

is sim-

ilar to SDC opt

q 1 except for the last three relations. Row (6) disappears while

rows (7) and (8) are replaced by amod (york, new) and dobj (border, york), lead-

ing to Σ =

. This set is split into Σ L =

{

border, new, york

}

{

border

}

and

Σ R =

{

new, york

}

W L = border info.border 3 ,border info.state name 2 and

We build

W R = new york 2 , new mexico 1 , new jersey 1 , newark 1 . Finally we gen-

erate the set of possible valid conditions and their weights:

W = {border info.border =' new york 5 ,border info.state name =' new york 4 , ...} .

Anyway, the set Σ R could happen to be empty. For example, when the

WHERE condition requires nesting: in this case e R will be the whole subquery

(e.g. Σ in Table 1). It could be the case that also Σ L is empty. In fact a

query without a WHERE clause is valid (e.g. Σ in Table 1). In any case,

even if there are no selection-based stems,

may not be empty (e.g. Σ in

Table 1). Taking into account all tables and columns we can get more con-

ditions:

{tab.col such that tab ∈ π table name ( IS.Columns )and col ∈

π column name ( IS.Columns )

* R =

}

3.5 Building the FROM Clauses Set

The generation of the FROM clause F is straightforward given S and W .This

set will contain all tables to which clauses in S and W refer, enriched by pairwise

joins.

As stated before, this information can be found running SQL queries over IS ex-

ploiting metadata stored in table KEY COLUMN USAGE (in short, K ;seeFig-

ure 2). This table identifies all columns in the current databases that are restricted

by some unique, primary key, or foreign key constraint. That is, for each usage

of foreign key column in the table, we can determine how many aggregate table

columns match that column usage. First, we extract tables appearing in

and

= F .Then

∀t 1 ,t 2 ∈ Fπ col name,ref col name ( σ table name = t 1 ∧ref table name = t 2 ( IS .K ))retrieves

c 1 ,c 2 to perform the: join t 1 t 2

(i.e. words ending with dot), creating a set F . At the beginning

in enriched whit the two-table

join t 1 join t 2 on t 1 .c 1 = t 2 .c 2 . In addition we can allow for more distant joins

by finding an intermediate table useful to link two tables that are not directly

referencing each other. This can be done performing a complex join between two

instances of Keys with multiple conditions, but due to for lack of space this can

not be illustrated here.

c 1 = c 2 .Inthisway

Trustworthy Eternal Systems via Evolving, Software Data and Knowledge

Search WWH ::

Custom Search

Home