Consistent Answers to Aggregate Queries - Repairing and Querying Databases under Aggregate Constraints

Database Reference

In-Depth Information

our approach, we extend some of the complexity results provided in Chapter 3 for

atomic queries to aggregate ones. Specifically, the following theorem characterizes

the complexity of the problem of computing range-consistent answers of aggregate

queries in the presence of general and steady aggregate constraints.

Theorem 5.1. Let

be a fixed database scheme,

a fixed set of aggregate con-

straints on

, q a fixed aggregate query on

, D an instance of

, and

[ ,

]

interval. Then:

i) deciding whether CQA q

D,AC (

) =

0 is NP-complete;

ii) deciding whether CQA q

2 [

D,AC (

) ⊆ [ ,

]

log n

]

-complete;

iii) the above results still hold in the case that

is steady.

Hence, the range-CQA problem is hard also when the aggregate constraints are

steady. In fact, the loss in expressiveness yielded by the steadiness restriction not

only has no dramatic impact on the practical usefulness of aggregate constraints (as

explained in the Chapter 2), but also on the computational complexity of the range-

CQA problem. We have shown in Chapter 4 how the steadiness restriction can be

exploited to devise a technique for computing card -minimal repairs. In this section,

we show how this restriction can also be exploited to come up with a technique for

computing range-CQAs.

In the following we will use the notations introduced in Sections 4.2 and 4.3.

In order to define our technique for computing range-CQAs, we exploit the trans-

lation introduced in Section 4.2 which allows us to map a database scheme

, a set

of steady aggregate constraints

, and an instance D of

into a set of linear

inequalities

S ( D,AC,

)

such that every (possibly not-minimal) repair for D w.r.t.

. Specifically, as in this chapter we

assume that measure attributes are bounded by M , it follows that the one-to-one cor-

respondence between repairs for D w .r.t.

corresponds to a solution of

S ( D,AC,

)

and solutions of

S ( D,AC,

)

holds

only for those solutions which are M -bounded.

Corollary 4.1 in Section 4.3 states that the above-cited translation allows us to

encode the problem of computing the cardinality of card -m in imal repairs as an

MILP instance. However, as in this chapter we are assuming M -bounded measure

attributes, the value of constant M in Definition 4.1, which bounds both va riables z i

and w i , do not depends on

and D . It is easy to see that, assuming M -bound ed

measure attributes entails that in Definiti on 4.1 variables z i are bounded by M 1

max i ∈I |

and variables w i are bounded by M 2

v i

. Thus, in this chapter

we consider the MILP problem MILP

where the constant M in the fifth

and sixth inequalities of Definition 4.1 is replaced by M 1 , and the constant M in

the third and fourth inequalities of Definition 4.1 is replaced by M 2 . Clearly, these

changes need to be propagated also to the optimization problem OPT

( D,AC,

)

of Definition 4.5 which is exploited in Corollary 4.1 to calculate the cardinality of

card -minimal repairs.

We are now ready show how range-CQAs can be computed by solving MILP in-

stances. The following corollary, which straightforward follows from Theorem 4.1,

addresses the the case that the range-CQA is the empty interval (i.e., there is no

repair for the given database w.r.t. the given set of aggregate constraints).

( D,AC,

Repairing and Querying Databases under Aggregate Constraints

Search WWH ::

Custom Search

Home