Database Reference
In-Depth Information
Data and combined complexity
When we deal with database-motivated questions, it is important to specify precisely what
is viewed as an input to a problem. Consider, for now, query evaluation. A typical formu-
lation of this problem is as follows:
P ROBLEM : Q UERY EVALUATION
I NPUT : A query Q , a database D , a tuple t .
Q UESTION : D s t belong to Q ( D )?
We refer to the complexity of this problem as the combined complexity of query evaluation,
as both the query Q ,andthedata- D and t - are included in the input.
In the database setting, however, queries are typically very small compared to data.
Hence, it makes sense to measure the complexity in terms of the data: this is what we
often are interested in. This is captured by the following modification of the query evalua-
tion problem:
P ROBLEM : Q UERY EVALUATION ( Q )
I NPUT : A database D , a tuple t .
Q UESTION : D s t belong to Q ( D )?
We refer to the complexity of this problem as the data complexity of query evaluation,
since the input consists only of data.
The two complexities can be dramatically different. For example:
For relational calculus (i.e., FO), the combined complexity is P SPACE -complete, while
the data complexity is in AC 0 .
For conjunctive queries (and their unions), the combined complexity is NP-complete,
while the data complexity is in AC 0 .
In fact, it is very common to have an exponential gap between data and combined com-
plexities. To give an intuitive explanation of where it comes from, one can see that a query
evaluation algorithm runs, very roughly, in time
O ( Q ) ,where
refers to the size of
Q . Thus, for a fixed Q , it is polynomial, but when Q is a part of the input, it takes exponen-
tial time. A careful implementation of this naıve algorithm results in P SPACE complexity.
For conjunctive queries of the form
D
Q
, one needs to guess witnesses for the existential
quantifiers, and then check if they make the formula
x
α
true, which gives an NP algorithm.
When we consider problems related to data exchange, the input generally will be split
into two parts: the data part, involving instances (source and target), and the nondata part,
involving schema mappings and queries. If only data is viewed as the input, we shall be
talking about data complexity; if mappings and queries are part of the input as well, we
shall be talking about combined complexity.
α
Search WWH ::




Custom Search