Database Reference
In-Depth Information
All the above-cited works assume that the basic primitives for repairing inconsis-
tent data are tuple insertions and deletions. Repairs consisting of also updating at-
tribute values were considered in [28, 12, 13, 48, 49, 50]. In particular, [48] was the
first investigating the complexity of the consistent query answer problem in a setting
where the basic primitive for repairing data is the attribute-value update. Although
the basic primitive used to find a repair is that of updating attribute values, repairs
consisting of also tuple insertions and deletions can be obtained in [48, 49, 50].
In [50] it was shown that for full dependencies and conjunctive queries, repairs for
a database can be summarized into a single tableau called nucleus for the database.
As a consequence, the consistent answers of conjunctive queries can be evaluated
by looking at the nucleus only.
[28, 12, 13] mainly focus on the problem of computing repaired databases rather
than evaluating consistent query answers. This problem is relevant in several con-
texts, where users are interested in a consistent version of the data on which an
analysis task can be accomplished (a context where users need to perform such
tasks is introduced in Section 1.2). In [28], the computation of repairs was studied
for categorical data in the presence of constraints expressed as first order formu-
las. Specifically, the DLV system [41] was exploited for computing repairs w.r.t.
first-order constraints expressing edit rules of questionnaires collecting census data.
In [13], repairs on categorical data in the presence of functional and inclusion de-
pendencies were studied, and a cost model for computing database repairs as set
of value modifications was introduced. The authors observed a strong connection
between the database repairing area and the record linkage field (also known as du-
plicate removal or merge-purge ), which refers to the task of linking pairs of records
that refer to the same entity in different data sets [20, 54]. This connection between
searching for a repair for an inconsistent database and the record linkage task also
motivates the automatic computation of repairs. The problem of checking whether
a repair is reasonable according to a minimality criterion (namely, repair check-
ing problem ) was addressed in [2], where several repair minimality semantics were
investigated.
In [12] the problem of repairing databases by fixing numerical data at attribute
level was addressed. The authors introduced the notion of Least-Square repair mini-
mizing the square Euclidean distance between the original and repaired database in-
stance, and shown that deciding the existence of Least-Square repairs under both de-
nial constraints (where built-in comparison predicates are allowed) and a non-linear
form of multi-attribute aggregate constraints is undecidable. Then they disregarded
aggregate constraints and focused on the problem of repairing data violating de-
nial constraints, where no form of aggregation is allowed in the adopted constraints.
After providing some intractability results, it was shown that for local denial con-
straints, Least-Square repairs can be computed by solving particular instances of
the Minimum Weighted Set Cover Optimization Problem that admit approximation
within a constant factor.
Aggregate constraints on numerical data was first investigated in [46], where the
consistency problem of very general forms of aggregation was considered, but no
issue related to data-repairing was investigated. The form of aggregate constraints
Search WWH ::




Custom Search