ison operator in {=, ≠, ≤, ≥, <, >}, we say that a tuple t over R satisfies β
(denoted as t ⊨ β) if replacing the occurrences of each attribute A in β with
t[A] makes β true.
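As a minimal sketch of this definition, the following Python snippet evaluates a single comparison atom on a tuple by substituting t[A] for each attribute A. The dict-based tuple representation, the names `OPS`, `value_of` and `satisfies`, the restriction to one atom of the form (left, operator, right), and the sample values are all assumptions made for illustration, not part of the original text.

```python
import operator

# The six comparison operators listed in the definition.
OPS = {
    "=": operator.eq, "≠": operator.ne,
    "≤": operator.le, "≥": operator.ge,
    "<": operator.lt, ">": operator.gt,
}

def value_of(term, t):
    """Replace an attribute name A by t[A]; any other term is a constant."""
    return t[term] if isinstance(term, str) and term in t else term

def satisfies(t, atom):
    """Return True if tuple t satisfies the atom (left, op, right)."""
    left, op, right = atom
    return OPS[op](value_of(left, t), value_of(right, t))

# Hypothetical tuple; the values are made up.
t = {"Year": 2008, "Subsection": "cash sales", "Value": 100}
print(satisfies(t, ("Value", "≥", 0)))    # True
print(satisfies(t, ("Year", "=", 2009)))  # False
```

One limitation of this sketch is that a string constant that happens to coincide with an attribute name would be read as an attribute; a fuller encoding would tag attributes and constants explicitly.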
Domains ℚ and ℤ will be said to be numerical domains, and attributes defined
over ℚ or ℤ will be said to be numerical attributes. Given a relation scheme R,
we will denote the set of its numerical attributes representing measure data as
M_R (namely, measure attributes). That is, M_R specifies the set of attributes
representing measure values, such as weights, lengths, prices, etc. For instance,
in the “Balance Sheet” example, M_R consists only of attribute Value. Given a
database scheme D, we will denote as M_D the union of the sets of measure
attributes associated with the relation schemes in D.
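The notions of numerical attribute, M_R and M_D can be illustrated with a small Python sketch. The encoding of a scheme as a dict, the use of `int` and `Fraction` as stand-ins for ℤ and ℚ, the domains assigned to Year and Subsection, and the second relation `Inventory` are assumptions introduced for illustration; only the fact that Value is the sole measure attribute of the balance sheet scheme comes from the text.

```python
from fractions import Fraction

# Stand-ins for the numerical domains Z and Q (an assumption of this sketch).
NUMERICAL_DOMAINS = (int, Fraction)

# Hypothetical scheme encodings: attribute name -> domain, plus the
# attributes designated as measures.
BALANCE_SHEETS = {
    "attributes": {"Year": int, "Subsection": str, "Value": int},
    "measure": {"Value"},
}
INVENTORY = {  # made-up second relation, used only to show the union
    "attributes": {"Item": str, "Quantity": Fraction},
    "measure": {"Quantity"},
}

def numerical_attributes(scheme):
    """Attributes whose domain is one of the numerical domains."""
    return {a for a, dom in scheme["attributes"].items()
            if dom in NUMERICAL_DOMAINS}

def measure_attributes(scheme):
    """M_R: the numerical attributes that represent measure data."""
    m = set(scheme["measure"])
    if not m <= numerical_attributes(scheme):
        raise ValueError("measure attributes must be numerical")
    return m

def measure_attributes_db(db):
    """M_D: union of M_R over the relation schemes R in D."""
    return set().union(*(measure_attributes(r) for r in db))

print(numerical_attributes(BALANCE_SHEETS))
print(measure_attributes(BALANCE_SHEETS))
print(measure_attributes_db([BALANCE_SHEETS, INVENTORY]))
```

In this encoding Year is numerical but is not a measure attribute, which reflects that M_R is the designated subset of numerical attributes carrying measure data rather than all of them.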
On each relation scheme R, a key constraint is assumed. Specifically, we denote
as K_R the subset of A_R consisting of the names of the attributes which are a
key for R. For instance, in the “Balance Sheet” example, K_R = {Year, Subsection}.
We also denote the key of a relation scheme by underlining its key attributes.
Throughout this topic, we assume that K_R ∩ M_R = ∅, i.e., measure attributes of
a relation scheme R are not used to identify tuples belonging to instances of R.
Although this assumption leads to a loss of generality, it is acceptable from a
practical point of view, since the situations excluded by this assumption are
unlikely to occur often in real-life scenarios. Clearly, this assumption holds in
the scenario considered in the “Balance Sheet” example.
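The two conditions discussed here, disjointness of key and measure attributes and satisfaction of the key constraint by an instance, can be checked mechanically. The following is a sketch under stated assumptions: tuples are encoded as dicts, an instance is a list of distinct tuples, and the function names and sample values are hypothetical.

```python
def key_disjoint_from_measures(key, measure):
    """Check the assumption that no measure attribute belongs to the key."""
    return set(key).isdisjoint(measure)

def satisfies_key(instance, key):
    """True if no two tuples of the instance agree on all key attributes.

    The instance is assumed to contain distinct tuples (set semantics).
    """
    seen = set()
    for t in instance:
        k = tuple(t[a] for a in sorted(key))
        if k in seen:
            return False
        seen.add(k)
    return True

# Key and measure attributes of the balance sheet example.
K_R = {"Year", "Subsection"}
M_R = {"Value"}

# Made-up instance for illustration.
instance = [
    {"Year": 2008, "Subsection": "cash sales", "Value": 100},
    {"Year": 2008, "Subsection": "receivables", "Value": 120},
    {"Year": 2009, "Subsection": "cash sales", "Value": 110},
]

print(key_disjoint_from_measures(K_R, M_R))
print(satisfies_key(instance, K_R))
```

Because the key never mentions a measure attribute, changing a Value to repair an inconsistency cannot introduce or remove a key violation, which is what makes the assumption convenient.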
We distinguish between measure and non-measure attributes because, in our
framework, we rely on the assumption that inconsistencies involve measure
attributes only, whereas non-measure attributes are assumed to be consistent.
Therefore, key constraints are also assumed to be satisfied. The rationale of this
assumption is that, in many real-life situations, even if integrity violations of
measure data can coexist with integrity violations involving non-measure data,
these inconsistencies can be fixed separately. For instance, in the balance sheet
scenario of our running example, errors in the OCR-mediated acquisition of
non-measure attributes (such as mismatches between real and acquired strings
denoting item descriptions) can be repaired in a pre-processing step using a
dictionary, by searching for the strings in the dictionary which are the most
similar to the acquired ones. In fact, in [21], a system prototype adopting such a
dictionary-based repairing strategy for string attributes is described. The study
of the problem of repairing the data when these different forms of inconsistencies
cannot be fixed separately is left as an open issue, and it will be discussed
together with other possible extensions in Chapter 6.
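The dictionary-based pre-processing step mentioned above can be sketched in Python with the standard library's `difflib`. This is not the strategy of the prototype in [21], whose similarity measure is not given here; the similarity used below is `difflib`'s own ratio, and the dictionary entries, the cutoff of 0.6, and the function name `repair_string` are assumptions made for illustration.

```python
import difflib

# Hypothetical dictionary of valid item descriptions.
DICTIONARY = ["cash sales", "receivables", "total cash receipts"]

def repair_string(acquired, dictionary, cutoff=0.6):
    """Return the dictionary entry most similar to the acquired string.

    If no entry reaches the similarity cutoff, the acquired string is
    returned unchanged, so that no arbitrary replacement is made.
    """
    matches = difflib.get_close_matches(acquired, dictionary, n=1, cutoff=cutoff)
    return matches[0] if matches else acquired

# OCR-style errors: the letter "l" acquired as the digit "1".
print(repair_string("cash sa1es", DICTIONARY))   # cash sales
print(repair_string("receivab1es", DICTIONARY))  # receivables
```

Since this step touches only string-valued, non-measure attributes, it can run before any repair of measure values, in line with the separation assumed in the text.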
2.2 Domain Constraints and Aggregate Constraints
Several forms of constraints can be defined over a database scheme restricting the
set of its valid instances. In this topic we deal with two forms of constraints: domain
constraints and aggregate constraints. The former impose that, if an attribute is
associated with a domain Δ in the definition of a relation scheme, then it must take