Database Reference
In-Depth Information
4.1.2.4 Rule 4: Negation
If the query is
¬
Q
then:
Negation
P(
¬
Q)
=
1
−
P (Q)
(4.6)
4.1.2.5 Rule 5: Inclusion-Exclusion Formula
Suppose a query can be written as
Q
1
∧
Q
2
∧
...
∧
Q
k
. Then its probability is:
1
)
|
s
|
P(
i
∈[
s
]
P(Q
1
∧
Q
2
∧
...
∧
Q
k
)
=−
(
−
inclusion-exclusion
Q
i
)
(4.7)
s
⊆[
k
]
,s
=∅
This is the dual of the popular inclusion-exclusion formula since it expresses the probability
of an intersection of events in terms of the probability of unions of events. For example, when
k
=
3,
then the two inclusion-exclusion formulas are:
P(Q
1
∧
Q
2
∧
Q
3
)
=
P(Q
3
)
−
P(Q
1
∨
Q
2
)
−
P(Q
1
∨
Q
3
)
−
P(Q
2
∨
Q
3
)
+
P(Q
1
)
+
P(Q
2
)
+
Q
3
)
P(Q
1
∨
Q
2
∨
Q
3
)
=
P(Q
1
)
+
P(Q
2
)
+
P(Q
3
)
−
P(Q
1
∧
Q
2
)
−
P(Q
1
∧
Q
3
)
−
P(Q
2
∧
Q
3
)
+
P(Q
1
∧
Q
2
∧
Q
3
)
P(Q
1
∨
Q
2
∨
We use only the former because it brings the query expression in a form in which we can
apply
Eq. (4.3)
, then find a separator variable.
4.1.2.6 Rule 6: Attribute Ranking
This rule allows us to partition the tuples of a relation instance
R
, according to a simple predicate.
We consider two such predicates, leading to two flavors of the attribute ranking rule, which we
call
Attribute-Constant Ranking
and
Attribute-Attribute Ranking
. The term “ranking” refers to the
fact that we impose a partial order on the attributes; it should not be confused with tuple-ranking
discussed in
Section 6.1
.
Attribute-attribute ranking is the following. Let
A, B
be two attributes of
R
. Partition
R
into three relations,
R
1
=
σ
A<B
(R), R
2
=
σ
A
=
B
(R), R
3
=
σ
A>B
(R)
, and substitute in the query
Q
all atoms referring to
R
with
R
1
∨
R
3
. We denote
Q
r
the resulting query and say that
Q
r
R
2
∨
is obtained from
Q
by “ranking”
R
on the attributes
A, B
.
Attribute-constant ranking is the following. Let
A
be an attribute of
R
and
a
be a constant
1
such that there exists two unifiable atoms using the relation symbol
R
, such that the first has the
constant
a
in position
A
and the other has a variable in position
A
. Then we partition
R
into two
1
If
Q
is a non-Boolean query, then we also do ranking w.r.t. to head variables
a
. In other words, for the purpose of ranking, head
variables are treated as constants.