Database Reference
In-Depth Information
S
SSN
FID
P
T
FID
SSN
N
P
185
351
0 . 5
351
185
Smith
0 . 2
185
352
0 . 5
351
785
Smith
0 . 8
785
351
1
352
185
Brown
0 . 3
186
352
1
352
186
Brown
0 . 7
Figure 2.8: Normalized BID representation of the table R in Figure 2.4 (a). The representation consists
of two BID tables and a view definition, reconstructing R
from the two BID tables. The table R is
recovered by natural join, R =
S
T .
Of course, not every probabilistic table is a BID table; for example, none of the pc-tables in
Figure 2.4 (extended with non-trivial probabilities) is a BID table. However, BID tables are complete
when coupled with views expressed as conjunctive queries.
BID tables extended with CQ views are a complete representation system.
Proposition 2.18
Notice that we only need a view given by a conjunctive query in order to reconstruct the
probabilistic relation from its decomposition into BID tables. In fact, as we show in the proof, this
query is very simple, it just joins two tables. This is close in spirit to traditional schema normalization,
where every table is recovered from its decomposed tables using a natural join. By contrast, in
Proposition 2.16 , we needed a query whose size depends on the number of possible worlds.
={ W 1 ,...,W n
Proof. Let D
= ( W ,P) be a probabilistic database with n possible worlds W
}
.
Let p 1 ,...,p n be their probabilities; thus, p 1 + ... + p n =
1. Recall that we have assumed that
the schema consists of a single relation name, R( A) ; the proof extends straightforwardly to multiple
relations. Define the following BID database D =
,P) . The first relation is deterministic,
and has schema S(K, A) ; the second relation is a BID table and has schema W(K) : note that the
key consists of the empty set of attributes, meaning that all tuples in W are disjoint, i.e., a possible
world for W contains at most one tuple. Let k 1 ,...,k n
(
S,W
be n distinct constants. Define the content
of W as W
={
k 1 ,...,k n }
, and set the probabilities to P(k i )
=
p i , for i
=
1 ,n (since these tuples
are disjoint, we must ensure that i P(k i ) =
1; indeed, this follows from i p i =
1). Define the
content of S as S ={ k 1 R 1
... ∪{ k n R n . (Recall that R i
is the relation R in world W i .) It
is easy to check that the conjunctive-query view
R(x 1 ,...,x m ) :- S(k,x 1 ,...,x m ), W (k)
defined on database D is precisely D .
While the construction in the proof remains impractical, in general, because it needs a BID
table as large as the number of possible worlds, in many concrete applications, we can find quite
natural and efficient representations of the probabilistic database as views over a BID tables. We
 
Search WWH ::




Custom Search