Database Reference
In-Depth Information
R
FID
SSN
N
P
351
185
Smith
0.2
351
785
Smith
0.8
352
185
Brown
0.3
352
186
Brown
0.7
Figure 2.7: A BID table. There are two blocks (defined by the FID attribute): tuples in each block are
disjoint, while tuples across blocks are independent. Note that in every possible world the attribute FID
is a key (see Figure 2.3 ); this justifies underlining it.
not have negation. Next, one can check that, for any monotone query Q ,if W is an incomplete
database with a maximal element, then Q( W ) is also an incomplete database with a maximal
element. Indeed, apply Q to the maximal world in W : the result is a maximal world in Q( W ) ,
by the monotonicity of Q . Therefore, tuple-independent tables extended with UCQ views can only
represent probabilistic databases that have a maximal possible world. Since there exists probabilistic
databases without a maximal element (for example, consider the c-table in Figure 2.4 (b) and extend
it with an arbitrary probability distribution on the assignments of X, Y ), the claim of the proposition
follows.
2.7.2 BID DATABASES
A block-independent-disjoint database , or BID database, is a probabilistic database where the set of
possible tuples can be partitioned into blocks, such every block is included in a single relation 5 , and
the following property holds: all tuples in a block are disjoint probabilistic events, and all tuples from
different blocks are independent probabilistic events. If the database consists of a single table, then
we call it a BID-table. For a simple example, every tuple-independent table is, in particular, a BID
table where each tuple is a block by itself.
Every BID table can be represented by a simple pc-table where all possible tuples in the same
block, t 1 ,t 2 ,... are annotated with atomic events X = 1, X = 2, ... , where X is a unique variable
used only for that block; this is shown, for example, in Figure 2.2 . Furthermore, set the probabilities
as p i = P(X = i) = P(t i W) .
In practice, we use a simpler representation of a BID table R , as follows. We choose a set of
attributes A 1 ,A 2 ,... of R that uniquely identify the block to which the tuple belongs: these will
be called key attributes because they form, indeed, a key in every possible world. Then, we add a
probability attribute P . Thus, the schema of a BID table is R( A 1 ,...,A k ,B 1 ,...,B m ,P) . For an
illustration, consider representing the BID table for Figure 2.2 : this is shown in Figure 2.7 . The
attribute FID uniquely identifies a block (this is why it is underlined); tuples within a block are
disjoint, and their probabilities add up to 1 . 0, while tuples across blocks are independent. To help
visualize the blocks, we separate them by horizontal lines.
5 That is, a block cannot contain two tuples from two distinct relations R and S .
 
Search WWH ::




Custom Search