Database Reference
In-Depth Information
U c
L c
t 2
t 1
t 3
t 4
t 5
p
0
1
Example
Empty critical region
L c
U c
L c
U c
L c
U c
Case 1
Case 2
Case 3
Figure 6.1: Illustration of the multi-simulation technique. The goal is to retrieve the top k
=
2 highest
probabilities. Intervals represent uncertainty about the value of a tuple's probability score.
We assume that during any simulation step, no lower bound L i decreases, and no upper bound
U i increases. It is easy to check that this holds for Algorithm 3 ; for the Monte Carlo approximation
algorithms in Subsection 5.3.2 , while it fails if we define a simulation step to consis ts of one single
Monte Carlo trial, the assumption still holds with high probability is we perform N trials during
any simulation step [ Ré et al. , 2007 ]. This assumption implies immediately that L c never decreases,
and U c never increases. Thus, the critical region (L c ,U c ) will only shrink after a simulation step, or
stay the same.
Next, we define four sets of tuples as follows:
B k ={ i | U i L c
T k ={ i | U c
}
L i }
B k ={ i | U i U c
T k ={ i | L c
}
L i }
If the critical region is non-empty ( L c <U c ), then T k T k and B k B k ; if the critical region
is empty, then the opposite inclusions hold.
It is always the case that
| T k |≥ k because, by the definition of L c
=
max k (L 1 ,...,L n ) ,
there exists k lower bounds L i such that L i L c . By duality, it follows that
| B k |≥ n k since
U c
=
max k + 1 (U 1 ,...,U n ) =
min n k (U 1 ,...,U n ) .
 
Search WWH ::




Custom Search