Database Reference
In-Depth Information
Histogram of Multimerge Access Time
80
60
40
20
0
0
0.05
0.1
0.15
0.2
0.25
Average Time (msec)
Histogram of Forward Index Access Time
60
45
30
15
0
0
2
4
6
8
Average Time (msec)
FIGURE 10.24 :Like t scan , t forward is concentrated and can be reasonably
replaced by a point estimate.
assumption that the selectors occur independently of the candidates, we see
k = k corpusCount ( g )
corpusCount ( a )
(10.9)
as a natural and simple choice, using which we can write the query bloat
factor as
corpusCount ( g )
corpusCount ( a ) 2 .
We call this queryBloat ( a, g ), the bloat because a had to be generalized to a
given g .Foragiven R ,wecannowwrite
queryBloat ( a, R )= 1 ,
corpusCount ( g )
corpusCount ( a ) + k t forward
t scan
if a
R
(10.10)
R,a IsA g queryBloat ( a, g ) ,
min
otherwise
g
Note that at query execution time the choice of g from a given R is simple,
but choosing a good R ahead of time is non-trivial.
Figure 10.25 shows a study of estimated bloat compared to observed bloat.
The fit is not nearly as good as with the other half of our model in Figure 10.22 ,
 
 
Search WWH ::




Custom Search