Database Reference
In-Depth Information
Figure 4.10(d) shows the scalability of the four algorithms with respect to
database size. RT has a linear scalability. LT3 clearly has the better performance
and scalability than DLTA on large data sets.
The trends for discriminative typicality queries and representative typicality
queries are similar, as shown in Figure 4.11 and Figure 4.12, respectively.
In summary, as RT has linear complexity, when runtime is the only concern,
RT should be used. While DLTA and LT3 are much more scalable than the exact
algorithm and are more accurate than RT, they are good when both accuracy and
efficiency matter. LT3 has the better efficiency and scalability than DLTA, while
achieving comparable accuracy to DLTA.
4.6 Summary
In this chapter, we discussed how to compute the answers to top- k typicality queries
defined in Section 2.2.1.
We applied kernel density estimation to compute the likelihood of instances in an
uncertain object and presented an exact query answering algorithm. We showed
that the complexity of answering top- k typicality queries is quadratic.
We developed a linear-time randomized algorithm which adopts a tournament
mechanism and computes the approximation answers to top- k typicality queries.
The randomized algorithm does not provide a quality guarantee in the approxi-
mated answers, therefore, we further explored the locality nature of kernel den-
sity estimation and proposed two approximation algorithms that can provide
good quality guarantees.
By a systematic empirical evaluation using both real data sets and synthetic data
sets, we illustrated the effectiveness of top- k typicality queries, and verified the ac-
curacy and the efficiency of our methods.
Search WWH ::




Custom Search