Probabilistic Ranking Queries on Uncertain Data - Ranking Queries on Uncertain Data

Database Reference

In-Depth Information

5.2.2 Scan Reduction by Prefix Sharing

We scan the dominant set S t i of each tuple t i ∈

T once and computes the subset

probabilities Pr

. Can we reduce the number of scans of the sorted tuples to

improve the efficiency of query evaluation?

To compute Pr k

(

S t i ,

)

, the order of tuples in S t i

does not matter. This gives us the flexibility to order tuples in compressed dominant

sets of different tuples so that the prefixes and the corresponding subset probability

values can be shared as much as possible. In this section, we introduce two reorder-

ing methods to achieve good sharing.

(

t i )

using subset probability Pr

(

S t i ,

)

5.2.2.1 Aggressive Reordering

Consider the list L

t 1 ···

t n of all tuples in T and a tuple t i in L . Two observations

help the reordering.

First, for a tuple t that is independent or is a rule-tuple of a completed rule with

respect to t i (Case 2 in Section 5.1.3), t is in T

for any tuple t f t i . Thus, t

should be ordered before any rule-tuple of a rule open with respect to t i (Case 3 in

Section 5.1.3).

Second, there can be multiple rules open with respect to t i . Each such a rule R j

has a rule-tuple t R j left , which will be combined with the next tuple t ∈

t )

(

R j to update

the rule-tuple. Thus, if t is close to t i , t R j left should be ordered close to the rear so

that the rule-tuple compression affects the shared prefix as little as possible. In other

words, those rule-tuples of rules open with respect to t i should be ordered in their

next tuple indices descending order.

An aggressive reordering method to reorder the tuples is to always put all inde-

pendent tuples and rule-tuples of completed rules before rule-tuples of open rules,

and order rule-tuples of open rules according to their next tuples in the rules.

We scan all tuples in T in the ranking order. Two buffer lists, L complete and L open ,

are used to help aggressive reordering. L complete contains all independent tuples or

completed rule-tuples, while L open contains all open rule-tuples during the scan.

Both L complete and L open are initialized to 0 before the scan.

When scanning tuple t i , we compute the compressed dominant set of t i , and up-

date L complete and L open according the following two cases.

Case 1: If t i is an independent tuple, then the compressed dominant set of t i con-

tains all tuples in L complete and L open . Moreover, we put t i into L complete , meaning

that t i will appear in the compressed dominant set of all tuples ranked lower than

t i .

Case 2:

t r m , then the

compressed dominant set of t i contains all tuples in L complete and L open , except

for the rule-tuple t R left

If t i is involved in a multi-tuple generation rule R : t r 1 ,...,

in L open , where t R left

is the rule-tuple compressed from all

tuples in R that are ranked higher than t i .

Ranking Queries on Uncertain Data

Search WWH ::

Custom Search

Home