Database Reference
In-Depth Information
Table 5. The TKSI Algorithm Execution
Id
Publication/GPA/Experience
Index
8846
12
I 5
8846
10
I 1
19660
9
I 1
Property 1. Given a set of sorted indices I 1 , …, I q on each attribute of multidimensional criteria m ; an
index I q+1 defined on values of SFM sf ; the Skyline SKY S ; an integer k such as 0 < k ≤ |SKY S |; and object
o indexed by I q+1 . Then, o is an Top-k Skyline object if not exists an object above o in all indices I 1 , …,
I q and not exists Top k - 1 Skyline objects with higher SFM value than it.
TKSI focuses on performing sorted accesses on the SFM index I q+1 firstly and then, verifying if
each accessed object is a Top-k Skyline using the indices on multidimensional criteria I i , …, I q . Basi-
cally, TKSI receives a set of indices on each attribute of multidimensional criteria m and the Skyline
Frequency metric sf , and an integer k ; and it builds the Top-k Skyline using the indices (Table 4). Since
the indices on multidimensional criteria are sorted, TKSI has not to scan the entire index and builds the
whole skyline while k is smaller than the Skyline size.
Following with our example, TKSI accesses the objects from I 5 sequentially until the top-1 object is
produced. For each accessed object o from I 5 , TKSI verifies that o is a Top-k Skyline object. Because
objects are sorted, it is very likely that any object with the higher values in each index of function m
dominates the next objects in the indices. For this reason, TKSI must select one of the indices I 1 , I 2 , I 3 ,
or I 4 in order to minimize the necessary probes over the multicriteria function m . The objects could be
accessed in a round robin fashion. However, in order to speed up the computation, TKSI determines
what is the index whose distance with respect to o is the lowest, i.e., the index that will avoid the access
of more non-necessary objects. To do this, TKSI computes the distance D 1 as the difference between
the last seen value from I 1 and the value for EDBT of o (min 1 - s 1 (o)), D 2 as the difference between the
last seen value from I 2 and the value for ICDE of o (min 2 - s 2 (o)), D 3 as the difference between the last
seen value from I 3 and the value for VLDB of o (min 3 - s 3 (o)), and D 4 as the difference between the
last seen value from I 4 and the value for SIGMOD of o (min 4 - s 4 (o)). Next, TKSI selects the minimum
value between D 1 , D 2 , D 3 , and D 4 .
Initially, TKSI accesses the first object 8846 from I 5 , and their values for EDBT, ICDE, VLDB and
SIGMOD randomly. Because of the objects from I 1 , I 2 , I 3 , and I 4 have not been seen yet; TKSI assumes
the last seen value is the maximum value possible for the attribute. Therefore, the best distance between
D 1 = 10 - 10, D 2 = 49 - 23, D 3 = 35 - 35, and D 4 = 39 - 28 is calculated. In this case, I 1 and I 3 have the
minimum distance. Note that 8846 is placed in the indices I 1 and I 3 in a lower position than the same
object in I 2 and I 4 . The objects of I 1 are accessed until the object 19660 with a value lower in EDBT is
found. All these objects are compared against 8846 to verify if some of them dominate it. Since, none
of the objects dominates 8846 , the object 8846 is a Top-k Skyline object. If some object indexed by I 1
dominates 8846 , a new object from I 5 is accessed. However, the algorithm decides to stop here because
the objects behind 19660 have worse values in Publication than 8846 , and they may not dominate 8846 .
The detailed TKSI execution is showed in Table 5.
Search WWH ::




Custom Search