Table 2. Nominees for three awards

Id     Author              Journals  EDBT  ICDE  VLDB  SIGMOD  SFM
5932   Divesh Srivastava   19        6     37    24    32      7
8846   H. V. Jagadish      27        10    23    35    28      12
19660  Philip S. Yu        62        9     49    21    18      8
20259  Raghu Ramakrishnan  18        2     16    30    30      4
23870  Surajit Chaudhuri   19        3     27    26    39      10

other candidate in terms of these criteria. Thus, tuples in the table Researcher must be selected
according to the values of EDBT, ICDE, VLDB and SIGMOD. Following these criteria, the nominees are
computed and presented in Table 2, which also reports the Skyline frequency metric (SFM) for each
researcher. In total, the DBLP database contains information on at least 1.4 million publications (Ley, 2010).
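
The selection over EDBT, ICDE, VLDB and SIGMOD can be illustrated with a naive dominance test. The following is only a minimal sketch, assuming that every criterion is maximized and that the five rows of Table 2 are the whole input; the names NOMINEES, dominates and skyline are hypothetical and do not come from the original text. On these rows no researcher dominates another, which is why all five remain candidates.

```python
# Rows of Table 2: id -> (EDBT, ICDE, VLDB, SIGMOD) publication counts.
NOMINEES = {
    5932: (6, 37, 24, 32),
    8846: (10, 23, 35, 28),
    19660: (9, 49, 21, 18),
    20259: (2, 16, 30, 30),
    23870: (3, 27, 26, 39),
}


def dominates(a, b):
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def skyline(rows):
    """Return the ids of tuples that no other tuple dominates (naive O(n^2) scan)."""
    return sorted(
        rid
        for rid, vals in rows.items()
        if not any(dominates(other, vals) for oid, other in rows.items() if oid != rid)
    )


print(skyline(NOMINEES))  # [5932, 8846, 19660, 20259, 23870]
```

The quadratic scan is adequate for a handful of rows; it is not meant to reflect how a skyline would be evaluated over the full table.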
Since the research institute can grant only three awards, it has to select the top-3 researchers among
the five nominees. Thus, criteria to discriminate the top-3 researchers among the nominees are needed.
The number of journal publications may be used as a score function; in that case, the three new nominees
are 19660, 8846 and either 5932 or 23870, which are tied with 19 journal publications each.
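
The ranking by journal publications, including the tie for third place, can be reproduced with a few lines. This is a minimal sketch using the Journals column of Table 2; the helper top_k_with_ties is a hypothetical name introduced here, and returning every tied candidate is an assumption about how the tie might be handled.

```python
# Journals column of Table 2: id -> number of journal publications.
JOURNALS = {5932: 19, 8846: 27, 19660: 62, 20259: 18, 23870: 19}


def top_k_with_ties(scores, k):
    """Return ids whose score reaches the k-th highest score, ties included."""
    threshold = sorted(scores.values(), reverse=True)[k - 1]
    winners = [rid for rid, s in scores.items() if s >= threshold]
    # Order by descending score, breaking ties by id for a stable result.
    return sorted(winners, key=lambda rid: (-scores[rid], rid))


print(top_k_with_ties(JOURNALS, 3))  # [19660, 8846, 5932, 23870]
```

Four ids come back for k = 3 because 5932 and 23870 share the third-highest score, so a single score function cannot separate them.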
On the other hand, several metrics have been proposed in the literature to distinguish the top-k
elements in a set of incomparable researchers. For example, consider the skyline frequency metric (SFM),
which measures the number of times a researcher belongs to a skyline set when different subsets of the
conditions in the multidimensional criteria are considered. To compute the SFM, the algorithms presented
in (Yuan et al., 2005) may be applied. Both algorithms build non-empty subsets of the multidimensional
criteria, as shown in Table 3. However, the Skyline may be huge, and it will be completely built by these
algorithms (Goncalves and Vidal, 2009). Therefore, to calculate the skyline frequency values, a large
number of unnecessary points may be computed in all subsets of the multidimensional criteria.
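
To make the definition concrete, the SFM can be computed by brute force: enumerate every non-empty subset of the criteria and count the subsets in which a researcher is not dominated. The sketch below is an illustration only, not the algorithms of (Yuan et al., 2005); it assumes all criteria are maximized and uses only the five rows of Table 2, with hypothetical names NOMINEES, dominates and skyline_frequency. Because it ignores the rest of the table Researcher, its counts need not coincide with the SFM column of Table 2: on these five rows it counts 8 subsets for 5932, where Table 2 reports 7.

```python
from itertools import combinations

# Rows of Table 2: id -> (EDBT, ICDE, VLDB, SIGMOD) publication counts.
NOMINEES = {
    5932: (6, 37, 24, 32),
    8846: (10, 23, 35, 28),
    19660: (9, 49, 21, 18),
    20259: (2, 16, 30, 30),
    23870: (3, 27, 26, 39),
}


def dominates(a, b):
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def skyline_frequency(rows, n_dims):
    """Count, per id, the non-empty criteria subsets whose skyline contains it."""
    freq = {rid: 0 for rid in rows}
    for size in range(1, n_dims + 1):
        for subspace in combinations(range(n_dims), size):
            # Project every tuple onto the current subset of criteria.
            proj = {rid: tuple(v[d] for d in subspace) for rid, v in rows.items()}
            for rid, p in proj.items():
                if not any(dominates(q, p) for oid, q in proj.items() if oid != rid):
                    freq[rid] += 1
    return freq


print(skyline_frequency(NOMINEES, 4))
# {5932: 8, 8846: 12, 19660: 8, 20259: 4, 23870: 10}
```

With 4 criteria there are 15 non-empty subsets, and a full skyline is built for each of them, which is the cost the text warns about when the Skyline is large.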
Based on the values of the SFM, researchers 8846, 23870 and 19660 are the winners of the research
institute's awards. Intuitively, to select the awarded researchers, queries based on user preferences
are posed against the table Researcher. Skyline (Börzsönyi et al., 2001) and Top-k
(Carey and Kossmann, 1997) are two user preference languages that could be used to identify some of
the awarded researchers. However, neither of them will provide the complete set, and post-processing will
be needed to identify the top-3 researchers (Goncalves and Vidal, 2009). To overcome the limitations of
existing approaches, we propose a query evaluation algorithm that minimizes the number of unnecessary
probes, i.e., this algorithm is able to identify the top-k objects in the Skyline, for which there are
not k better Skyline objects in terms of the SFM.
Preliminaries
Given a set DO = {o_1, …, o_n} of database objects, where each object o_i is characterized by p attributes
(A_1, …, A_p); r different score functions s_1, …, s_q, …, s_r defined over some of the p attributes, where
each s_i : DO → [0, 1], 1 ≤ i ≤ r; a score function f defined on some scores s_i, which induces a total order
of the objects in DO; and a multicriteria function m defined over a subspace S of the score functions s_1,
…, s_q, which induces a partial order of the objects in DO. For simplicity, we suppose that scores related
to the multicriteria function need to be maximized, and the score functions s_1, …, s_q, …, s_r respect a
 