Evaluating Top-k Skyline Queries Efficiently - Advanced Database Query Systems

Chapter 4

Evaluating Top-k Skyline Queries Efficiently

Marlene Goncalves

Universidad Simón Bolívar, Venezuela

María Esther Vidal

Universidad Simón Bolívar, Venezuela

ABSTRACT

Criteria that induce a Skyline naturally represent users' preference conditions, which are useful for discarding irrelevant data in large datasets. However, in high-dimensional Skyline spaces, the Skyline itself can still be very large. The Top-k Skyline approach has been proposed to identify the best k points among the Skyline. This chapter describes existing solutions and proposes the TKSI algorithm for the Top-k Skyline problem. TKSI reduces the search space by computing only the subset of the Skyline that is required to produce the top-k objects. In addition, the Skyline Frequency Metric is implemented to discriminate, among the Skyline objects, those that best meet the multidimensional criteria. The authors have empirically studied the quality of TKSI, and their experimental results show that TKSI may be able to speed up the computation of the Top-k Skyline by at least 50% with regard to state-of-the-art solutions.
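To make the notions in the abstract concrete, the following is a minimal brute-force sketch of a Top-k Skyline ranked by skyline frequency. It is not TKSI: it computes the full Skyline in every subspace, which is exactly the work TKSI aims to avoid. Several things here are assumptions rather than details taken from the chapter: that larger values are preferred on every dimension, that points are distinct tuples, that skyline frequency means the number of non-empty subsets of dimensions in which a point belongs to the Skyline, and all function names (`dominates`, `skyline`, `skyline_frequency`, `top_k_skyline`).

```python
from itertools import combinations


def dominates(a, b, dims):
    """Return True if a is at least as good as b on every dimension in dims
    and strictly better on at least one (assumption: larger is better)."""
    return all(a[d] >= b[d] for d in dims) and any(a[d] > b[d] for d in dims)


def skyline(points, dims):
    """Naive quadratic skyline: keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p, dims) for q in points)]


def skyline_frequency(points):
    """Count, for each point, the non-empty subspaces whose skyline contains it.
    Assumes the points are distinct, hashable tuples of equal length."""
    n_dims = len(points[0])
    freq = {p: 0 for p in points}
    for size in range(1, n_dims + 1):
        for dims in combinations(range(n_dims), size):
            for p in skyline(points, dims):
                freq[p] += 1
    return freq


def top_k_skyline(points, k):
    """Return the k full-space skyline points with the highest skyline frequency.
    Ties keep their original input order because Python's sort is stable."""
    full_space = tuple(range(len(points[0])))
    freq = skyline_frequency(points)
    candidates = skyline(points, full_space)
    return sorted(candidates, key=lambda p: freq[p], reverse=True)[:k]
```

For the hypothetical input `[(5, 1), (1, 5), (3, 3), (2, 2)]`, the full-space Skyline holds three points, and `top_k_skyline(points, 2)` keeps the two that are also in the Skyline of a single-dimension subspace. The number of subspaces grows exponentially with the number of dimensions, so this sketch is only practical for small inputs.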

INTRODUCTION

Emerging technologies such as the Semantic Web, Grid, Semantic Search, Linked Data, and Cloud and Peer-to-Peer computing have made very large datasets available. For example, at the time of writing, at least 21.59 billion pages are indexed on the Web (De Kunder, 2010) and the Cloud of Linked Data has at least 13,112,409,691 triples (W3C, 2010). The enormous growth in the size of data has a direct impact on the performance of tasks that must process very large datasets and
