Databases Reference
In-Depth Information
The subset should consist of relatively low-dimensional cuboids (that are commonly
queried) and cuboids that offer the most benefit to the user. The details are left to inter-
ested readers as an exercise. The method was tested on both real and synthetic data and
found to be efficient and effective in answering queries.
5.3.2 Ranking Cubes: Efficient Computation of Top- k Queries
The data cube helps not only online analytical processing of multidimensional queries
but also search and data mining. In this section, we introduce a new cube structure
called Ranking Cube and examine how it contributes to the efficient processing of top-k
queries . Instead of returning a large set of indiscriminative answers to a query, a top- k
query (or ranking query ) returns only the best k results according to a user-specified
preference.
The results are returned in ranked order so that the best is at the top. The user-
specified preference generally consists of two components: a selection condition and
a ranking function . Top- k queries are common in many applications like searching
web databases, k -nearest-neighbor searches with approximate matches, and similarity
queries in multimedia databases.
Example 5.16 A top- k query. Consider an online used-car database, R , that maintains the following
information for each car: producer (e.g., Ford, Honda), model (e.g., Taurus, Accord),
type (e.g., sedan, convertible), color (e.g., red, silver), transmission (e.g., auto, manual),
price, mileage, and so on. A typical top- k query over this database is
Q 1 : select top 5 * from R
where producer D “Ford” and type D “sedan”
orderby
2 C.
2 asc
.
price 10 K
/
mileage 30 K
/
Within the dimensions (or attributes) for R , producer and type are used here as selection
dimensions . The ranking function is given in the order-by clause. It specifies the rank-
ing dimensions , price and mileage . Q 1 searches for the top-5 sedans made by Ford. The
entries found are ranked or sorted in ascending ( asc ) order, according to the ranking
function. The ranking function is formulated so that entries that have price and mileage
closest to the user's specified values of $10K and 30K, respectively, appear toward the
top of the list.
The database may have many dimensions that could be used for selection, describ-
ing, for example, whether a car has power windows, air conditioning, or a sunroof. Users
may pick any subset of dimensions and issue a top- k query using their preferred rank-
ing function. There are many other similar application scenarios. For example, when
searching for hotels, ranking functions are often constructed based on price and distance
to an area of interest. Selection conditions can be imposed on, say, the hotel location
 
Search WWH ::




Custom Search