QoS-Oriented Grid-Enabled Data Warehouses - Data Warehousing Design and Advanced Engineering Applications

infrastructure. Each site may share one or more resources with the grid. Examples of possible shared resources are storage systems, computer clusters and supercomputers.

Data warehouses are usually deployed at a single site, but that may not be the most effective layout in a grid-based DW implementation. In fact, in such an environment, placing the entire database at a single site would be more expensive and time-consuming than creating a distributed DW that uses the available distributed resources to store the database and to execute users' queries. It is important to consider not only that users are distributed across distinct grid sites but also that the warehouse's data may be loaded from several sites.

Hence, in the distributed grid-based DW, data is partitioned and/or replicated at nodes from distinct sites and may be queried by any grid participant.
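
As a rough illustration of the placement idea above, the following Python sketch keeps a catalog of which sites hold each data fragment. The class `PlacementCatalog`, the fragment names and the site names are hypothetical, and the rule of preferring a copy at the user's own site is an assumption made for the example, not something the text prescribes.

```python
from collections import defaultdict


class PlacementCatalog:
    """Hypothetical catalog recording which grid sites hold each data fragment."""

    def __init__(self):
        self._sites_by_fragment = defaultdict(set)

    def place(self, fragment, site):
        # Placing the same fragment at several sites makes it replicated.
        self._sites_by_fragment[fragment].add(site)

    def sites_for(self, fragment):
        # Sorted list of sites holding the fragment (empty if none).
        return sorted(self._sites_by_fragment.get(fragment, ()))

    def choose_site(self, fragment, user_site):
        # Assumption for illustration: prefer a copy at the user's own site,
        # otherwise fall back to the first site (alphabetically) holding it.
        sites = self.sites_for(fragment)
        if not sites:
            raise KeyError(f"fragment {fragment!r} is not placed anywhere")
        return user_site if user_site in sites else sites[0]


catalog = PlacementCatalog()
# Partitioning: each sales fragment lives at a different site.
catalog.place("sales_usa", "new_york")
catalog.place("sales_france", "paris")
# Replication: a small dimension table is copied to both sites.
catalog.place("dim_product", "new_york")
catalog.place("dim_product", "paris")
```

With this catalog, a user at the hypothetical `paris` site would read `dim_product` locally, while a request for `sales_usa` would be routed to `new_york`, the only site that stores that fragment.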

Best-Effort Approaches for Grid-Enabled Warehouses

There are some previous works on implementing and using grid-enabled data warehouses, but most use best-effort oriented approaches, which may not be the most adequate in grid-based systems (as presented in the previous section, grid scheduling is usually satisfaction-oriented).

High availability and high performance are the main concerns of Costa & Furtado (2006). Each participating site stores a partitioned copy of the entire warehouse. Intra-site parallelism is obtained by the use of the Node Partitioned Data Warehouse (NPDW) strategy (Furtado, 2004). A hierarchical scheduler architecture is used together with an on-demand scheduling policy (idle nodes ask the Central Scheduler for new queries to execute). Such a model leads to good performance and high availability, but also consumes too much storage space, as the whole warehouse is present at each participating site.

The Olap-enabled grid (Lawrence & Rau-Chaplin, 2006; Dehne et al., 2007) is a two-tier grid-enabled warehouse. The users' local domain is considered the first tier and stores cached data. Database servers at remote sites compose the second tier. The scheduling algorithm tries to use the locally stored data to answer submitted queries. If that is not possible, then remote servers are accessed.

The Globus Toolkit is used by Wehrle et al. (2007) as an underlying infrastructure to implement a grid-enabled warehouse. Fact table data is partitioned across participating nodes and dimension data is replicated. Some specialized services are used at each node: (i) an index service provides information about locally stored data; and (ii) a communication service is used to access remote data. Locally stored data is used to answer incoming queries. If the searched data is not stored at the local node, then remote access is done through the communication service. This strategy and the abovementioned Olap-enabled strategy do not provide any autonomy for local domains.

Distributed Data Placement in QoS-Oriented DW

In data warehouses, users' queries usually follow some kind of access pattern, such as geographically related ones, in which users from a location may have more interest in data related to that location than in data about other locations (Deshpande et al., 1998). That may also be applicable to the grid. For instance, consider a global organization that uses a grid-based DW about sales which is accessed by users from several countries. The users in New York City, USA, may start querying data about sales revenue in Manhattan, and then do successive drill-up operations in order to obtain information about sales in New York City, in New York State and, finally, in the USA. Only rarely would New York users query data about sales in France. In the same way, users from Paris may start querying the database about sales in France, and then start doing drill-down operations in or-
