QoS-Oriented Grid-Enabled Data Warehouses - Data Warehousing Design and Advanced Engineering Applications


grids and grid-enabled databases. We then discuss QoS-oriented scheduling and placement strategies for the Grid-based warehouse, present some experimental results, and draw conclusions. The chapter ends with definitions of key terms.

acteristics when scheduling job execution may

become very time-consuming. In the hierarchical architecture, a Community Scheduler (or Resource Broker) is responsible for assigning job execution to sites. Each site has its own job scheduler (Local Scheduler), which is responsible for scheduling job execution locally. The Community Scheduler and the Local Schedulers may negotiate job execution, and each Local Scheduler may implement local resource utilization policies. Moreover, because the Community Scheduler does not need to know the exact workload and characteristics of each available node, this model scales better than the centralized scheduling model.
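The two-level division of responsibilities in the hierarchical architecture can be illustrated with a minimal Python sketch. It is not taken from any GRM system: the class names mirror the roles named above, while the methods `can_accept`, `schedule` and `submit`, the `max_queued_per_node` policy, and the "first site that accepts" negotiation are assumptions made only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class LocalScheduler:
    """Schedules jobs on the nodes of one site; node details stay local."""
    site: str
    node_queues: dict = field(default_factory=dict)  # node name -> list of jobs
    max_queued_per_node: int = 2  # hypothetical local resource usage policy

    def can_accept(self) -> bool:
        # Local policy: refuse further work once every node queue is full.
        return any(len(queue) < self.max_queued_per_node
                   for queue in self.node_queues.values())

    def schedule(self, job: str) -> str:
        # Place the job on the local node with the shortest queue.
        node = min(self.node_queues, key=lambda n: len(self.node_queues[n]))
        self.node_queues[node].append(job)
        return node


class CommunityScheduler:
    """Assigns jobs to sites without knowing per-node workloads."""

    def __init__(self, local_schedulers):
        self.local_schedulers = list(local_schedulers)

    def submit(self, job: str):
        # Simplified negotiation (an assumption): offer the job to each site
        # in turn and hand it to the first one whose local policy accepts it.
        for local in self.local_schedulers:
            if local.can_accept():
                return local.site, local.schedule(job)
        return None  # no site accepted the job


site_a = LocalScheduler("A", {"a1": [], "a2": []}, max_queued_per_node=1)
site_b = LocalScheduler("B", {"b1": []})
broker = CommunityScheduler([site_a, site_b])
placements = [broker.submit(f"q{i}") for i in range(5)]
```

In this sketch the Community Scheduler only asks each site whether it accepts a job; which node runs the job, and how full each node queue is, remains a local decision. This is the property the text credits for the better scalability of the hierarchical model.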

In the decentralized model, there is no Central Scheduler. Each site has its own scheduler, which is responsible for scheduling local job execution. Schedulers must interact with each other to negotiate remote job execution. Good scheduling decisions may require several messages to be exchanged during the negotiation, which may impact the system's performance.

Some GRM systems have built-in scheduling policies, but almost all enable users to implement their own scheduling policy or to use application-level schedulers. In this context, some general-purpose application-level schedulers were designed [e.g. Condor-G (Frey et al, 2001) and Nimrod-G (Buyya et al, 2000)]. These general-purpose schedulers generally consider some kind of user-specified requirement or QoS parameter (e.g. a job's deadline), but may fail to schedule data-bound jobs efficiently.

Query scheduling strategies for data-bound jobs were evaluated by Ranganathan & Foster (2004), who compared Data Present (DP), Least Loaded Scheduling (LLS) and Random Scheduling (RS). In RS, job execution is randomly scheduled to available nodes. In LLS, each job is scheduled to be executed by the node that has the lowest number of waiting jobs. In both RS and LLS, a data-centric job may be scheduled to a node that does not store the data required to execute it. In this case,
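The RS and LLS policies described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation evaluated by Ranganathan & Foster: the function names and the `waiting_jobs` and `stored_data` dictionaries are hypothetical. The DP function additionally assumes, as the name suggests, that the job is sent to a node that already stores the required data; choosing the least loaded of those nodes is a further assumption.

```python
import random


def random_scheduling(nodes, rng=random):
    """RS: pick any available node, ignoring load and data location."""
    return rng.choice(sorted(nodes))


def least_loaded_scheduling(waiting_jobs):
    """LLS: pick the node with the lowest number of waiting jobs."""
    return min(sorted(waiting_jobs), key=lambda node: waiting_jobs[node])


def data_present_scheduling(waiting_jobs, stored_data, required):
    """DP (assumed behavior): pick a node that already stores the data."""
    holders = [node for node in sorted(waiting_jobs)
               if required in stored_data.get(node, set())]
    if not holders:
        return None  # no node stores the required data
    # Assumption: ties among data holders are broken by queue length.
    return min(holders, key=lambda node: waiting_jobs[node])


# Hypothetical grid state: queue length per node and data stored per node.
waiting_jobs = {"n1": 4, "n2": 0, "n3": 2}
stored_data = {"n1": {"sales_2008"}, "n3": {"sales_2008"}}

lls_choice = least_loaded_scheduling(waiting_jobs)
dp_choice = data_present_scheduling(waiting_jobs, stored_data, "sales_2008")
```

With this state, LLS selects the idle node `n2`, which does not store `sales_2008`, whereas the assumed DP policy selects `n3`, which does. RS may return any of the three nodes.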

DATA GRIDS AND GRID-ENABLED DATABASES

The Grid is an infrastructure that provides transparent access to distributed, heterogeneous shared resources that belong to distinct sites (which may in turn belong to distinct real organizations). Each site has some degree of autonomy and may impose resource usage restrictions on remote users (Foster, 2001).

In the last decade, several Grid Resource Management (GRM) systems [for example, Legion (Grimshaw et al, 1997) and Globus Toolkit (Foster & Kesselman, 1997)] were developed to provide basic functionality that is commonly needed to run grid-based applications. Authorization and remote job execution management are among the most common features of GRM systems. Some of them also provide data management mechanisms, such as efficient data movement [e.g. GridFTP (Allcock et al, 2005)] and data replica location [e.g. Globus Replica Location Service - RLS (Chervenak et al, 2004)].

In terms of grid job scheduling, there are three basic architectures (Krauter et al, 2002): centralized, hierarchical and decentralized. In the centralized architecture, a single Central Scheduler schedules the execution of all incoming jobs, assigning them directly to the existing resources. Such an architecture may lead to good scheduling decisions, as the scheduler may consider the characteristics and loads of all available resources, but it suffers from a scalability problem: if a wide variety of distributed heterogeneous resources is available,

considering all the resources' individual char-
