QoS-Oriented Grid-Enabled Data Warehouses - Data Warehousing Design and Advanced Engineering Applications


according to the specified SLO (d ≥ tet), it starts query execution. Otherwise, query execution is not started and the user is informed that the established SLO cannot be achieved.

In order to predict the tet value, the system estimates the execution time of each of the query's tasks (task finish time - tft), considering three key time components for each query (task):

(i) The query execution time at a local site (local execution time - let);

(ii) The time necessary to transfer the required data to the site (data transfer time - dtt);

(iii) The time necessary to transfer the query's results back from the chosen site (results transfer time - rtt).

The tft value of a single task at a certain site is computed by Equation 1. An estimated upper bound on the execution time of the user's query (tet) is obtained by Equation 2.

On the other hand, the Community Scheduler does not have control of intra-site data placement and query execution. Therefore, it is difficult for this module to estimate tasks' execution times; that estimation is done by local schedulers. In fact, in QoS-oriented scheduling, the Community Scheduler does not need to know the exact time necessary to execute a query: local schedulers must commit themselves to executing the assigned queries within a certain time interval. This interval is the maximum value that let can assume (mlet) if the user's query is to finish by the specified SLO.

Hence, for each task, the Community Scheduler computes the mlet value (Equation 4) and uses that value as the task deadline when negotiating with local schedulers.

mlet ≤ d - (dtt + rtt)

(4)
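The deadline computation of Equation 4 can be illustrated with a minimal Python sketch. The function names, the use of seconds as the time unit, and the acceptance check are assumptions made for illustration; they are not part of the described system.

```python
def max_local_execution_time(d: float, dtt: float, rtt: float) -> float:
    """Upper bound for let (Equation 4): mlet = d - (dtt + rtt).

    d   : deadline given by the SLO (seconds, assumed unit)
    dtt : predicted data transfer time to the site
    rtt : predicted results transfer time back from the site
    A non-positive result means the transfers alone already
    consume the whole deadline.
    """
    return d - (dtt + rtt)


def commitment_fits(committed_let: float, mlet: float) -> bool:
    """Hypothetical check: a local scheduler's committed execution
    time is acceptable only if it does not exceed mlet."""
    return committed_let <= mlet


mlet = max_local_execution_time(d=60.0, dtt=8.0, rtt=2.0)
print(mlet)                         # value offered as the task deadline
print(commitment_fits(45.0, mlet))  # commitment within the deadline
```

With these numbers the local scheduler would be asked to commit to at most 50 seconds of local execution.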

Figure 4 presents a general view of the SLO-aware scheduling model.

tft = let + dtt + rtt

(1)

tet = Max ( tft )

(2)
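Equations 1 and 2, together with the admission test d ≥ tet, can be sketched in a few lines of Python. The class and function names are hypothetical, and all times are assumed to be expressed in the same unit (seconds here).

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class TaskEstimate:
    """Predicted time components of one task (seconds, assumed unit)."""
    let: float  # local execution time
    dtt: float  # data transfer time
    rtt: float  # results transfer time


def task_finish_time(task: TaskEstimate) -> float:
    """Equation 1: tft = let + dtt + rtt."""
    return task.let + task.dtt + task.rtt


def total_execution_time(tasks: Iterable[TaskEstimate]) -> float:
    """Equation 2: tet = Max(tft) over all tasks of the query."""
    return max(task_finish_time(t) for t in tasks)


def can_meet_slo(tasks: Iterable[TaskEstimate], d: float) -> bool:
    """Admission test: start the query only if d >= tet."""
    return d >= total_execution_time(tasks)


tasks = [TaskEstimate(10.0, 2.0, 1.0), TaskEstimate(5.0, 4.0, 3.0)]
print(total_execution_time(tasks))  # largest tft among the tasks
print(can_meet_slo(tasks, d=15.0))  # deadline check
```

Taking the maximum rather than the sum reflects Equation 2: the query finishes when its slowest task finishes.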

LOCAL SCHEDULERS AND SERVICE LEVEL AGREEMENTS

To estimate the value of tft, the Community Scheduler must have an estimate of each of its components. First, it predicts the values of dtt and rtt with the support of a grid infrastructure network monitoring tool, such as the Network Weather Service - NWS - (Wolski, 1997). This tool is used to predict network latency (L) and data transfer throughput (TT) between sites. The Community Scheduler combines these predicted network characteristics with estimated dataset sizes (obtained from database statistics) to predict dtt and rtt. The predicted transfer time (tbs) of a dataset of size z between sites i and j can be obtained by Equation 3.
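The transfer-time prediction can be sketched in Python as follows. The sketch assumes that Equation 3 adds the predicted latency to the dataset size divided by the predicted throughput, and that latency is given in seconds, size in bytes and throughput in bytes per second. The function name and the example values are hypothetical, and no actual NWS API is used: the predicted L and TT are simply passed in as numbers.

```python
def predicted_transfer_time(latency: float, size: float,
                            throughput: float) -> float:
    """Predicted time to move a dataset between two sites.

    latency    : predicted network latency L (seconds, assumed unit)
    size       : estimated dataset size z (bytes, assumed unit)
    throughput : predicted transfer throughput TT (bytes/second)
    """
    if throughput <= 0:
        raise ValueError("throughput must be positive")
    return latency + size / throughput


# dtt: input data shipped to the chosen site
dtt = predicted_transfer_time(latency=0.5, size=100.0, throughput=50.0)
# rtt: (usually smaller) result set shipped back
rtt = predicted_transfer_time(latency=0.5, size=10.0, throughput=50.0)
print(dtt, rtt)
```

The same function serves for both dtt and rtt; only the dataset size and the direction of the transfer differ.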

When estimating whether a user's query can be executed by the proposed deadline, the Community Scheduler must consider the time necessary to execute each rewritten query at the local sites (let). However, not only does the Community Scheduler lack total control of the execution environment, but each site can also have local domain policies that constrain the use of local resources by remote users. Therefore, the time necessary to execute each task should be predicted by local schedulers.

In QoS-oriented scheduling, however, a site need not inform the Community Scheduler of the exact predicted query execution time. Instead, local schedulers should commit themselves to executing the negotiated query by

tbs_{i,j} = L + (z / TT)

(3)
