China Grid and Related Dependability Research - Grid Computing: Infrastructure, Service, and Applications

Assumption 2

1. The probability that the task fails at state t_i is λ (the failure rate), and failures follow a Poisson distribution. The mean time to failure (MTTF) is defined as 1/λ.

2. The task fails at state t_i, and the time at which a failure occurs follows a Poisson distribution.

3. The failure-free execution time of the task is T_F.

4. The average down time of the system is T_D.
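
As a small illustration of Assumption 2, the sketch below computes the MTTF from the failure rate and the probability that a task runs to completion without a failure. It assumes a homogeneous Poisson failure process, under which the time to the next failure is exponentially distributed; the function and parameter names are illustrative and do not come from the source.

```python
import math


def mttf(failure_rate):
    """Mean time to failure, defined as 1 / lambda."""
    return 1.0 / failure_rate


def survival_probability(failure_rate, duration):
    """Probability of no failure during `duration`.

    Assumption: failures form a homogeneous Poisson process with rate
    `failure_rate`, so P(no failure in duration) = exp(-lambda * duration).
    """
    return math.exp(-failure_rate * duration)


if __name__ == "__main__":
    lam = 0.01   # hypothetical failure rate (failures per time unit)
    t_f = 50.0   # hypothetical task length in the same time unit
    print("MTTF:", mttf(lam))
    print("P(task finishes without failure):", survival_probability(lam, t_f))
```

The longer a task runs relative to the MTTF, the less likely it is to finish in a single attempt, which is what motivates the recovery policies discussed next.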

Assumption 3

1. The number of replicas of a task is N_R for the replication policy or the replication-with-checkpointing policy.

2. The checkpoint overhead (T_C) is the same for different policies.

3. The checkpointing recovery time (T_R) is the same for different policies.

4. The checkpoint interval is T_I.

With these definitions and assumptions, a brief explanation of the QoS criteria for different failure-recovery policies is presented.

Retrying

The computation of the execution time with retrying was first specified by Duda [71].

$$Q_{et(ret)} = \frac{e^{\lambda T_D}\left(e^{\lambda T_F} - 1\right)}{\lambda} \tag{4.17}$$

The execution cost for retrying is computed as

$$Q_{co} = Q_{price} \times Q_{et(ret)} \tag{4.18}$$
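
The retrying criteria can be sketched in a few lines of Python. This is a minimal sketch, assuming Equation 4.17 reads as Q_et(ret) = e^(λ T_D) (e^(λ T_F) − 1) / λ and that Q_price is a price per unit of execution time; the function and parameter names are illustrative, not taken from the source.

```python
import math


def retry_execution_time(failure_rate, t_f, t_d):
    """Expected execution time with retrying (reading of Eq. 4.17).

    failure_rate: lambda, must be > 0
    t_f: execution time of the task (T_F)
    t_d: average down time of the system (T_D)
    """
    # expm1(x) computes e^x - 1 accurately for small x.
    return math.exp(failure_rate * t_d) * math.expm1(failure_rate * t_f) / failure_rate


def execution_cost(price, execution_time):
    """Execution cost as price times execution time (Eq. 4.18 / 4.20)."""
    return price * execution_time


if __name__ == "__main__":
    lam, t_f, t_d = 0.001, 100.0, 10.0   # hypothetical values
    q_et = retry_execution_time(lam, t_f, t_d)
    print("expected time with retrying:", q_et)
    print("expected cost:", execution_cost(2.0, q_et))
```

As the failure rate approaches zero, the expected time approaches T_F, which is a quick sanity check on this reading of the formula.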

Checkpointing

The execution time with checkpointing is computed from Equation 4.19, which is adopted from [72].

$$Q_{et(ch)} = \frac{T_F}{T_I}\left(\frac{e^{\lambda (T_D - T_C + T_R)}\left(e^{\lambda (T_I + T_C)} - 1\right)}{\lambda}\right) \tag{4.19}$$

The execution cost for checkpointing is computed as

$$Q_{co} = Q_{price} \times Q_{et(ch)} \tag{4.20}$$
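
The checkpointing criteria can be sketched the same way. This is a minimal sketch, assuming Equation 4.19 reads as Q_et(ch) = (T_F / T_I) · e^(λ (T_D − T_C + T_R)) (e^(λ (T_I + T_C)) − 1) / λ; the names, the candidate intervals, and the numeric values are illustrative assumptions, not taken from the source.

```python
import math


def checkpoint_execution_time(failure_rate, t_f, t_d, t_c, t_r, t_i):
    """Expected execution time with checkpointing (reading of Eq. 4.19).

    failure_rate: lambda, must be > 0
    t_f: execution time of the task (T_F)
    t_d: average down time (T_D)
    t_c: checkpoint overhead (T_C)
    t_r: recovery time (T_R)
    t_i: checkpoint interval (T_I)
    """
    # Expected time to complete one interval plus its checkpoint.
    per_interval = (
        math.exp(failure_rate * (t_d - t_c + t_r))
        * math.expm1(failure_rate * (t_i + t_c))
        / failure_rate
    )
    # The task is split into T_F / T_I intervals.
    return (t_f / t_i) * per_interval


if __name__ == "__main__":
    lam, t_f, t_d, t_c, t_r = 0.01, 1000.0, 10.0, 2.0, 2.0  # hypothetical
    candidates = [10.0, 20.0, 50.0, 100.0, 200.0]           # hypothetical T_I values
    best = min(
        candidates,
        key=lambda t_i: checkpoint_execution_time(lam, t_f, t_d, t_c, t_r, t_i),
    )
    q_et = checkpoint_execution_time(lam, t_f, t_d, t_c, t_r, best)
    price = 2.0  # hypothetical Q_price, per unit of execution time
    print("best interval among candidates:", best)
    print("expected time:", q_et, "expected cost:", price * q_et)
```

With a vanishing failure rate the result tends to (T_F / T_I)(T_I + T_C), that is, the task time plus one checkpoint overhead per interval, which is consistent with this reading of the formula.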
