the index service and gets the location of the producer; the failure detector
is a producer that gives the status of the monitored objects.
4.5.3.2 Adaptive Model
A grid fault-detection service should stress two things: first, how to satisfy
the QoS between two processes; and second, how to accommodate the dynamic
nature of the grid. The QoS between the monitored process and the detector
can be denoted by a tuple $(T_D^U, T_{MR}^L, T_M^U)$, where $T_D^U$ is an
upper bound on the detection time, $T_{MR}^L$ is a lower bound on the average
mistake recurrence time, and $T_M^U$ is an upper bound on the average mistake
duration. The QoS requirement can be expressed as in Equation 4.1:
$T_D \le T_D^U, \quad T_{MR} \ge T_{MR}^L, \quad T_M \le T_M^U$ (4.1)
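As a minimal sketch, the three inequalities of Equation 4.1 can be checked directly against measured values. The class name, field names, and the numbers in the usage lines are illustrative assumptions, not part of the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QoSRequirement:
    """QoS tuple (T_D^U, T_MR^L, T_M^U); all names here are hypothetical."""
    t_d_upper: float   # T_D^U: upper bound on the detection time
    t_mr_lower: float  # T_MR^L: lower bound on the average mistake recurrence time
    t_m_upper: float   # T_M^U: upper bound on the average mistake duration

    def is_satisfied(self, t_d: float, t_mr: float, t_m: float) -> bool:
        """Return True when all three inequalities of Equation 4.1 hold."""
        return (
            t_d <= self.t_d_upper
            and t_mr >= self.t_mr_lower
            and t_m <= self.t_m_upper
        )


# Illustrative values only (seconds).
req = QoSRequirement(t_d_upper=30.0, t_mr_lower=3600.0, t_m_upper=60.0)
print(req.is_satisfied(t_d=12.0, t_mr=7200.0, t_m=20.0))
print(req.is_satisfied(t_d=45.0, t_mr=7200.0, t_m=20.0))
```

The first call meets every bound; the second fails only because the detection time exceeds $T_D^U$.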
It is clear that the heartbeat interval $\Delta_{interval}$ is a very important
factor that contributes to the detection time:
$T_D^U \ge \Delta_{interval} + \Delta_{tr}$ (4.2)
where $\Delta_{tr}$ is a safety margin. A recursive method is used to obtain
$\Delta_{interval}$, as depicted in Algorithm 4.1, which is adopted from
[56,58]. Once $\Delta_{interval}$ has been obtained, a sliding-window algorithm
records the message behavior, so that $\Delta_{interval}$ adapts to the system
conditions. Another problem for failure detection is when to suspect failure.
The details are presented in Algorithm 4.1.
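The sliding-window bookkeeping described above might look like the following sketch. It only shows how recent heartbeat inter-arrival times could be recorded and summarized; the class name, the default window size, and the use of the population standard deviation are assumptions, and the source does not specify how the estimates feed back into the heartbeat interval.

```python
from collections import deque
from statistics import mean, pstdev


class ArrivalWindow:
    """Fixed-size sliding window over heartbeat inter-arrival times (hypothetical helper)."""

    def __init__(self, size: int = 100):
        # A deque with maxlen drops the oldest gap automatically once it is full.
        self._gaps = deque(maxlen=size)
        self._last_arrival = None

    def record(self, arrival_time: float) -> None:
        """Store the gap between this heartbeat and the previous one."""
        if self._last_arrival is not None:
            self._gaps.append(arrival_time - self._last_arrival)
        self._last_arrival = arrival_time

    def estimate(self):
        """Return (mean, std) of the windowed gaps, or None with fewer than two gaps."""
        if len(self._gaps) < 2:
            return None
        return mean(self._gaps), pstdev(self._gaps)


window = ArrivalWindow(size=3)
for arrival in [0.0, 1.0, 2.0, 3.0, 5.0]:
    window.record(arrival)
print(window.estimate())
```

Because the window holds only the three most recent gaps, the late heartbeat at time 5.0 displaces the oldest gap and shifts the estimated mean upward, which is the adaptive behavior the paragraph describes.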
Algorithm 4.1
Assumption: The inter-arrival time of the “I'm alive” messages follows a
Gaussian distribution. The parameters of the distribution are estimated
from a sample window. The probability that a given message arrives more
than $t$ time units after the previous one is given as
$p(t) = \frac{1}{\sigma\sqrt{2\pi}} \int_{t}^{+\infty} e^{-(x-\mu)^2/(2\sigma^2)}\,dx$ (4.3)
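The Gaussian tail probability of Equation 4.3 has a closed form in terms of the complementary error function, $p(t) = \tfrac{1}{2}\,\mathrm{erfc}\big((t-\mu)/(\sigma\sqrt{2})\big)$. A minimal sketch follows, assuming $\mu$ and $\sigma$ have already been estimated from the sample window; the function name and the example values are illustrative.

```python
import math


def late_arrival_probability(t: float, mu: float, sigma: float) -> float:
    """P(X > t) for X ~ N(mu, sigma^2), i.e. the integral in Equation 4.3."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    # Gaussian upper tail expressed through the complementary error function.
    return 0.5 * math.erfc((t - mu) / (sigma * math.sqrt(2.0)))


# Illustrative values: mean gap 1.0 s, standard deviation 0.1 s.
print(late_arrival_probability(1.0, mu=1.0, sigma=0.1))
print(late_arrival_probability(1.2, mu=1.0, sigma=0.1))
```

At $t = \mu$ the probability is one half, and it falls off quickly as $t$ moves several standard deviations past the mean.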
Step 1: [find $\Delta_{intervalmax}$]
Compute
$\gamma = \frac{(1 - P_L)(T_D^U)^2}{\Delta_{tr} + (T_D^U)^2}$ (4.4)
$\Delta_{intervalmax} = \max(\gamma T_M^U, T_D^U)$ (4.5)
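Step 1 can be sketched numerically as below. The sketch reads Equation 4.4 as $\gamma = (1 - P_L)(T_D^U)^2 / (\Delta_{tr} + (T_D^U)^2)$ and Equation 4.5 as $\Delta_{intervalmax} = \max(\gamma T_M^U, T_D^U)$, and it assumes $P_L$ denotes a message-loss probability, which the text here does not define. The function and parameter names, and the input values, are hypothetical.

```python
def max_heartbeat_interval(p_loss: float, t_d_upper: float,
                           t_m_upper: float, delta_tr: float):
    """Return (gamma, delta_interval_max) under the stated reading of Eqs. 4.4 and 4.5.

    p_loss    -- P_L, assumed here to be the message-loss probability
    t_d_upper -- T_D^U, upper bound on the detection time
    t_m_upper -- T_M^U, upper bound on the average mistake duration
    delta_tr  -- safety margin Delta_tr
    """
    if not 0.0 <= p_loss <= 1.0:
        raise ValueError("p_loss must lie in [0, 1]")
    if t_d_upper <= 0:
        raise ValueError("t_d_upper must be positive")
    # Equation 4.4 as read above.
    gamma = (1.0 - p_loss) * t_d_upper ** 2 / (delta_tr + t_d_upper ** 2)
    # Equation 4.5 as read above.
    delta_interval_max = max(gamma * t_m_upper, t_d_upper)
    return gamma, delta_interval_max


# Illustrative inputs only.
gamma, interval_max = max_heartbeat_interval(
    p_loss=0.5, t_d_upper=2.0, t_m_upper=4.0, delta_tr=4.0
)
print(gamma, interval_max)
```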