Information Technology Reference
Failure Probability Estimation
the last online session, the TTF is 680 minutes (from 2010/11/23 3:10 to 2010/11/23 14:30).
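
The status check and the TTF calculation described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the availability status list is modeled as a dictionary from worker id to start time, `None` stands for the offline mark, and the names `ttf_minutes` and `check_workers` are hypothetical, not taken from the original system.

```python
from datetime import datetime

def ttf_minutes(start_time, check_time):
    """TTF of one online session in whole minutes (check time minus start time)."""
    return int((check_time - start_time).total_seconds() // 60)

def check_workers(status, replied, now):
    """Mark online workers that did not reply as offline; return their TTFs.

    status  : dict of worker id -> start time, or None if marked offline (assumed layout)
    replied : set of worker ids that answered the status checking message
    now     : time of the current periodic check
    """
    new_ttfs = []
    for worker_id, start_time in status.items():
        if start_time is None:           # already marked offline, nothing to do
            continue
        if worker_id not in replied:     # no reply: treat the session as ended
            new_ttfs.append(ttf_minutes(start_time, now))
            status[worker_id] = None     # mark as offline
    return new_ttfs

# Worker 4 went online at 3:10 and is found offline by the 14:30 check.
status = {3: datetime(2010, 11, 23, 1, 0), 4: datetime(2010, 11, 23, 3, 10)}
print(check_workers(status, replied={3}, now=datetime(2010, 11, 23, 14, 30)))  # [680]
```

Note that the TTF measured this way is bounded by the checking period: the worker may have gone offline at any point since the previous check.
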

With this simple periodic availability status checking mechanism, the runtime TTF data are gathered on the dispatcher. Thus, the TTF distribution can be found at runtime. Suppose the gathered TTFs are $\{ttf_1, ttf_2, ttf_3, \ldots, ttf_n\}$, where $n$ is the number of gathered TTFs. The failure probability $F(x)$ of a worker ($x$ is the time after a worker went online) can be estimated as shown in Equation (1):

Volunteer computing platforms have two kinds of peers: dispatchers and workers. A task dispatcher is a specific server that controls a volunteer computing platform. Workers are volatile peers that compute tasks and send back the task results to the dispatcher. To estimate the failure probability of each worker, runtime TTF data are required. To gather such runtime data, a worker availability status list is maintained by the dispatcher. The list stores the start time of each worker. If a worker is currently unavailable, it is marked as offline in the list. The list is maintained as follows:

$$F(x) = \frac{n_x}{n} \qquad (1)$$

A Worker Goes Online
where $n_x$ is the number of TTFs that are less than or equal to $x$.
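
Equation (1) is the empirical distribution function of the gathered TTFs, so it can be evaluated directly from the sample. The following is a minimal sketch, assuming the TTFs are kept as a list of minutes on the dispatcher; the function name `failure_probability` and the sample values are hypothetical.

```python
from bisect import bisect_right

def failure_probability(ttfs, x):
    """Estimate F(x) = n_x / n, where n_x counts the gathered TTFs <= x."""
    if not ttfs:
        raise ValueError("no TTF samples gathered yet")
    ordered = sorted(ttfs)
    n_x = bisect_right(ordered, x)   # number of TTFs less than or equal to x
    return n_x / len(ordered)

sample_ttfs = [120, 300, 680, 45, 900]        # hypothetical TTFs in minutes
print(failure_probability(sample_ttfs, 300))  # 3 of the 5 TTFs are <= 300, so 0.6
```

Sorting on every call keeps the sketch short; a dispatcher that queries F(x) often would more plausibly keep the sample sorted as new TTFs arrive.
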

As shown in Figure 2(a), when a worker goes online, it sends an online notification message to the dispatcher. Once the notification is received, the dispatcher updates the worker availability status list as shown in Figure 2(b). The current time is stored as the start time of this worker.
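
The update performed on an online notification can be sketched as follows. The record layout (`WorkerStatus` with an `online` flag and a `start_time`) and the handler name are assumptions made for illustration; the source only states that the list stores each worker's start time and an offline mark.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Optional

@dataclass
class WorkerStatus:
    """One entry of the worker availability status list (assumed layout)."""
    online: bool = False
    start_time: Optional[datetime] = None

def handle_online_notification(status_list: Dict[int, WorkerStatus],
                               worker_id: int, now: datetime) -> None:
    """Record that a worker went online, storing the current time as its start time."""
    entry = status_list.setdefault(worker_id, WorkerStatus())
    entry.online = True
    entry.start_time = now

status_list: Dict[int, WorkerStatus] = {}
handle_online_notification(status_list, 4, datetime(2010, 11, 23, 3, 10))
print(status_list[4])
```
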

LEAST FAILURE PROBABILITY DISPATCH POLICY
With the failure probability estimation, this paper proposes a performance-oriented task dispatch policy, Least Failure Probability Dispatch (LFPD), for volunteer computing platforms. The assumptions are slightly different from the ones in our previous work (Wang, 2007). While the previous work assumes a homogeneous environment, this paper assumes that the volunteer computing platform is a heterogeneous environment, in which the workers differ in performance and in bandwidth to the dispatcher.

Find Offline Worker
To gather the runtime TTF data, the dispatcher also checks the availability status of workers periodically. As shown in Figure 3(a), the dispatcher sends status checking messages to the workers that are marked online in the worker availability status list. Once the message is received by a live worker, the worker sends a reply message back to the dispatcher as shown in Figure 3(b). If a worker is offline, it cannot reply to the checking message. Then, it is marked as offline in the list. As an example, before the periodic status check, worker 4 in Figure 3 had been marked as online with a start time in the list, and then went offline. Thus, it does not reply to the checking message. The dispatcher then updates the worker availability status list and marks worker 4 as offline. It also calculates the TTF of worker 4's last online session. Given the current time and start time of

An Enhanced Workflow Management Mechanism
A workflow management mechanism has been proposed in our previous work (Wang, 2007). It is responsible for directing the workflow control and the task information update. It cannot fully satisfy the requirement of the LFPD, because it