Failure Probability Estimation
the last online session, the TTF is 680 minutes
(from 2010/11/23 3:10 to 2010/11/23 14:30).
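The periodic check and the TTF arithmetic can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `check_workers`, the dictionary layout (worker id mapped to a start time, or `None` when offline), and the second worker with id 5 are assumptions made for the example. Only worker 4's timestamps come from the text.

```python
from datetime import datetime

def check_workers(start_times, replied, now):
    """Mark non-replying workers offline and return their TTFs in minutes.

    start_times: dict of worker id -> datetime the worker went online,
                 or None if the worker is marked offline (assumed layout).
    replied:     set of worker ids that answered the status check.
    now:         datetime at which the status check is performed.
    """
    ttfs = {}
    for worker_id, start in start_times.items():
        if start is None or worker_id in replied:
            continue  # already offline, or still alive
        # TTF of the last online session: check time minus start time.
        ttfs[worker_id] = (now - start).total_seconds() / 60
        start_times[worker_id] = None  # mark the worker as offline
    return ttfs

# Worker 4 went online at 3:10 and fails to reply to the check at 14:30.
start_times = {
    4: datetime(2010, 11, 23, 3, 10),
    5: datetime(2010, 11, 23, 9, 0),   # hypothetical worker that stays online
}
ttfs = check_workers(start_times, replied={5}, now=datetime(2010, 11, 23, 14, 30))
print(ttfs)  # {4: 680.0}
```

Because the failure is only detected at the next check, a TTF measured this way can overstate the true session length by up to one checking period.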
With this simple periodic availability status
checking mechanism, the runtime TTF data are
gathered on the dispatcher. Thus, the TTF
distribution can be found at runtime. Suppose the
gathered TTFs are {ttf_1, ttf_2, ttf_3, ..., ttf_n}, where n
is the number of gathered TTFs. The failure
probability F(x) of a worker (x is the time after the
worker went online) can be estimated as shown
in Equation (1):
Volunteer computing platforms have two kinds of
peers: dispatchers and workers. A task dispatcher
is a specific server that controls a volunteer com-
puting platform. Workers are volatile peers that
compute tasks and send back the task results to
the dispatcher. To estimate the failure probability
of each worker, runtime TTF data are required.
To gather such runtime data, a worker availability
status list is maintained by the dispatcher. The list
stores the start time of each worker. If a worker
is currently unavailable, it is marked as offline in
the list. The list is maintained as follows:
$$F(x) = \frac{n_x}{n} \tag{1}$$
A Worker Goes Online
where n_x is the number of TTFs that are less than
or equal to x.
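Equation (1) is the empirical cumulative distribution function of the gathered TTFs. A minimal sketch of the estimate is given here. The function name `failure_probability` and the sample values are made up for illustration, and sorting on every call is a simplification rather than something the text prescribes.

```python
from bisect import bisect_right

def failure_probability(ttfs, x):
    """Estimate F(x) = n_x / n from the gathered TTFs (Equation 1)."""
    if not ttfs:
        raise ValueError("no TTF samples gathered yet")
    ordered = sorted(ttfs)
    # bisect_right returns the number of samples that are <= x, i.e. n_x.
    n_x = bisect_right(ordered, x)
    return n_x / len(ordered)

# Hypothetical TTF samples in minutes (not taken from the text).
samples = [30, 120, 680, 45, 300]
print(failure_probability(samples, 120))  # 0.6
```

In this example three of the five samples (30, 45, and 120) are less than or equal to 120, so the estimate is 3/5.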
As shown in Figure 2(a), when a worker goes on-
line, it sends an online notification message to the
dispatcher. Once the notification is received, the
dispatcher updates the worker availability status
list as shown in Figure 2(b). The current time is
stored as the start time of this worker.
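The update performed on an online notification could look like the sketch below. The class `WorkerStatusList` and its method names are hypothetical, and message transport between worker and dispatcher is left out. It shows only the bookkeeping described above, under the assumption that an offline worker is represented by a start time of `None`.

```python
from datetime import datetime

class WorkerStatusList:
    """Dispatcher-side availability list (illustrative layout, assumed)."""

    def __init__(self):
        # worker id -> datetime the worker went online, or None if offline
        self.start_times = {}

    def on_online_notification(self, worker_id, now=None):
        """Record the current time as the start time of this worker."""
        if now is None:
            now = datetime.now()
        self.start_times[worker_id] = now

    def is_online(self, worker_id):
        """Unknown workers and workers marked offline both count as offline."""
        return self.start_times.get(worker_id) is not None

status = WorkerStatusList()
status.on_online_notification(4, datetime(2010, 11, 23, 3, 10))
print(status.is_online(4), status.is_online(7))  # True False
```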
LEAST FAILURE PROBABILITY
DISPATCH POLICY
With the failure probability estimation, this paper
proposes a performance-oriented task dispatch
policy - Least Failure Probability Dispatch
(LFPD) for volunteer computing platforms. The
assumptions are slightly different from the ones in
our previous work (Wang, 2007). While the previ-
ous work assumes a homogeneous environment,
this paper assumes that the volunteer computing
platform is a heterogeneous environment, in which
all the workers have different performances and
different bandwidths to the dispatcher.
Find Offline Worker
To gather the runtime TTF data, the dispatcher
also checks the availability status of workers pe-
riodically. As shown in Figure 3(a), the dispatcher
sends status checking messages to the workers
that are marked online in the worker availability
status list. Once a live worker receives the
message, it sends a reply message
back to the dispatcher, as shown in Figure 3(b). If
a worker is offline, it cannot reply to the checking
message and is then marked as offline in the list.
As an example, before the periodic status check,
worker 4 in Figure 3 had been marked as online
with a start time in the list, and then went offline.
Thus, it does not reply to the checking message. The
dispatcher then updates the worker availability
status list and marks worker 4 as offline. It also
calculates the TTF of worker 4's last online
session. Given the current time and start time of
An Enhanced Workflow
Management Mechanism
A workflow management mechanism has been
proposed in our previous work (Wang, 2007). It
is responsible for directing the workflow control
and the task information update. It cannot fully
satisfy the requirement of the LFPD, because it