are intermediate temporal violations rather than final violations which are beyond
recovery. The work in [53] proposes three alternative recovery actions: no action
(NIL), rollback (RBK) and compensation (COM). NIL, which relies entirely on the
automatic recovery of the system itself, is normally not considered 'risk-free'. As
for RBK, unlike its use in handling conventional system function failures, it
normally causes extra delays and makes the current temporal violations even worse.
In contrast, COM, namely time deficit compensation, is a suitable approach for
handling temporal violations. The work in [5] proposes a time deficit allocation (TDA)
strategy which compensates current time deficits by utilizing the expected time
redundancy of subsequent activities. However, since the time deficit is not truly
reduced by TDA, this strategy can only postpone the violations of local constraints
on some local workflow segments, and has no effect on the overall deadlines.
Therefore, we need to investigate strategies which can actually reduce the time
deficits. Among the compensation processes, one that is often employed and can
actually make up the time deficit is workflow rescheduling.
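
The limitation of TDA noted above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the deficit is shared among subsequent activities in proportion to their expected time redundancy; the allocation rule actually used in [5] may differ, and the function name and all numbers below are hypothetical.

```python
from typing import List


def allocate_time_deficit(deficit: float, redundancies: List[float]) -> List[float]:
    """Share `deficit` among subsequent activities in proportion to their
    expected time redundancy (an assumed rule, not necessarily that of [5])."""
    total = sum(redundancies)
    if total <= 0:
        raise ValueError("no time redundancy available to absorb the deficit")
    return [deficit * r / total for r in redundancies]


# Hypothetical values: a 30-second deficit and three subsequent activities
# with 10, 20 and 30 seconds of expected time redundancy.
redundancies = [10.0, 20.0, 30.0]
shares = allocate_time_deficit(30.0, redundancies)

# Redundancy left in each activity after absorbing its share.
remaining = [r - s for r, s in zip(redundancies, shares)]

# The shares still add up to the original deficit: it has been moved to
# later activities, not reduced.
total_allocated = sum(shares)
```

In this sketch `total_allocated` equals the original deficit, which is why allocation alone can postpone local violations but cannot help meet the overall deadline.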
Workflow rescheduling, such as local rescheduling (which deals with the mapping
of underlying resources to workflow activities within specific local workflow
segments), is normally triggered by the violation of QoS constraints [13, 61].
Workflow scheduling and workflow rescheduling are classical NP-complete problems
[11, 16], so many heuristic and metaheuristic algorithms have been proposed. The work
in [60] presents a systematic overview of workflow scheduling algorithms for
scientific grid computing. The work in [12] proposes an ACO (Ant Colony Optimization)
approach to address scientific workflow scheduling problems with various QoS
requirements such as reliability constraints, makespan constraints and cost constraints.
For handling temporal violations in scientific cloud workflow systems, both time and
cost need to be considered, but time has priority over cost since we focus more on
reducing the time deficits during the compensation process. We proposed an ACO-based
local workflow rescheduling strategy in [30] for handling temporal violations in
scientific workflows.
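
The idea that time takes priority over cost when comparing candidate reschedules can be sketched as follows. This is not the ACO strategy of [30]: an exhaustive search over a tiny, hypothetical segment stands in for the metaheuristic, the activities are assumed to be independent, and all activity names, resource names, times and costs are invented for illustration.

```python
from itertools import product
from typing import Dict, List, Tuple

# Hypothetical execution time (seconds) and cost of each activity on each resource.
EXEC_TIME = {"a1": {"r1": 8, "r2": 5}, "a2": {"r1": 6, "r2": 4}, "a3": {"r1": 3, "r2": 3}}
COST = {"a1": {"r1": 2, "r2": 4}, "a2": {"r1": 2, "r2": 3}, "a3": {"r1": 1, "r2": 2}}


def evaluate(mapping: Dict[str, str]) -> Tuple[int, int]:
    """Return (makespan, cost) of an activity-to-resource mapping, assuming
    independent activities that run sequentially on their assigned resource."""
    load: Dict[str, int] = {}
    cost = 0
    for activity, resource in mapping.items():
        load[resource] = load.get(resource, 0) + EXEC_TIME[activity][resource]
        cost += COST[activity][resource]
    return max(load.values()), cost


def best_mapping(activities: List[str], resources: List[str]) -> Dict[str, str]:
    """Enumerate every mapping and keep the one with the smallest
    (makespan, cost) tuple; tuples compare lexicographically, so cost
    only breaks ties between mappings with equal makespan."""
    candidates = (
        dict(zip(activities, combo))
        for combo in product(resources, repeat=len(activities))
    )
    return min(candidates, key=evaluate)


best = best_mapping(["a1", "a2", "a3"], ["r1", "r2"])
best_time, best_cost = evaluate(best)
```

Two mappings in this example reach the shortest makespan, and the cheaper of the two is selected. Exhaustive enumeration grows exponentially with the number of activities, which is why heuristics and metaheuristics such as ACO are used in practice.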
Another issue in violation handling is that some violations may be beyond the
handling power of certain violation handling strategies. Therefore, based on
handling capability, statistically recoverable and non-recoverable temporal violations
are defined in [23] so that they can be handled properly with different violation
handling strategies. While statistically recoverable temporal violations can usually be
recovered by workflow rescheduling strategies, statistically non-recoverable temporal
violations can only be recovered by heavy-weight (i.e. more expensive) solutions such
as resource recruitment, stop and restart, processor swapping and workflow
restructuring [26]. A state-of-the-art general temporal violation handling
framework is proposed in [31], where K levels of temporal violations can be defined
according to the handling capability of K available violation handling strategies. In
such a case, a temporal violation can be handled by the violation handling strategy
which has sufficient capability at the least cost. Therefore, the total violation
handling cost can be minimized.
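
The selection rule described above (sufficient capability, least cost) can be sketched as follows. This is only an illustration under stated assumptions, not the framework of [31]: handling capability is modelled as the largest time deficit a strategy can recover, and the `Strategy` type, the `select_strategy` function and all capability and cost values are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Strategy:
    name: str
    capability: float  # assumed: largest time deficit (seconds) it can recover
    cost: float        # assumed: relative cost of applying the strategy


def select_strategy(deficit: float, strategies: List[Strategy]) -> Optional[Strategy]:
    """Return the cheapest strategy whose capability covers `deficit`,
    or None if no available strategy is capable enough."""
    capable = [s for s in strategies if s.capability >= deficit]
    return min(capable, key=lambda s: s.cost, default=None)


# K = 3 hypothetical strategies, ordered from light-weight to heavy-weight.
STRATEGIES = [
    Strategy("workflow rescheduling", capability=20.0, cost=1.0),
    Strategy("resource recruitment", capability=60.0, cost=5.0),
    Strategy("stop and restart", capability=200.0, cost=20.0),
]

light = select_strategy(15.0, STRATEGIES)    # small deficit
heavy = select_strategy(45.0, STRATEGIES)    # beyond rescheduling
beyond = select_strategy(500.0, STRATEGIES)  # beyond every strategy
```

With these values a 15-second deficit is handled by rescheduling, a 45-second deficit needs the more expensive resource recruitment, and a 500-second deficit exceeds every available strategy.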
In addition, to reduce the increasing violation handling cost, the work in [28]
proposed a novel concept of “violation handling point selection” which is to further