Our contribution: to the best of our knowledge, this is the first paper dealing
with the problem of cloud infrastructure autoscaling using spot instances in
the context of scientific workflows. The next section describes a novel autoscaling
strategy designed to achieve superior cost-performance in the execution of
scientific workflows. The strategy proposed in this work differs from others
in the following features:
1. it exploits spot instances to achieve a better overall cost-performance, aiming
at reducing the time and/or cost of scientific workflow executions, and
2. it uses a heuristic method for workflow makespan optimization, which
schedules critical tasks intelligently, minimizing the effect of failures on the
overall running time.
3 Spot Instances Aware Autoscaling
The aim of the Spot Instances Aware Autoscaling (SIAA) strategy proposed
in this work is to achieve a better cost-performance of scientific workflows on
the cloud. This is attained, first, by acquiring an infrastructure comprising
on-demand and spot instances according to the computation requirements for the
next hour and, second, by minimizing the overall makespan and reducing the
probability of task failures due to out-of-bid errors.
The strategy performs the autoscaling process on an hourly basis through a
sequence of four phases, namely: (i) information update, (ii) infrastructure scaling,
(iii) heuristic task scheduling, and (iv) shutdown of idle instances. The purpose of
each of these phases is explained in the following subsections. The execution
interval of SIAA was set to 1 hour, as in alternative autoscaling strategies [9].
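The hourly cycle described above can be sketched as a fixed sequence of phase handlers. This is only an illustrative skeleton under our own assumptions: the names PHASES, run_autoscaling_cycle and the handler signature are hypothetical and are not taken from the SIAA implementation, and the handlers themselves are left to the caller.

```python
from typing import Callable, Dict, List

# Phase names follow the four phases listed in the text; the identifiers
# themselves are hypothetical and chosen only for this sketch.
PHASES: List[str] = [
    "information_update",
    "infrastructure_scaling",
    "heuristic_task_scheduling",
    "shutdown_idle_instances",
]


def run_autoscaling_cycle(
    handlers: Dict[str, Callable[[dict], None]], state: dict
) -> List[str]:
    """Run one autoscaling cycle: call each phase handler in order.

    `state` is a mutable dictionary shared by all phases (an assumption of
    this sketch). Returns the names of the phases that were executed.
    """
    executed = []
    for phase in PHASES:
        handlers[phase](state)  # each handler reads and updates the shared state
        executed.append(phase)
    return executed
```

In a deployment, a scheduler would call run_autoscaling_cycle once per execution interval (1 hour in the text); the sketch deliberately leaves the timing mechanism out.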
3.1 Phase 1: Information Update
Every time SIAA is invoked it updates the workflow execution information.
This phase of the algorithm is fundamental because (i) it permits a dynamic
adaptation of the strategy to changes in the infrastructure, and (ii) it reduces
the adverse effects of errors in performance and bid price prediction methods. In
other words, having updated information allows more accurate decision making
in the following phases of autoscaling.
When SIAA is invoked it updates the state of the instances and predicts the
remaining execution time of already running tasks. For waiting tasks, it
updates the duration and the (earliest and latest) start times, and identifies which of
those tasks are critical for executing the workflow in minimum time.
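As an illustration of how start times and critical tasks can be derived from task durations and dependencies, the following sketch applies the standard critical path method to a workflow DAG. It is an assumption on our part that critical tasks are those with zero slack (earliest start equal to latest start); the function name start_times and its data layout are hypothetical and not taken from SIAA.

```python
from collections import deque
from typing import Dict, List, Set, Tuple


def start_times(
    durations: Dict[str, float], deps: Dict[str, List[str]]
) -> Tuple[Dict[str, float], Dict[str, float], Set[str], float]:
    """Return (earliest starts, latest starts, critical tasks, makespan).

    `durations` maps each task to its estimated duration; `deps` maps a
    task to the list of its predecessor tasks (tasks without entry have none).
    """
    succs = {t: [] for t in durations}
    indeg = {t: 0 for t in durations}
    for task, preds in deps.items():
        for pred in preds:
            succs[pred].append(task)
            indeg[task] += 1

    # Topological order (Kahn's algorithm).
    queue = deque(t for t in durations if indeg[t] == 0)
    order = []
    while queue:
        task = queue.popleft()
        order.append(task)
        for nxt in succs[task]:
            indeg[nxt] -= 1
            if indeg[nxt] == 0:
                queue.append(nxt)
    if len(order) != len(durations):
        raise ValueError("workflow graph contains a cycle")

    # Forward pass: a task can start once all its predecessors have finished.
    est: Dict[str, float] = {}
    for task in order:
        est[task] = max(
            (est[p] + durations[p] for p in deps.get(task, [])), default=0.0
        )
    makespan = max((est[t] + durations[t] for t in order), default=0.0)

    # Backward pass: latest start that does not delay any successor.
    lst: Dict[str, float] = {}
    for task in reversed(order):
        latest_finish = min((lst[s] for s in succs[task]), default=makespan)
        lst[task] = latest_finish - durations[task]

    # Zero slack means any delay of the task delays the whole workflow.
    critical = {t for t in order if abs(lst[t] - est[t]) < 1e-9}
    return est, lst, critical, makespan
```

For example, with durations a=2, b=3, c=1, d=2 and dependencies b and c on a, and d on both b and c, task c has a slack of 2 time units, whereas a, b and d have none and therefore form the critical path.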
Task durations: the duration d_t of a task t can be estimated in practice using
one of the existing performance prediction mechanisms [10]. For the purpose
of our experiments, task durations are estimated using a linear model
relating the task's size and the instance's performance, plus a
uniformly distributed error. Durations are estimated considering the preferred
instance type for each task. The preferred instance type for a waiting
 