The basic idea is to define a point in time p so that tuples of bucket b with
timestamps earlier than p are processed by the old owner and tuples with later
timestamps are processed by the new one. This is straightforward for stateless
operators. However, it is more challenging when reconfiguring stateful operators:
due to the sliding-window semantics used in stateful operators, a tuple might
contribute to several windows, so some tuples need to be processed by both the
old and the new owner. SC reconfigures a subcluster by triggering one or more
reconfiguration actions, each of which changes the ownership of a bucket from the
old owner to the new one within the same subcluster. Reconfiguration actions only
affect the instances of the subcluster being reconfigured and the upstream LBs.
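
To make the handover concrete, the following minimal sketch shows how an LB
could route the tuples of a bucket under reconfiguration, assuming a sliding
window of size W: tuples with timestamps in [p - W, p) still contribute to
windows that close after p, so they are sent to both owners. The types Tuple
and Owner and the field windowSize are illustrative assumptions, not SC's
actual interfaces.

    import java.util.ArrayList;
    import java.util.List;

    interface Owner { void process(Tuple t); }        // illustrative stand-in
    record Tuple(long timestamp, Object payload) {}   // illustrative stand-in

    // Minimal sketch of an LB routing rule during a reconfiguration action.
    final class ReconfigRouter {
        private final long p;          // reconfiguration point in time
        private final long windowSize; // sliding-window size W of the operator
        private final Owner oldOwner;
        private final Owner newOwner;

        ReconfigRouter(long p, long windowSize, Owner oldOwner, Owner newOwner) {
            this.p = p;
            this.windowSize = windowSize;
            this.oldOwner = oldOwner;
            this.newOwner = newOwner;
        }

        // Returns the owners that must process a tuple of the moved bucket.
        List<Owner> route(Tuple t) {
            List<Owner> targets = new ArrayList<>();
            if (t.timestamp() < p) {
                targets.add(oldOwner);               // old owner closes its windows
                if (t.timestamp() >= p - windowSize) {
                    targets.add(newOwner);           // tuple also falls in windows
                }                                    // the new owner closes after p
            } else {
                targets.add(newOwner);               // from p on, only the new owner
            }
            return targets;
        }
    }

Note that a stateless operator corresponds to windowSize = 0, in which case the
rule collapses to a pure timestamp comparison against p.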
Elasticity rules are specified as thresholds that set the conditions that trigger
provisioning, decommissioning, or load balancing. Provisioning and decommissioning
are triggered when the average CPU utilization rises above the upper-utilization-
threshold (UUT) or falls below the lower-utilization-threshold (LUT).
Reconfiguration actions aim at an average CPU utilization as close as possible to
the target-utilization-threshold (TUT). Load balancing is triggered when the
standard deviation of the CPU utilization exceeds the upper-imbalance-threshold
(UIT). A minimum-improvement-threshold (MIT) specifies the minimum performance
improvement required to endorse a new configuration; that is, the new configuration
is applied only if the imbalance reduction is above the MIT. The goal is to keep
the average CPU utilization between the lower and upper utilization thresholds and
the standard deviation below the upper imbalance threshold in each subcluster.

SC features a load-aware
provisioning strategy. When provisioning instances, a naive strategy would be to
provision one instance at a time (individual provisioning). However, individual
provisioning might lead to cascaded provisioning, that is, the continuous
allocation of new instances. This can happen under a steadily increasing load,
when the additional computing power provided by a single new instance does not
bring the average CPU utilization back below the UUT. To overcome this problem,
SC's load-aware provisioning takes the current subcluster size and load into
account to decide how many new instances to provision in order to reach the TUT,
as sketched below.
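
The sketch below encodes both the threshold checks and a load-aware sizing rule.
The concrete threshold values and the sizing formula (scale the current aggregate
load so that per-instance utilization lands on the TUT) are assumptions consistent
with the description above, not SC's published implementation.

    // Minimal sketch of SC-style elasticity rules; threshold values and the
    // sizing formula are illustrative assumptions.
    final class ElasticityRules {
        double uut = 0.85; // upper-utilization-threshold
        double lut = 0.30; // lower-utilization-threshold
        double tut = 0.60; // target-utilization-threshold
        double uit = 0.15; // upper-imbalance-threshold (stdev of CPU utilization)
        double mit = 0.05; // minimum-improvement-threshold

        // How many instances to add (> 0) or decommission (< 0); 0 = no change.
        int resizeDecision(int subclusterSize, double avgCpu) {
            if (avgCpu >= lut && avgCpu <= uut) return 0;
            // Load-aware sizing: the aggregate load is subclusterSize * avgCpu;
            // choose the size whose per-instance utilization is closest to TUT.
            int targetSize = (int) Math.ceil(subclusterSize * avgCpu / tut);
            return Math.max(1, targetSize) - subclusterSize;
        }

        // Load balancing fires when imbalance exceeds UIT, and a candidate
        // configuration is endorsed only if it cuts imbalance by more than MIT.
        boolean endorseRebalance(double stdevCpu, double candidateStdevCpu) {
            return stdevCpu > uit && (stdevCpu - candidateStdevCpu) > mit;
        }
    }

Under a steadily increasing load, ceil(size * avgCpu / TUT) can jump by several
instances in a single step, which is precisely what avoids the cascaded
provisioning described above.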
12.7 STORMY
The Stormy system [18] has been presented as a distributed stream processing
service for continuous data processing that relies on techniques from existing
cloud storage systems, adapted to execute streaming workloads efficiently. It uses
a synthesis of well-known techniques from cloud storage systems [9,15,17] to
achieve its scalability and availability goals. It uses distributed hash tables
(DHTs) [5] to distribute queries across all nodes and to route events from query
to query according to the query graph. To achieve high availability, Stormy uses a
replication mechanism in which queries are replicated on several nodes and events
are executed concurrently on all replicas. As Stormy's architecture is
decentralized by design, there is no single point of failure. However, Stormy
assumes that a query can be executed completely on one node; thus, there is an
upper bound on the rate of incoming events per stream. The API of Stormy consists
of only four functions.
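
The DHT-based distribution can be pictured as consistent hashing: each query
hashes to a position on a ring, the node owning that position executes it, the
next R - 1 nodes on the ring hold its replicas, and an incoming event is delivered
to all replicas for concurrent execution. The sketch below illustrates this
placement under those assumptions; the class and method names are invented for
illustration, and Stormy's actual four-function API is not shown here.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.TreeMap;

    // Minimal consistent-hashing sketch of DHT-style query placement and
    // replication; not Stormy's actual interfaces.
    final class QueryRing {
        private final TreeMap<Integer, String> ring = new TreeMap<>(); // pos -> node
        private final int replicationFactor; // R

        QueryRing(int replicationFactor) { this.replicationFactor = replicationFactor; }

        void addNode(String node) { ring.put(position(node), node); }

        // The node owning the query's ring position plus its successors hold
        // the replicas; events for the query are routed to all of them.
        List<String> replicasFor(String queryId) {
            List<String> owners = new ArrayList<>();
            if (ring.isEmpty()) return owners;
            Integer key = ring.ceilingKey(position(queryId));
            if (key == null) key = ring.firstKey();     // wrap around the ring
            while (owners.size() < Math.min(replicationFactor, ring.size())) {
                owners.add(ring.get(key));
                key = ring.higherKey(key);
                if (key == null) key = ring.firstKey(); // wrap around the ring
            }
            return owners;
        }

        private static int position(String s) {
            return s.hashCode() & 0x7fffffff; // stand-in for the DHT's hash function
        }
    }

Because routing is fully determined by the ring, any node can forward an event
without central coordination, which is what leaves the architecture without a
single point of failure.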