system in order to reach good levels of query performance. Otherwise, we also
risk that the system might go down, as it cannot be expected to operate at its
limits for a long time.
Both experiments show that query time cannot be used as a scaling factor,
since the query times exhibited by Virtuoso remain at an acceptable, almost
stable level. CPU usage, on the other hand, can be used as a scaling factor: it
immediately indicates that the VM concerned has a hard time servicing the
requests of the concurrent users. In fact, the VM's CPU usage reaches quite
high values, which can be considered dangerous for the health of the VM if they
persist for a long time. Thus, the system needs to scale and obtain more
resources in order to spread the incoming load evenly across all the reserved
resources. The results show that a CPU threshold of 70 % can safely be used to
determine when to scale.
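The threshold rule above can be sketched as a small check. This is only an
illustrative sketch, not the authors' implementation: the function name, the
constant name, and the choice to average the readings across the reserved VMs
are assumptions; only the 70 % value comes from the text.

```python
from statistics import mean

# 70 % is the threshold reported in the text; the name is hypothetical.
CPU_SCALE_THRESHOLD = 70.0


def exceeds_cpu_threshold(vm_cpu_percent, threshold=CPU_SCALE_THRESHOLD):
    """Return True when the mean CPU usage of the VMs is above the threshold.

    vm_cpu_percent: one CPU usage reading (0-100) per reserved VM.
    Averaging across VMs is an assumption made for this sketch.
    """
    if not vm_cpu_percent:
        raise ValueError("at least one CPU reading is required")
    return mean(vm_cpu_percent) > threshold
```

For example, readings of 85 % and 90 % exceed the threshold, whereas readings
of 40 % and 60 % do not.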
One could argue that such a limit is quite low with respect to the peak
values exhibited in the two experiments. However, we set this limit at a much
lower value in order to cater for cases where splitting the query work does not
lead to a sharp decrease in CPU time, which indicates that the resources need
to be increased further. This has been checked through other experiments with
the rest of the queries, which show that this threshold really discriminates
when Virtuoso has a hard time servicing the user requests. These other
experiments assessed queries whose complexity lies in between those of the two
queries considered, and the behaviour observed likewise lies in between the
behaviours exhibited for those two queries. For this reason, we have chosen not
to show these experiment results in this article. Based on the above analysis,
the CPU threshold determination method can be considered rather complete, as it
takes an exhaustive approach to guarantee that the choice made is correct.
The main question is then how long to wait before scaling, given that the
average CPU value constantly remains above the threshold obtained. The
experiments show that the checking period should be as short as possible in
order to offload the current VMs from the load that they are handling. To this
end, it was decided that the checking period should be 2 min, so that we are
confident that we are experiencing not a temporary spike in load but a high
load that is more or less constant. This period length is appropriate for both
experiment cases, as it also covers the greater need for immediate reactiveness
in the first experiment case (see the smaller response times for this case
with respect to the second). Considering also that it takes time (some
minutes) to create a new instance, the chosen period length seems appropriate.
With a higher value, we run the danger of reserving more resources when it is
already too late with respect to the load incurred by the current instance.
Again, this choice is backed by an exhaustive approach, both in real time and
in extreme synthetic cases, for all types of queries issued by the respective
applications. Thus, it seems to be the most appropriate solution for the
current situation as well as for forthcoming ones, once our system is exposed
to additional end-user applications.
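The rule described here, scaling only when the average CPU value stays above
the threshold for the whole checking period, can be sketched as follows. This
is a minimal sketch under stated assumptions: the class and method names are
hypothetical, timestamps are assumed to be in seconds, and any reading at or
below the threshold is assumed to restart the period. Only the 70 % threshold
and the 2 min period come from the text.

```python
class SustainedCpuTrigger:
    """Fire only when CPU stays above the threshold for the whole period."""

    def __init__(self, threshold=70.0, period_s=120.0):
        self.threshold = threshold    # percent, from the text
        self.period_s = period_s      # 2 min checking period, from the text
        self._above_since = None      # time of first reading above threshold

    def observe(self, timestamp_s, avg_cpu_percent):
        """Record one reading; return True when scaling should be triggered."""
        if avg_cpu_percent <= self.threshold:
            # Load dropped back: treat earlier readings as a temporary spike.
            self._above_since = None
            return False
        if self._above_since is None:
            self._above_since = timestamp_s
        return timestamp_s - self._above_since >= self.period_s
```

Fed with readings of 80 %, 85 % and 90 % at 0 s, 60 s and 120 s, the trigger
fires on the third reading. A single 95 % reading followed by 40 % does not
fire, which matches the intent of ignoring temporary spikes.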