be performed, while in the load-balancing configuration (LB for short), which
reflects our proposed system architecture, LMS connects to the Amazon LB
which redirects the connection to the next instance in the LB group.
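The "next instance in the LB group" behaviour can be pictured as round-robin dispatch. The sketch below is only an illustration under that assumption: the text does not state which policy the Amazon LB actually applies, and the class name and instance labels are hypothetical.

```python
from itertools import cycle


class RoundRobinGroup:
    """Hands out the members of an LB group in rotating order.

    Assumption: plain round-robin; a real load balancer may also
    weigh instance health or current load.
    """

    def __init__(self, instances):
        instances = list(instances)
        if not instances:
            raise ValueError("the LB group needs at least one instance")
        # cycle() repeats the sequence endlessly, wrapping to the start.
        self._ring = cycle(instances)

    def next_instance(self):
        """Return the instance that should receive the next connection."""
        return next(self._ring)


# Hypothetical instance labels, used for illustration only.
group = RoundRobinGroup(["virtuoso-1", "virtuoso-2", "virtuoso-3"])
picks = [group.next_instance() for _ in range(4)]
```

After the third connection the rotation wraps around, so the fourth connection goes to the first instance again.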
As can be seen, we evaluate not only the new LMS system but also the old one.
To keep the comparison of the two LMS systems fair, we assume that the old
system has two scaling layers, both of which are relevant to the content of the
cross-layer queries. This is the minimal meaningful configuration of the LMS
system for the comparison: in the degenerate case where just one scaling layer
is involved, the performance of the old LMS system will be either better than
(if many nodes are involved) or equivalent to (if just one node is involved in
the layer) that of the new LMS system configured in standalone mode.
We should also note that we decided it is not meaningful to compare the
performance of the two versions of the system in the distributed case. This
decision was based on the results of the standalone configuration, where the
old system clearly performed worse than the new one for cross-layer queries
(which span different scaling layers). In the distributed case, the new system
is expected to outperform the old one again, and by an even larger margin.
Each experiment focused on a particular type of query (both are shown in
the paper's appendix):
- geospatial query: the first query retrieves particular information (e.g.,
latitude, longitude, depth, magnitude and date) for all earthquakes that have
occurred in the region of the island of Crete (mainland and surrounding sea
within a particular radius). Apart from scanning a large amount of RDF data
(around 2 million triples), this query imposes a geospatial filter on the area
where the returned earthquakes occurred. Thus, besides the normal RDF indices,
it also requires a geospatial index. Consequently, this query is quite
demanding in processing effort: when a particular query load is imposed on
just one Virtuoso engine, the hosting VM is expected to reach a high CPU load,
leading to a continuous deterioration of query response time.
- complex normal query: the second query is more complex than the first, as it
includes optional clauses and also processes a larger amount of data. In fact,
the data set over which the query is posed comprises 253 million triples, more
than a hundred times the size of the data set used by the first query. This
query retrieves all lithostratigraphy analysis samples related to various
areas in Europe along with additional information, such as the drilling depth,
the elevation, the reliability of the sample, and the minimum and maximum
depth involved in the respective sampling activity. While it does not require
a geospatial index, the huge amount of data to be processed and the use of
optional clauses are expected to make this query more demanding than the first
in terms of both main memory and CPU load.
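To make the geospatial filter of the first query concrete, the following sketch applies a radius predicate to a list of earthquake records using the haversine great-circle distance. It is an illustration only, not the query from the appendix: the centre coordinates, the radius, the record fields and the sample events are all hypothetical, and the linear scan shown here is precisely the work that a geospatial index is meant to avoid.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(events, center, radius_km):
    """Keep the events whose epicentre lies within radius_km of center."""
    lat0, lon0 = center
    return [e for e in events
            if haversine_km(lat0, lon0, e["lat"], e["lon"]) <= radius_km]


# Hypothetical values: an approximate centre point for Crete and
# made-up sample events, not taken from the evaluated data set.
CRETE_CENTER = (35.24, 24.81)
events = [
    {"id": "near-heraklion", "lat": 35.34, "lon": 25.13, "magnitude": 4.1},
    {"id": "near-athens", "lat": 37.98, "lon": 23.73, "magnitude": 3.2},
]
nearby = within_radius(events, CRETE_CENTER, radius_km=150.0)
```

With a 150 km radius only the first sample event is retained; the second lies roughly 320 km from the chosen centre.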