However, before sketching and then realizing the new architecture, it had to
be decided how to deal with the issue of updating, as this was creating traffic in
the system if all Virtuoso servers involved in the instances had to be updated.
Furthermore, the problem of updating new instances to the most up-to-date RDF
content had to be resolved. To this end, it was decided: (a) to directly update
a Virtuoso server in only one instance, from now on called the master instance,
via LMS and to propagate the changes only to the instances currently running in
the load balancing component, from now on called slave instances (which are of
course fewer than those used in the previous architecture), and (b) to lazily update the
image used to create new slave instances, from now on called the slave image, in a
timely fashion (e.g., every half an hour) and only when updates have
occurred since the previous image update. While the first decision does not
totally remedy the first problem, we adopted it bearing in mind that
the current (master and slave) instances should be up to date with respect to the
RDF content, while new slave instances can be allowed to be slightly out of date,
as this does not jeopardize the proper functioning of the applications supported
by the system. Such lazy updating was a necessity, given
that image updating can take minutes and is costly, so it cannot be performed
each time a single update is performed in the system.
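The lazy image-update rule described above (refresh at most once per interval, and only if something changed since the last refresh) can be illustrated with a minimal sketch. The class name LazyImagePolicy and its methods are hypothetical and are not taken from the described system; the half-hour interval is passed in by the caller.

```java
import java.time.Duration;
import java.time.Instant;

/**
 * Hypothetical helper (not part of the described system): decides when the
 * slave image should be rebuilt under a "lazy" policy.
 */
final class LazyImagePolicy {
    private final Duration interval;   // e.g., Duration.ofMinutes(30)
    private Instant lastRefresh;       // time of the last image update
    private boolean dirty = false;     // true if an update arrived since then

    LazyImagePolicy(Duration interval, Instant start) {
        this.interval = interval;
        this.lastRefresh = start;
    }

    /** Called whenever an RDF update has been applied on the master. */
    synchronized void recordUpdate() {
        dirty = true;
    }

    /** True only if the interval has elapsed AND an update is pending. */
    synchronized boolean shouldRefresh(Instant now) {
        return dirty && !now.isBefore(lastRefresh.plus(interval));
    }

    /** Called after the image has actually been rebuilt. */
    synchronized void markRefreshed(Instant now) {
        dirty = false;
        lastRefresh = now;
    }
}
```

Under this sketch, a scheduler would call shouldRefresh periodically and trigger the costly image rebuild only when it returns true, so idle periods never cause a rebuild.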
The above decisions had to be properly backed up by the respective technologies
exploited. On the one hand, the free and latest version of Virtuoso does not
allow the automated updating of multiple Virtuoso servers that might form a
cluster. Such updating is a proprietary feature in all Virtuoso versions.
To this end, we developed our own mechanism for updating
the currently running Virtuoso servers by exploiting the underlying SQL
functionality of Virtuoso. Initially, we created triggers on the master instance
that fired whenever the main RDF table of the respective Virtuoso server
(i.e., the one named RDF_QUAD) was updated, in order to update the Virtuoso servers in
the (running) slave instances. However, this ended up becoming quite slow as
the update was finished only when all Virtuoso servers were updated. To solve
this, we decided to follow a log-based approach where the triggers write into a
specific file what is updated (in the form of actual SQL statements) and then a
Java program consumes the entries of this log file and is responsible for updating
the remaining Virtuoso servers. This component, named Updater, is
also responsible for updating the slave image only every half an hour and only
when an update has occurred since the last slave image update. It exploits the
Amazon Web Services SDK for Java (http://aws.amazon.com/documentation/sdk-for-java/)
to find out the IPs of the remaining Virtuoso instances as well
as to perform the slave image updating. Through this solution, the LD updating
ends when the Virtuoso instance receiving the update request finishes processing
it; the Virtuoso servers of the slave instances are updated subsequently via the
Updater. As such, there is no actual delay in performing LD updating and we
allow for a small inconsistency until LD updating is propagated to the remaining
Virtuoso servers which, as already stated, is acceptable.
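The log-based propagation described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual Updater: the names SlaveServer and UpdateLogReplayer are hypothetical, the log is assumed to hold one UTF-8 SQL statement per line, and the connection to each slave Virtuoso server is abstracted behind an interface.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical abstraction of one slave server that can run a SQL statement. */
interface SlaveServer {
    void execute(String sql) throws Exception;
}

/** Hypothetical sketch: replays newly logged statements on every slave. */
final class UpdateLogReplayer {
    private final Path log;
    private long offset = 0; // bytes of the log already replayed

    UpdateLogReplayer(Path log) {
        this.log = log;
    }

    /** Applies statements appended since the last call; returns how many. */
    int replay(List<SlaveServer> slaves) throws Exception {
        List<String> statements = new ArrayList<>();
        long newOffset = offset;
        try (RandomAccessFile raf = new RandomAccessFile(log.toFile(), "r")) {
            long length = raf.length();
            if (length <= offset) {
                return 0; // nothing new in the log
            }
            byte[] buf = new byte[(int) (length - offset)];
            raf.seek(offset);
            raf.readFully(buf);
            int lineStart = 0;
            for (int i = 0; i < buf.length; i++) {
                if (buf[i] == '\n') { // assumption: one statement per line
                    String sql = new String(buf, lineStart, i - lineStart,
                            StandardCharsets.UTF_8).trim();
                    if (!sql.isEmpty()) {
                        statements.add(sql);
                    }
                    lineStart = i + 1;
                }
            }
            // Only complete lines are consumed; a half-written last line
            // is left for the next call.
            newOffset = offset + lineStart;
        } catch (IOException e) {
            throw e;
        }
        for (String sql : statements) {
            for (SlaveServer slave : slaves) {
                slave.execute(sql);
            }
        }
        // Advance only after every slave accepted every statement. On failure
        // the batch is retried, so some statements may be applied twice.
        offset = newOffset;
        return statements.size();
    }
}
```

Because the master only appends to the log, the update request returns as soon as the master has processed it, and a loop calling replay periodically brings the slaves up to date afterwards. This mirrors the small, accepted window of inconsistency mentioned above.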
On the other hand, the (basic) load balancer (LB) offered by the currently
exploited cloud (Amazon EC2) does not offer the capability to route an update