his/her original data and have it published in the system (e.g., for privacy reasons
the provider does not desire to publish all of his/her data); (b) the provider has
relational data and desires to import them into the system. An R2RML specification
is then defined and fed to the system, which initiates the unidirectional
relational-to-LD mapping process and also caters for keeping the two forms of data
synchronized. This automated publishing way is appropriate when the provider's
data are in relational form and the provider is able to define an R2RML
specification. The data provider can control which data are transformed and
imported via the R2RML specification; (c) the provider has XML-based data and
needs to transform and store them in the system. This publishing way is similar
to the previous one with the following exceptions: XML rather than relational data
are involved, the XML data need to be provided inline in the respective method
request, and the synchronization is not fully automated, as new data need to be
imported indirectly into the system by calling the addXSLMappings method again,
as sketched below.
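For publishing way (c), the interaction reduces to a request that carries both the XSL mapping and the XML data inline. The following is a minimal sketch under stated assumptions: only the method name addXSLMappings comes from the text, while the endpoint URL, payload keys, and response handling are hypothetical.

```python
# Minimal sketch of publishing way (c): registering an XSL mapping and pushing
# XML data inline via the addXSLMappings method. The endpoint URL, payload keys
# and response format below are assumptions made for illustration only.
import requests

ADD_XSL_MAPPINGS_URL = "http://example.org/ld-system/addXSLMappings"  # hypothetical

xsl_mapping = """<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- rules that transform the provider's XML into RDF/XML go here -->
</xsl:stylesheet>"""

xml_data = """<earthquakes>
  <earthquake id="eq-001" magnitude="4.2" country="GR"/>
</earthquakes>"""

response = requests.post(
    ADD_XSL_MAPPINGS_URL,
    json={
        "xslMappings": [xsl_mapping],  # the XSLT(s) mapping the XML to LD
        "xmlData": xml_data,           # XML data provided inline in the request
    },
    timeout=30,
)
response.raise_for_status()
print("Status:", response.status_code)
```

Because synchronization is not fully automated for XML sources, newly produced XML data would be pushed by repeating such a call.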
5 Linked Data Management Architecture
5.1 Previous Architecture Drawbacks and Current Solutions
While the previous architecture is able to address well the need to store a huge
amount of data as well as to perform load balancing in order to guarantee
a certain level of LD query/export performance, it suffers from the following
drawbacks: (a) it is quite costly, as it includes many load balancing components
and even more image instances; (b) the query performance is not adequate
for queries that do not target a particular RDF graph, as query results
from all scaling layers have to be collected and joined before being returned to
the user; and (c) updating is performed across all instances of a particular
scaling layer, thus creating increased traffic in the system as well as increasing
the update execution time (which could also deteriorate query performance in
cases or domains where the update frequency is higher).
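Drawback (b) stems from the scatter-gather pattern that queries not bound to a specific RDF graph imposed: every scaling layer had to be contacted and the partial results merged. The sketch below illustrates this pattern with SPARQLWrapper; the per-layer endpoint URLs are illustrative assumptions, not part of the described system.

```python
# Sketch of the scatter-gather querying behind drawback (b): a query that does
# not target a particular RDF graph must be sent to every scaling layer and the
# partial results merged before being returned to the user. The endpoint URLs
# are illustrative assumptions.
from SPARQLWrapper import SPARQLWrapper, JSON

SCALING_LAYER_ENDPOINTS = [            # hypothetical per-layer SPARQL endpoints
    "http://layer1.example.org/sparql",
    "http://layer2.example.org/sparql",
    "http://layer3.example.org/sparql",
]

QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 100"

merged = []
for endpoint in SCALING_LAYER_ENDPOINTS:
    sparql = SPARQLWrapper(endpoint)
    sparql.setQuery(QUERY)
    sparql.setReturnFormat(JSON)
    results = sparql.query().convert()              # one round trip per layer
    merged.extend(results["results"]["bindings"])   # collect and join partial results

print("Total bindings collected from all layers:", len(merged))
```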
To resolve the above drawbacks, it was decided to rely on a simpler architecture
which is less costly and draws additional resources only when really needed. At
the same time, such an architecture provides the necessary sophistication to
adequately handle the challenges of distributed operations. This decision also
relied on the current and forthcoming patterns of system usage, where it is
expected that the majority of user requests will require querying and exporting
functionality rather than updating functionality. In fact, in all of the
applications currently supported by the system, updating was performed sparsely
and, only in some cases, slightly more frequently, in terms of a few times per
day (e.g., consider that a set of earthquakes occurring on the same day in a
certain country does not lead to frequent and large-scale updating of the data
stored in the system). Considering also that testing showed the performance of
Virtuoso to remain stable even for a huge and increasing amount of stored LD, it
was decided that there was no need to partition the data into different RDF
Stores scattered across different instances.
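Under the simplified architecture, querying and exporting are served by a single Virtuoso instance, so no per-layer merging is needed. A minimal export sketch follows, assuming Virtuoso's default SPARQL endpoint URL and a placeholder graph URI; neither is prescribed by the text.

```python
# Sketch of exporting LD from the single Virtuoso instance of the simplified
# architecture: one CONSTRUCT query against one SPARQL endpoint, with no need
# to collect and join partial results from several scaling layers. The endpoint
# uses Virtuoso's default port; the graph URI is a placeholder.
from SPARQLWrapper import SPARQLWrapper, RDFXML

VIRTUOSO_ENDPOINT = "http://localhost:8890/sparql"  # default Virtuoso SPARQL endpoint

sparql = SPARQLWrapper(VIRTUOSO_ENDPOINT)
sparql.setQuery("""
CONSTRUCT { ?s ?p ?o }
WHERE { GRAPH <http://example.org/graphs/earthquakes> { ?s ?p ?o } }
LIMIT 1000
""")
sparql.setReturnFormat(RDFXML)
graph = sparql.query().convert()   # rdflib.Graph holding the exported triples
print(graph.serialize(format="turtle"))
```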