LAW: Link-AWare Source Selection
for Virtually Integrating Linked Data
Xuejin Li¹, Zhendong Niu¹, Chunxia Zhang², and Xiaoyang Wang¹
¹ School of Computer Science, Beijing Institute of Technology
xuejinli7@gmail.com
zniu@bit.edu.cn
qduwxy@126.com
² School of Software, Beijing Institute of Technology
cxzhang@bit.edu.cn
Abstract. With the wide adoption of linked data principles, a large
amount of structured data has emerged on the World Wide Web. These
data are interlinked and form a Web of Data. So far, however, little
attention has been paid to the effect of links on federated querying. This
work presents LAW, a novel link-aware approach to federated SPARQL
queries over the Web of Data. The source selection module of LAW (called
LAWS) can be combined directly with existing federated query engines
to achieve the same query recall while querying fewer datasets. We
extend three well-known federated query engines with LAWS and compare
our extensions with the original approaches. The comparison shows that
LAWS greatly reduces the number of queries sent to the endpoints while
maintaining high query recall. It can therefore significantly improve
the performance of federated query processing engines. We have also
implemented LAW as an independent system. An extensive experimental
study shows that LAW outperforms state-of-the-art federated query systems.
Keywords: federated query processing, SPARQL, Web of Data.
1 Introduction
With the wide adoption of linked data principles, the World Wide Web has
evolved from a global information space of linked documents to one where both
documents and data are linked [2]. The large amount of structured data on the
Web enables new types of applications that aggregate data from different
sources and integrate fragmentary information into a more complete view.
Answering queries across multiple distributed Linked Data sources is a key
challenge in developing this kind of application.
Federated querying over distributed data sources is called virtual data
integration. A user query is decomposed into several sub-queries, which are
distributed to autonomous data sources. Each source executes its sub-queries
and returns results, which are then integrated locally. There are a high number of links in the