from well-known players in library science (Talis 19) and database management
(Oracle 20), as well as other software companies (OpenLink Software 21) have been
active in this space, and efficient management of large-scale datasets remains inte-
gral to the development of the Web of Data. We discuss in more detail how these
triple stores work in Chapter 7.
Another trend of more immediate interest to the GI professional is the opening
up of government data, including geographic data, in both the traditional sense of
using Web APIs and as linked RDF data. Examples of this are the United Kingdom's
data.gov.uk program, which has to date published over 5,400 datasets, from all
central government departments and a number of other public sector bodies and
local authorities, and the data.gov site in the United States, which has made over
6.4 billion RDF triples of U.S. government data available for exploitation. Further
sources, like the Europe-wide publicdata.eu and the Swedish opengov.se, have also
been published without direct government support. The aim is to foster transparency
in government by making the data more easily available and searchable, and to allow
cross-referencing or "mashing up" of the data, semantic browsing, and SPARQL
querying across datasets.
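As a minimal illustration of what SPARQL querying over such published datasets can look like in practice, the sketch below uses Python's SPARQLWrapper library to ask an open-data SPARQL endpoint for dataset records. The endpoint URL is hypothetical, and the use of the DCAT and Dublin Core vocabularies here is an assumption for illustration; actual endpoints and vocabularies vary between publishers.

# Sketch: querying an open-government SPARQL endpoint for dataset metadata.
# The endpoint URL below is hypothetical; substitute a real endpoint.
from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINT = "http://example.gov/sparql"  # hypothetical

query = """
PREFIX dcat: <http://www.w3.org/ns/dcat#>
PREFIX dct:  <http://purl.org/dc/terms/>

SELECT ?dataset ?title
WHERE {
  ?dataset a dcat:Dataset ;
           dct:title ?title .
  FILTER(CONTAINS(LCASE(STR(?title)), "boundary"))
}
LIMIT 10
"""

sparql = SPARQLWrapper(ENDPOINT)
sparql.setQuery(query)
sparql.setReturnFormat(JSON)
results = sparql.query().convert()

for row in results["results"]["bindings"]:
    print(row["dataset"]["value"], "-", row["title"]["value"])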
Recent developments in geosemantics have included suggestions to incorpo-
rate spatial logics such as Region Connection Calculus (Randell, Cui, and Cohn,
1992) for qualitative spatial representation and reasoning as a semantic technology
(Stocker and Sirin, 2009), as well as a proposal for GeoSPARQL (Perry and Herring,
2011). GeoSPARQL is a spatial extension to the SPARQL query language and allows
simple spatial queries to be formed using point, line, and polygon data types. It is
discussed further in Chapter 8.
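A GeoSPARQL query combines ordinary SPARQL graph patterns with topological filter functions over geometry literals. The fragment below is a minimal sketch of such a query: the feature data it assumes and the endpoint it would be sent to are hypothetical, but the geo: and geof: namespaces, the hasGeometry and asWKT properties, and the sfWithin function are taken from the GeoSPARQL vocabulary.

# Sketch: a GeoSPARQL query selecting features whose geometry lies within a
# search polygon given as a WKT literal (lon/lat). The data queried is assumed.
geosparql_query = """
PREFIX geo:  <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>

SELECT ?feature ?wkt
WHERE {
  ?feature geo:hasGeometry ?geom .
  ?geom    geo:asWKT       ?wkt .
  FILTER(geof:sfWithin(?wkt,
    "POLYGON((-0.51 51.28, 0.33 51.28, 0.33 51.69, -0.51 51.69, -0.51 51.28))"^^geo:wktLiteral))
}
"""
# The query string could be submitted with SPARQLWrapper, as in the earlier
# sketch, to any triple store that implements the GeoSPARQL functions.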
As mentioned, we predict that the topic of data provenance will become increas-
ingly important as the number of RDF datasets published on the Web increases.
As witnessed by the current Web 1.0, it can be difficult to confirm the veracity or
accuracy of anything on the Web, and in the semantic sphere, this is even more the
case as context also plays a significant role: The data might have been correct or
useful for the purpose for which it was originally captured, but it can be harder to
answer the question of whether it is still valid when reused in a slightly different
context. There are technical difficulties in how to add information about provenance
or context to an RDF triple as metadata, which are discussed in Chapters 5 and 8,
but resolving these issues remains an important step in the development of the Web
of Data.
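One frequently discussed way of attaching such metadata is to group triples into a named graph and describe that graph with provenance statements; it is only one of several possible approaches, and the sketch below, using Python's rdflib and the W3C PROV-O vocabulary, is illustrative rather than the treatment given in Chapters 5 and 8. All example.org identifiers are hypothetical.

# Sketch: attaching provenance to a group of triples via a named graph.
from rdflib import Dataset, Namespace, URIRef, Literal

EX   = Namespace("http://example.org/")
PROV = Namespace("http://www.w3.org/ns/prov#")

ds = Dataset()

# Put the data triples into a named graph...
g = ds.graph(URIRef("http://example.org/graph/survey2011"))
g.add((EX.Bridgwater, EX.adjacentTo, EX.RiverParrett))

# ...and describe that graph's provenance in the default graph.
meta = ds.default_context
meta.add((g.identifier, PROV.wasDerivedFrom, EX.fieldSurvey2011))
meta.add((g.identifier, PROV.generatedAtTime, Literal("2011-06-01")))

print(ds.serialize(format="trig"))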
While it has frequently been described as orthogonal to the Semantic Web, the
final trend that we wish to mention here is the Social Web, or Web 2.0. Although often
depicted as being about social networks, the Web 2.0 trend is really about the Web
going back to its roots as the “Read-Write Web,” which has facilitated a huge increase
in user-generated content. There are several ways in which this user-generated con-
tent can be used on the Semantic Web. The most straightforward way is for the information on user-generated Web pages like Wikipedia to be scraped and published as RDF (the RDF version of Wikipedia is called DBpedia 22 and contains both GI and non-GI-related data). Other options are to exploit the semistructured information available from user-authored tags, such as the image and video tags on Flickr 23, as in the research carried out at France Telecom using natural language processing
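To make the DBpedia example concrete: DBpedia exposes a public SPARQL endpoint at https://dbpedia.org/sparql, and the sketch below retrieves the WGS84 coordinates that DBpedia records for a place, showing how RDF derived from user-generated Wikipedia content can be reused as GI. The resource and properties shown are typical of DBpedia's output, but the exact data returned depends on the live dataset.

# Sketch: retrieving geographic data for a DBpedia resource that was
# extracted from user-generated Wikipedia content.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("https://dbpedia.org/sparql")
sparql.setQuery("""
PREFIX dbr:   <http://dbpedia.org/resource/>
PREFIX wgs84: <http://www.w3.org/2003/01/geo/wgs84_pos#>

SELECT ?lat ?long
WHERE {
  dbr:Southampton wgs84:lat ?lat ;
                  wgs84:long ?long .
}
""")
sparql.setReturnFormat(JSON)
for row in sparql.query().convert()["results"]["bindings"]:
    print("lat:", row["lat"]["value"], "long:", row["long"]["value"])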