Linked Data and the Semantic Web - Linked Data: A Geographic Perspective

is a kind of URI, but a URI is a more general term that encompasses any item on

the Web. The other sort of URI is a URN, or Uniform Resource Name, which gives

the name of the resource without explaining how to locate or access it. For example,

http://www.ordnancesurvey.co.uk/oswebsite/partnerships/research/ is a URL, while

os:BuildingsAndPlaces/Hospital is a URN, and

http://data.ordnancesurvey.co.uk/doc/50kGazetteer/218013 is a URI for the data resource Southampton in Ordnance

Survey's RDF data.
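The difference between the three example identifiers can be seen by splitting them into their syntactic parts. The following is a minimal sketch using Python's standard urllib.parse module; the helper name describe is hypothetical, and the "names a host" test is only a rough syntactic heuristic, not a formal definition of URL versus URN.

```python
from urllib.parse import urlsplit

# The three identifiers quoted in the text above.
EXAMPLES = [
    "http://www.ordnancesurvey.co.uk/oswebsite/partnerships/research/",
    "os:BuildingsAndPlaces/Hospital",
    "http://data.ordnancesurvey.co.uk/doc/50kGazetteer/218013",
]


def describe(identifier):
    """Return (scheme, host, path) for an identifier (hypothetical helper)."""
    parts = urlsplit(identifier)
    return parts.scheme, parts.netloc, parts.path


for identifier in EXAMPLES:
    scheme, host, path = describe(identifier)
    # Heuristic only: a host component suggests the identifier says where to
    # fetch the resource; without one it merely names the resource.
    kind = "names a host to contact" if host else "name only, no location"
    print(f"{identifier}\n  scheme={scheme!r} host={host!r} path={path!r} ({kind})")
```

Note that the two http identifiers look identical in structure: whether one is treated as a Web page address or as an identifier for a data resource is a matter of how the publisher uses it, not of its syntax.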

The next layer in the stack of Figure 2.2 is XML, the eXtensible Markup Language,

which is a W3C standard for marking up or tagging data or documents. The following

is a very simple XML document:

<?xml version="1.0" encoding="UTF-8"?>

<data>

<sentence lang="en"> Here's a sentence. </sentence>

</data>

The tags are all enclosed in angle brackets <>, with a forward slash in the closing tag to indicate the

end of that particular markup. lang is an attribute denoting which language is

used, in this case, English. The tags that are allowed in a particular XML file can

be specified in a Document Type Definition (DTD) or XML Schema Document

(XSD). This is a listing of which tag names, structures, and attributes are valid for a

particular document. In the example, the DTD might state that <sentence> can only

be used within the <data> tag. A DTD is an example of a schema, also known as

a grammar. A schema constrains the set of tags that can be used in the document,

which attributes can be applied to them, the order in which they appear, and the

allowed hierarchy of the tags.
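To make the tag, attribute, and hierarchy ideas concrete, here is a minimal sketch that parses the example document with Python's standard xml.etree.ElementTree module. That module does not validate against a DTD or XSD, so the function sentences_only_inside_data is a hypothetical, hand-written stand-in for the single rule mentioned above (that <sentence> may only appear within <data>); a real schema would be checked by a validating parser.

```python
import xml.etree.ElementTree as ET

# The example document from the text, with plain ASCII quotes.
XML_DOC = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    "<data>\n"
    '<sentence lang="en"> Here\'s a sentence. </sentence>\n'
    "</data>\n"
)


def sentences_only_inside_data(root):
    """Hand-written stand-in for one DTD rule: every <sentence> is a child of <data>."""
    # Map each element to its parent; the root element has no entry.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    return all(
        el in parent_of and parent_of[el].tag == "data"
        for el in root.iter("sentence")
    )


root = ET.fromstring(XML_DOC.encode("utf-8"))
sentence = root.find("sentence")

print(root.tag)                          # the outermost tag
print(sentence.get("lang"))              # the lang attribute
print(sentence.text.strip())             # the marked-up text
print(sentences_only_inside_data(root))  # result of the structural check
```

A document whose outermost tag is <sentence> would still be well-formed XML, but it would fail this check, which is the kind of distinction a schema is meant to capture.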

RDF, the Resource Description Framework, which we have already mentioned, is

often said to be “serialized” in XML. This just means that RDF data can use the XML

tag structure for its markup, following a particular syntax (RDF/XML), so that only a

certain set of tags and ordering is permissible in an RDF file. RDF Schema (RDFS) is not

an XML schema in the DTD sense; it supplies a vocabulary for describing classes and properties.
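As an illustration of RDF written with XML tags, the sketch below reads a tiny RDF/XML fragment with Python's standard xml.etree.ElementTree module and lists the statements it contains. The fragment itself is an assumption made for illustration: it reuses the Southampton URI quoted earlier, but the rdfs:label statement is not taken from Ordnance Survey's published data. The function simple_triples is hypothetical and handles only flat rdf:Description elements with literal values, a small subset of the RDF/XML syntax; a dedicated RDF library would be needed for anything more.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Illustrative fragment only: the label statement is an assumption,
# not a copy of the published Ordnance Survey data.
RDF_XML = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <rdf:Description rdf:about="http://data.ordnancesurvey.co.uk/doc/50kGazetteer/218013">
    <rdfs:label>Southampton</rdfs:label>
  </rdf:Description>
</rdf:RDF>"""


def simple_triples(rdf_xml):
    """List (subject, predicate, literal) statements from flat rdf:Description elements."""
    root = ET.fromstring(rdf_xml)
    triples = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree writes qualified names as "{namespace}local";
            # dropping the braces gives the full predicate URI.
            predicate = prop.tag.replace("{", "").replace("}", "")
            triples.append((subject, predicate, (prop.text or "").strip()))
    return triples


for triple in simple_triples(RDF_XML):
    print(triple)
```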

At the next layer up sits the logic: ontologies, which can be described using OWL,

and rules, for which, among others, there is the RuleML family of languages (Boley

et al., 2011). Alongside these there is the query language SPARQL (Prud'hommeaux

and Seaborne, 2008) that allows SQL-like querying of RDF data (although not OWL

instances). The higher layers, of Unifying Logic, Proof, and Trust, are notably free

of acronyms; this is largely because they have been the least researched, and the

W3C (World Wide Web Consortium) has yet to standardize any languages or tools

to address the problems, although there is now a working group in the area of

provenance. It is the Trust layer that relates to provenance: explanations of why a particular

result has been returned as the answer to a query, where it has come from, and how

reliable it might be. There have been early discussions about “semantic spam,” where

incorrect markup is maliciously added to data, or erroneous links made, resulting in

incorrect answers to queries or misdirection of users' searches. Chapter 8, which

discusses the publishing of information as Linked Data, explains in more detail some of

the strategies that spammers could take and what to watch for.
