is a kind of URI, but a URI is a more general term that encompasses any item on
the Web. The other sort of URI is a URN, or Uniform Resource Name, which gives
the name of the resource without explaining how to locate or access it. For example,
http://www.ordnancesurvey.co.uk/oswebsite/partnerships/research/ is a URL, while
os:BuildingsAndPlaces/Hospital is a URN, and http://data.ordnancesurvey.co.uk/
doc/50kGazetteer/218013 is a URI for the data resource Southampton in Ordnance
Survey's RDF data.
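As a rough illustration of the distinction, the three identifiers above can be split into their parts with Python's standard urllib.parse module. This is only a sketch: the helper name describe is hypothetical, and a parser sees syntax alone, so it can show whether an identifier carries a host to contact but not how the identifier is meant to be used.

```python
from urllib.parse import urlparse

# The three identifiers quoted in the text above.
identifiers = [
    "http://www.ordnancesurvey.co.uk/oswebsite/partnerships/research/",
    "os:BuildingsAndPlaces/Hospital",
    "http://data.ordnancesurvey.co.uk/doc/50kGazetteer/218013",
]


def describe(uri):
    """Split a URI into scheme, host, and path (hypothetical helper)."""
    parts = urlparse(uri)
    return {"scheme": parts.scheme, "host": parts.netloc, "path": parts.path}


for uri in identifiers:
    print(describe(uri))
```

The two http identifiers carry a host, which says where to look; the os: identifier has a scheme and a path but no host, so it names something without saying where to fetch it.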
The next layer in the stack of Figure 2.2 is XML, the eXtensible Markup Language,
which is a W3C standard for marking up or tagging data or documents. The following
is a very simple XML document:
<?xml version="1.0" encoding="UTF-8"?>
<data>
  <sentence lang="en"> Here's a sentence. </sentence>
</data>
The tags are all enclosed in angle brackets <>, with a forward slash to indicate the
end of that particular markup. lang is an attribute denoting which language is
used, in this case, English. The tags that are allowed in a particular XML file can
be specified in a Document Type Definition (DTD) or an XML Schema Definition
(XSD), which lists the tag names, structures, and attributes that are valid for a
particular document. In the example, the DTD might state that <sentence> can only
be used within the <data> tag. A DTD is an example of a schema, also known as
a grammar. A schema constrains the set of tags that can be used in the document,
which attributes can be applied to them, the order in which they appear, and the
allowed hierarchy of the tags.
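To make the tag and attribute structure concrete, the sample document can be read with Python's standard xml.etree.ElementTree module. This is a minimal sketch, not a validation recipe: ElementTree only checks that a document is well formed and does not validate it against a DTD or XSD, and the second, deliberately broken document is an assumption added here for illustration.

```python
import xml.etree.ElementTree as ET

# The sample document from the text, written with plain ASCII quotes.
document = b"""<?xml version="1.0" encoding="UTF-8"?>
<data>
  <sentence lang="en"> Here's a sentence. </sentence>
</data>"""

root = ET.fromstring(document)
sentence = root.find("sentence")

print(root.tag)               # name of the outermost tag
print(sentence.get("lang"))   # value of the lang attribute
print(sentence.text.strip())  # text between the opening and closing tags

# Illustrative assumption: a document whose <sentence> tag is never closed.
broken = b'<data><sentence lang="en">Oops</data>'
try:
    ET.fromstring(broken)
    well_formed = True
except ET.ParseError:
    well_formed = False

print(well_formed)
```

Checking that <sentence> appears only inside <data> would need a validating parser and a DTD or XSD, which the standard library module above does not provide.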
RDF, the Resource Description Framework, which we have already mentioned, is
often said to be “serialized” in XML. This just means that RDF data can use the XML
tag structure for its markup, following a particular syntax (RDF/XML), so that only a
certain set of tags and ordering is permissible in an RDF file. RDF Schema (RDFS), by
contrast, supplies vocabulary for describing classes and properties rather than
constraining the tags.
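As a hedged sketch of what RDF written in XML can look like, the snippet below describes the Southampton resource mentioned earlier and pulls out its single statement with the standard XML parser. The use of rdfs:label, the label value, and the helper name simple_triples are assumptions made for illustration; the snippet handles only this one simple shape of RDF/XML, and real data would call for a dedicated RDF library.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Illustrative RDF/XML; the rdfs:label statement is an assumption, not OS data.
rdf_xml = b"""<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <rdf:Description rdf:about="http://data.ordnancesurvey.co.uk/doc/50kGazetteer/218013">
    <rdfs:label>Southampton</rdfs:label>
  </rdf:Description>
</rdf:RDF>"""


def simple_triples(xml_bytes):
    """Return (subject, predicate, object) tuples for plain rdf:Description blocks."""
    root = ET.fromstring(xml_bytes)
    triples = []
    for node in root.findall(f"{{{RDF_NS}}}Description"):
        subject = node.get(f"{{{RDF_NS}}}about")
        for prop in node:
            # ElementTree stores qualified names as "{namespace}local";
            # joining the two halves gives the full predicate URI.
            predicate = prop.tag[1:].replace("}", "", 1)
            triples.append((subject, predicate, prop.text))
    return triples


for triple in simple_triples(rdf_xml):
    print(triple)
```

Each child element of rdf:Description becomes one statement: the rdf:about attribute supplies the subject, the element name the predicate, and the element text the object.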
At the next layer up sits the logic: ontologies, which can be described using OWL,
and rules, for which, among others, there is the RuleML family of languages (Boley
et al., 2011). Alongside these there is the query language SPARQL (Prud'hommeaux
and Seaborne, 2008) that allows SQL-like querying of RDF data (although not OWL
instances). The higher layers, of Unifying Logic, Proof, and Trust, are notably free
of acronyms; this is largely because they have been the least researched, and the
W3C (World Wide Web Consortium) has yet to standardize any languages or tools
to address the problems, although there is now a working group in the area of
provenance. It is the Trust layer that relates to provenance: explanations of why a particular
result has been returned as the answer to a query, where it has come from, and how
reliable it might be. There have been early discussions about “semantic spam,” where
incorrect markup is maliciously added to data, or erroneous links made, resulting in
incorrect answers to queries or misdirection of users' searches. Chapter 8, which
discusses the publishing of information as Linked Data, explains in more detail some of
the strategies that spammers could take and what to watch for.