Publishing Linked Data - Linked Data: A Geographic Perspective

Other approaches to encoding provenance have been proposed, for example, the Open Provenance Model (Moreau et al., 2011), an OWL ontology that allows additional descriptions of provenance based on Agents (authors, publishers, etc.), Processes (e.g., reslicing of data), and Artifacts (e.g., an RDF graph that was generated by a process). At the time of writing, several other vocabularies could be used to describe provenance, including the Changeset Vocabulary, Provenance Vocabulary, and Semantic Web Publishing Vocabulary. A W3C Provenance Interchange Working Group is under way, tasked with providing mappings between these various provenance formats. As the technology is in a state of flux, with no clear de facto standard, we simply recommend that you specify the provenance of your GI Linked Data using one of these vocabularies. For those hoping to reuse your data, it is useful to include descriptions of who wrote and published your GI Linked Data and any limitations on the accuracy or frequency of your surveys or other data-gathering techniques.
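
As a rough illustration of the Agent, Process, and Artifact pattern, the following minimal Python sketch records who published a graph, which process generated it, and a free-text limitation on its accuracy, and then serializes the result as N-Triples. The namespaces, the property names (`wasGeneratedBy`, `wasControlledBy`, `limitation`), and the sample identifiers are placeholders chosen for this sketch; they are not the published URIs of any of the vocabularies named above, and a real dataset would substitute terms from whichever vocabulary it adopts.

```python
from dataclasses import dataclass

# Hypothetical namespaces: placeholders, not the URIs of any real vocabulary.
PROV = "http://example.org/provenance#"
DATA = "http://example.org/data/"


class Literal(str):
    """Marks a plain string value so it is not serialized as an IRI."""


@dataclass(frozen=True)
class ProvenanceRecord:
    artifact: str  # e.g. the RDF graph being described
    process: str   # e.g. the conversion run that generated it
    agent: str     # e.g. the organisation that published it
    note: str      # limitation on accuracy or survey frequency

    def triples(self):
        """Return (subject, predicate, object) statements for this record."""
        artifact = DATA + self.artifact
        process = DATA + self.process
        agent = DATA + self.agent
        return [
            (artifact, PROV + "wasGeneratedBy", process),
            (process, PROV + "wasControlledBy", agent),
            (artifact, PROV + "limitation", Literal(self.note)),
        ]


def to_ntriples(triples):
    """Serialize triples as N-Triples lines, quoting and escaping literals."""
    lines = []
    for s, p, o in triples:
        if isinstance(o, Literal):
            obj = '"' + o.replace("\\", "\\\\").replace('"', '\\"') + '"'
        else:
            obj = f"<{o}>"
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


if __name__ == "__main__":
    # Sample identifiers and note are invented for illustration only.
    record = ProvenanceRecord(
        artifact="graph/rivers",
        process="process/conversion-run",
        agent="agent/mapping-agency",
        note="Resurveyed once a year",
    )
    print(to_ntriples(record.triples()))
```

Keeping the limitation as a literal attached to the artifact is only one possible design; a richer description might attach it to the survey process instead.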

7.8 AUTHENTICATION AND TRUST

A word now on the various other aspects of data quality assessment, namely, authentication and trust. Authentication contributes to the establishment of trust and includes mechanisms such as verifying a URI, controlling access to a resource, or using digital signatures, while trust is more of a social concept and remains harder to mechanize.

The Named Graphs API for Jena (NG4J) is one software library that can be used to produce digital signatures for Linked Data and so contribute to the authentication process. It can be particularly helpful in verifying that the provenance metadata does indeed belong to the Linked Dataset itself. To sign a Named Graph and store its digital signature, NG4J first finds the graph's canonical representation, that is, a representation that specifies which nodes of the graph are adjacent to which other nodes. Second, a digest of the canonical graph is calculated using a common secure hash function (for example, SHA-1). The digest is represented as its own named graph, which is called the Warrant Graph. In turn, the canonical representation of the Warrant Graph is taken and signed with the data publisher's private key using a standard signature algorithm such as DSA or RSA. This signature is added to the Warrant Graph, and the signed Warrant Graph can then be published.

To check whether the digital signature of a Named Graph is valid, NG4J carries out the following verification process. First, the digital signature is extracted from the Warrant Graph of the Named Graph, along with the public key of the information publisher. The public key is used to verify the signature of the Warrant Graph, that is, to check that the signature does indeed belong to the information publisher. Second, the canonical representation of the Named Graph is found and a digest is created using the SHA-1 hash function. This digest is compared against the digest in the Warrant Graph, and if they are the same, then the Named Graph has a valid signature.
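
The sign-then-verify flow described above can be sketched in a few lines of Python. This is not NG4J's API (NG4J is a Java library) but a simplified illustration under several assumptions: triples are plain string tuples with no blank nodes, so sorting them is enough to obtain a canonical form; the warrant is a small dictionary, not a true named graph; the `ex:digest` property is hypothetical; and, so that the sketch runs with the standard library alone, an HMAC over a shared secret stands in for the DSA or RSA public-key signature that the text describes.

```python
import hashlib
import hmac


def canonicalize(triples):
    """Deterministic byte form of a set of triples (assumes no blank nodes)."""
    lines = sorted(f"{s} {p} {o} ." for s, p, o in set(triples))
    return "\n".join(lines).encode("utf-8")


def digest(triples):
    """Hex SHA-1 digest of the canonical form (SHA-1 as named in the text)."""
    return hashlib.sha1(canonicalize(triples)).hexdigest()


def build_warrant(graph_name, triples, sign):
    """Record the graph's digest as a statement, then sign that statement."""
    statements = [(graph_name, "ex:digest", digest(triples))]  # hypothetical property
    return {"statements": statements, "signature": sign(canonicalize(statements))}


def verify(graph_name, triples, warrant, check_signature):
    """Check the warrant's signature, then recompute and compare the digest."""
    signed_bytes = canonicalize(warrant["statements"])
    if not check_signature(signed_bytes, warrant["signature"]):
        return False
    expected = [(graph_name, "ex:digest", digest(triples))]
    return warrant["statements"] == expected


# Stand-in signer: a keyed hash with a shared secret. A real publisher would
# sign with a DSA or RSA private key and verifiers would use the public key.
SECRET = b"demo-key"


def sign(data):
    return hmac.new(SECRET, data, hashlib.sha256).hexdigest()


def check_signature(data, signature):
    return hmac.compare_digest(sign(data), signature)


if __name__ == "__main__":
    graph = [
        ("ex:river1", "ex:flowsInto", "ex:sea1"),
        ("ex:river1", "rdfs:label", '"River One"'),
    ]
    warrant = build_warrant("ex:graph1", graph, sign)
    print(verify("ex:graph1", graph, warrant, check_signature))
```

Because the digest is taken over the canonical form, reordering the triples leaves the signature valid, whereas adding, removing, or altering a triple changes the digest and causes verification to fail.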

While provenance provides the input information to a trust measurement algorithm, the degree of trust itself is the answer to the question, "Is this data good enough to use?", and is often based in part on who else thinks the data is good enough to use.
