from existing annotations, since they are expressed in formal languages with
a clear underlying logical theory.
Not all ontologies have the same degree of formality; neither do they include
all the components that could be expressed with formal languages, such as
concept taxonomies, formal axioms, disjoint and exhaustive decompositions
of concepts, and so forth. Given this fact, ontologies are usually classified
either as lightweight or heavyweight [10]. An example of the former is
Dublin Core [23], which is widely used to describe simple characteristics of
electronic resources through a predefined set of features such as creator,
date, contributor, description, format, and the like. Examples of the latter
would be the ontologies used for workflow annotation in the myGrid project
or for product description in the aforementioned satellite imaging application.
Lightweight ontologies can be specified in simpler formal ontology languages
like RDF Schema [14], whereas heavyweight ontologies require more complex
languages like OWL.
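As a rough illustration of how little machinery a lightweight vocabulary needs, the sketch below describes a resource with a few of the Dublin Core features named above and emits plain (subject, predicate, object) triples. The function name describe, the example resource URI, and the sample values are hypothetical and chosen only for illustration; only the element names and the Dublin Core namespace come from the vocabulary itself.

```python
# Namespace of the Dublin Core element set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Only the features named in the text; the full element set is larger.
DC_ELEMENTS = ("creator", "date", "contributor", "description", "format")


def describe(resource_uri, **features):
    """Return (subject, predicate, object) triples for one resource.

    Hypothetical helper: it rejects any feature outside the predefined set,
    which is all the "schema checking" this lightweight sketch performs.
    """
    unknown = set(features) - set(DC_ELEMENTS)
    if unknown:
        raise ValueError(f"not in the predefined feature set: {sorted(unknown)}")
    return [(resource_uri, DC_NS + name, value) for name, value in features.items()]


triples = describe(
    "http://example.org/report.pdf",  # hypothetical resource
    creator="A. Author",              # hypothetical value
    format="application/pdf",
)
```

A heavyweight ontology would add what this sketch lacks, such as formal axioms and disjointness constraints between concepts, and would normally be handled with an OWL-aware toolkit rather than plain tuples.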
12.3 Provenance
Provenance is commonly defined as the origin, source, or history of
derivation of some object. In the context of art, this term carries a more
concrete meaning: it denotes the record of ownership of an art object. In
this context, such concrete records allow scholars or collectors to verify
and ascertain the origin of the work of art, its authenticity, and therefore
its price.
This notion of provenance can be transposed to electronic data [15]. If the
provenance of data produced by computer systems could be determined as
it can for some works of art, then users would be able to understand how
documents were assembled, how simulation results were determined, or how
analyses were carried out. For scientists, provenance of scientific results would
indicate how results were derived, what parameters influenced the derivation,
what datasets were used as input to the experiment, and so forth. In other
words, provenance of scientific results would support reproducibility [16],
a fundamental tenet of the scientific method.
Hence, in the context of computer systems, we define the provenance of a
data product as the process that led to that data product, where process
encompasses all the derivations, datasets, parameters, software and hardware
components, computational processes, and digital or nondigital artifacts that
were involved in deriving and influencing the data product.
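One way to make this definition concrete is to record each derivation step and then walk backward from a data product to everything that influenced it. The following is a minimal sketch under assumed names: the Derivation record, the provenance function, and the file names in the sample log are all hypothetical and do not correspond to any particular provenance system.

```python
from dataclasses import dataclass, field


@dataclass
class Derivation:
    """One recorded step: a component applied to inputs, producing an output."""
    output: str
    component: str                      # software or hardware component that ran
    inputs: tuple = ()                  # datasets or artifacts consumed
    parameters: dict = field(default_factory=dict)


def provenance(product, derivations):
    """Collect every recorded step that directly or indirectly led to product."""
    by_output = {d.output: d for d in derivations}
    trace, pending, seen = [], [product], set()
    while pending:
        item = pending.pop()
        if item in seen or item not in by_output:
            continue  # already visited, or a raw input with no recorded derivation
        seen.add(item)
        step = by_output[item]
        trace.append(step)
        pending.extend(step.inputs)     # follow the inputs further back
    return trace


# Hypothetical log of three steps; only two of them concern plot.png.
log = [
    Derivation("clean.csv", "filter.py", ("raw.csv",), {"threshold": 0.5}),
    Derivation("plot.png", "plot.py", ("clean.csv",)),
    Derivation("other.txt", "misc.py", ("notes.txt",)),
]
steps = provenance("plot.png", log)
```

Here steps contains the plot.py and filter.py records, including the threshold parameter, while the unrelated misc.py step is left out. The traversal stops at raw.csv because no derivation is recorded for it, which is one practical way of bounding a trace that could otherwise grow without limit.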
Conceptually, such provenance could be extremely large, since potentially it
could bring us back to the origin of time. In practice, such a level of information