Another aspect of metadata comes into play when new data products are
being generated either as a result of analysis or simulation. These derived data
are now scientific products and need to be described with appropriate meta-
data information. In some disciplines such as astronomy (see Section 12.5.1),
the community has developed standard file formats that include metadata
information in the header of the file. These are often referred to as “self-
describing data formats,” because each file stored in such a format has all
the necessary metadata in its header. Software is then able to read this meta-
data and to generate new data products with the appropriate headers. One
of the difficulties of this approach is automatically cataloging the
derived data. To do that, some process must read the
file headers, extract the information, and place it in a metadata catalog.
In terms of metadata management, astronomy seems to be ahead of
other disciplines, possibly because in addition to the astronomy community,
the discipline appeals to many amateurs. As a result, astronomers needed to
face the issue of data and metadata publication early on, making their data
broadly accessible. Other disciplines of science are still working on the de-
velopment of metadata standards and data formats. Without those, software
cannot generate descriptions of the derived data. Even when the standards
are formed within the community, there are often a number of legacy codes
that need to be retrofitted (or wrapped) to be able to generate the necessary
metadata descriptions as they generate new data products.
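The header-to-catalog step described above can be sketched as follows. This is a minimal illustration, not any community's actual format or tool: the "KEY = value" header layout terminated by an END line, the field names (OBJECT, INSTRUME, DERIVED_FROM), the file names, and the helper functions read_header and catalog_file are all assumptions made for the example. The catalog is an in-memory SQLite table.

```python
import sqlite3


def read_header(text):
    """Parse 'KEY = value' header lines up to a line reading 'END'.

    The header layout is a hypothetical stand-in for a self-describing format.
    """
    header = {}
    for line in text.splitlines():
        line = line.strip()
        if line == "END":
            break  # everything after END is data, not metadata
        if "=" in line:
            key, value = line.split("=", 1)
            header[key.strip()] = value.strip()
    return header


def catalog_file(conn, name, text):
    """Store every header field of one file as a row in the catalog table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS catalog (file TEXT, key TEXT, value TEXT)"
    )
    rows = [(name, k, v) for k, v in read_header(text).items()]
    conn.executemany("INSERT INTO catalog VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


# Hypothetical derived data product: header, END marker, then the data itself.
SAMPLE = """\
OBJECT = M31
INSTRUME = hypothetical-camera
DERIVED_FROM = raw_0001.dat
END
0.1 0.2 0.3
"""

conn = sqlite3.connect(":memory:")
catalog_file(conn, "derived_0001.dat", SAMPLE)

# The catalog can now answer queries without opening the files again.
found = conn.execute(
    "SELECT file FROM catalog WHERE key = ? AND value = ?", ("OBJECT", "M31")
).fetchall()
```

A legacy code that does not write such headers could, under the same assumptions, be wrapped by a script that prepends the header to its output before calling catalog_file.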
Finally, another perspective about metadata is whether a piece of meta-
data reflects an objective point of view about the resources that are being
described or only a subjective point of view about them. In the former case, the
term metadata is usually used, but in the latter the more specific term anno-
tation is more commonly used. Annotations are normally produced manually
by humans and reflect the point of view of those humans with respect to the
objects being described. These annotations are also known as social annotations,
to reflect the fact that they can be provided by a large number of
individuals. They normally consist of sets of tags that are manually attached
to the resources being described, without a structured schema to be used for
this annotation or a controlled vocabulary to be used as a reference. These
types of annotations provide an additional point of view over existing data
and metadata, reflecting the common views of a community, which can be
extracted from the most common tags used to describe a resource. Flickr and
del.icio.us are examples of services used to generate this type of metadata
for images and bookmarks, respectively, and are being used in some cases in
scientific domains [3].
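Extracting a community view from the most common tags can be sketched as a simple count. This is only an illustration under stated assumptions: the user names, the tags, and the helper common_tags are invented for the example, and the normalization (lowercasing and trimming whitespace) is one possible way to cope with the lack of a controlled vocabulary, not a method described in the text.

```python
from collections import Counter


def common_tags(annotations, top_n=3):
    """Return the top_n tags ranked by how many distinct users applied them."""
    counts = Counter()
    for user_tags in annotations.values():
        # Normalize case and whitespace; the set counts each tag once per user.
        counts.update({tag.strip().lower() for tag in user_tags})
    return counts.most_common(top_n)


# Hypothetical free-form tags attached by three users to the same image.
annotations = {
    "user_a": ["galaxy", "Spiral", "m31"],
    "user_b": ["galaxy", "andromeda", "spiral "],
    "user_c": ["Galaxy", "night sky"],
}

top = common_tags(annotations, top_n=2)
```

Counting each tag once per user keeps a single prolific annotator from dominating the result, which fits the idea that the common tags reflect the views of many individuals.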
There are also other types of annotations that are present in the scientific
domain. For example, researchers in genetics annotate the Mouse Genome
Database with information about the various genes, sequences, and phenotypes.
All annotations in the database are supported with experimental evidence
and citations, and are curated. The annotations also draw from a standard
vocabulary (normally in the form of controlled vocabularies, thesauri,