FIGURE 2.2
Scientific data pyramid. (Figure labels: Publications and Linked Data; Data Identification and Linking; Published Data; Structured Data; Raw Data.)
• Raw data collected from observations and experiments (what is actually done according to an initial research model or hypothesis).
• Structured data and data sets that have gone through data filtering and processing (supporting some particular formal model, which is typically refined from the initial model). These data are already stored in repositories and may be shared with collaborating groups of researchers.
• Published data that support a particular scientific hypothesis, research result, or statement. These data are typically linked to scientific publications as supplemental materials; they may be hosted on the publisher's platform or the authors' institutional platform and have open or licensed access.
• Data linked and embedded into publications to support wide research consolidation, integration, and openness.
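The four levels listed above can be read as an ordered lifecycle in which a data set moves up one level at a time. The following is a minimal Python sketch of that reading, not a model taken from the text: the names `Stage`, `Dataset`, and `promote`, and the rule that a data set advances exactly one level per step, are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import IntEnum


class Stage(IntEnum):
    """Pyramid levels, ordered from bottom (raw) to top (linked)."""
    RAW = 1          # collected from observations and experiments
    STRUCTURED = 2   # filtered and processed, stored in a repository
    PUBLISHED = 3    # supports a published hypothesis or result
    LINKED = 4       # linked and embedded into publications


@dataclass(frozen=True)
class Dataset:
    name: str        # hypothetical identifier, for illustration only
    stage: Stage


def promote(ds: Dataset) -> Dataset:
    """Return a copy of the data set moved up exactly one pyramid level."""
    if ds.stage is Stage.LINKED:
        raise ValueError("data set is already at the top of the pyramid")
    return Dataset(ds.name, Stage(ds.stage + 1))
```

Using an ordered enumeration keeps comparisons such as `ds.stage >= Stage.PUBLISHED` readable, which may be convenient when access rules differ by level.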
Once the data are published, other scientists must be able to validate and reproduce the data they are interested in and possibly contribute new results. Capturing information about the processes involved in the transformation from raw data to published data therefore becomes an important aspect of scientific data management. Scientific data provenance is thus an issue that SDI providers also need to take into consideration [30].
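One way to picture provenance capture is to record, for every transformation between raw and published data, a description of the step together with digests of its input and output. The sketch below assumes this approach for illustration only; it is not a method described in the text, and the names `ProvenanceStep`, `TrackedData`, `apply_step`, and `chain_is_consistent` are hypothetical.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass(frozen=True)
class ProvenanceStep:
    description: str    # human-readable account of the transformation
    input_digest: str   # SHA-256 of the data before the step
    output_digest: str  # SHA-256 of the data after the step


@dataclass
class TrackedData:
    values: List[float]
    history: List[ProvenanceStep] = field(default_factory=list)


def digest(values: List[float]) -> str:
    """Hash the textual representation of the values."""
    return hashlib.sha256(repr(values).encode("utf-8")).hexdigest()


def apply_step(data: TrackedData, description: str,
               fn: Callable[[List[float]], List[float]]) -> TrackedData:
    """Apply fn to the values and append a provenance record for it."""
    out = fn(data.values)
    step = ProvenanceStep(description, digest(data.values), digest(out))
    return TrackedData(out, data.history + [step])


def chain_is_consistent(data: TrackedData) -> bool:
    """Check that each step starts from the previous step's output
    and that the last recorded output matches the current values."""
    steps = data.history
    for prev, nxt in zip(steps, steps[1:]):
        if prev.output_digest != nxt.input_digest:
            return False
    return not steps or steps[-1].output_digest == digest(data.values)


# Example: raw readings are filtered, then normalized for publication.
raw = TrackedData([1.0, -5.0, 2.0, 3.0])
filtered = apply_step(raw, "drop negative readings",
                      lambda v: [x for x in v if x >= 0])
published = apply_step(filtered, "normalize by maximum",
                       lambda v: [x / max(v) for x in v])
```

A reader who holds the published values and the recorded history can call `chain_is_consistent(published)` to confirm that the history is unbroken, although this does not by itself prove that the descriptions are accurate.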
Another aspect to take into consideration is guaranteeing the reusability of published data within the scientific community. Reuse depends on understanding the semantics of the published data, which traditionally has been worked out manually. However, as we anticipate an unprecedented scale of published data that will be generated in big data science,