Information Technology Reference
In-Depth Information
attaching clear data semantics becomes a necessary condition for efficient
reuse of published data. Learning from best practices in the semantic web
community on how to provide reusable published data is one of the
considerations that SDI will address.
Big data are typically distributed both on the collection side and on the
processing/access side: Data need to be collected (sometimes in a time-sensitive
way or with other environmental attributes), distributed, or replicated.
Linking distributed data is one of the problems to be addressed by SDI.
The European Commission's initiative to support open access to scientific
data from publicly funded projects suggests introducing the following
mechanisms for linking publications and data [31]:
• PID: persistent data ID [32]
• ORCID: Open Researcher and Contributor Identifier [33].
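As an illustration of how these two identifiers could tie a publication to its data and its authors, the following minimal Python sketch keeps a link record and rejects malformed ORCID iDs. The class name PublicationLink, its fields, and the "pid:example/..." strings are hypothetical and chosen only for this example; the check-character rule (ISO 7064 MOD 11-2) is the one ORCID documents for its identifiers.

```python
from dataclasses import dataclass, field


def orcid_checksum_ok(orcid: str) -> bool:
    """Validate the final check character of an ORCID iD (ISO 7064 MOD 11-2)."""
    chars = orcid.replace("-", "")
    if len(chars) != 16 or not chars[:15].isdigit():
        return False
    total = 0
    for ch in chars[:15]:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    expected = "X" if result == 10 else str(result)
    return chars[15].upper() == expected


@dataclass
class PublicationLink:
    """Hypothetical record linking one publication to data sets and authors."""
    publication_pid: str
    dataset_pids: list[str] = field(default_factory=list)
    author_orcids: list[str] = field(default_factory=list)

    def add_author(self, orcid: str) -> None:
        # Refuse identifiers whose check character does not match.
        if not orcid_checksum_ok(orcid):
            raise ValueError(f"invalid ORCID iD: {orcid}")
        self.author_orcids.append(orcid)


# Usage with placeholder PIDs (assumptions, not real identifiers).
link = PublicationLink("pid:example/paper-1")
link.dataset_pids.append("pid:example/dataset-7")
link.add_author("0000-0002-1825-0097")  # ORCID's documented sample iD
```

A real SDI would resolve such identifiers through the corresponding registries; the sketch only shows the shape of the linkage.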
2.4.2 Data Life Cycle Management in Scientific Research
e-Science, enabled by computers and information technology (IT), allows
multipurpose data collection and use as well as advanced data processing.
The ability to store the initial data sets and all intermediate results
allows for future data use, in particular data repurposing and secondary
research, as the technology and scientific methods develop.
The emergence of computer-aided research methods is transforming the way
research is performed and scientific data are processed or used. This is also
reflected in the changed SDLM shown in Figure 2.3 and discussed next.
For background, we refer to the extensive study of SDLM models [34]. The
traditional scientific data life cycle includes a number of stages (see
Figure 2.3a):
• Research project or experiment planning
• Data collection
• Data integration and processing
• Research result publication
• Discussion, feedback
• Archiving (or discarding)
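The traditional life cycle listed above is a linear sequence, which can be sketched as an ordered enumeration. The names Stage and next_stage are hypothetical and serve only this illustration.

```python
from enum import Enum
from typing import Optional


class Stage(Enum):
    """Stages of the traditional scientific data life cycle, in order."""
    PLANNING = 1
    COLLECTION = 2
    INTEGRATION_AND_PROCESSING = 3
    PUBLICATION = 4
    DISCUSSION = 5
    ARCHIVING = 6


def next_stage(stage: Stage) -> Optional[Stage]:
    """Return the stage that follows `stage`, or None after archiving."""
    members = list(Stage)  # Enum members iterate in definition order
    index = members.index(stage)
    return members[index + 1] if index + 1 < len(members) else None
```

In this linear form nothing is retained between stages, which is the limitation the new SDLM model addresses.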
The new SDLM model requires data storage and preservation at all stages,
which should allow data reuse or repurposing and secondary research on
the processed data and published results. However, this is possible only if
full data identification, cross-referencing, and linkage are implemented
in the SDI. Data integrity, access control, and accountability must be
supported during the entire data life cycle. Data curation is an important
component of the discussed SDLM and must also be done in a secure and
trustworthy way.
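A minimal sketch of these requirements, assuming an in-memory store, is given below: every stage output is kept, identified by a SHA-256 digest of its stage, parent reference, and content, and linked to the artifact it was derived from. The names LifecycleStore and StoredArtifact are hypothetical, and access control and accountability are deliberately left out of the sketch.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


def _digest(stage: str, payload: bytes, parent_id: Optional[str]) -> str:
    """Derive an identifier from the stage, the parent link, and the content."""
    h = hashlib.sha256()
    h.update(stage.encode("utf-8") + b"\x00")
    h.update((parent_id or "").encode("utf-8") + b"\x00")
    h.update(payload)
    return h.hexdigest()


@dataclass(frozen=True)
class StoredArtifact:
    artifact_id: str
    stage: str
    payload: bytes
    parent_id: Optional[str]  # cross-reference to the source artifact


class LifecycleStore:
    """Hypothetical store that preserves the output of every stage."""

    def __init__(self) -> None:
        self._artifacts: dict[str, StoredArtifact] = {}

    def put(self, stage: str, payload: bytes,
            parent_id: Optional[str] = None) -> str:
        if parent_id is not None and parent_id not in self._artifacts:
            raise KeyError(f"unknown parent artifact: {parent_id}")
        artifact_id = _digest(stage, payload, parent_id)
        self._artifacts[artifact_id] = StoredArtifact(
            artifact_id, stage, payload, parent_id)
        return artifact_id

    def verify(self, artifact_id: str) -> bool:
        """Integrity check: recompute the digest and compare with the ID."""
        a = self._artifacts[artifact_id]
        return _digest(a.stage, a.payload, a.parent_id) == a.artifact_id

    def lineage(self, artifact_id: str) -> list[str]:
        """Follow parent links back to the original data, oldest first."""
        chain = []
        current: Optional[str] = artifact_id
        while current is not None:
            artifact = self._artifacts[current]
            chain.append(artifact.stage)
            current = artifact.parent_id
        return list(reversed(chain))


store = LifecycleStore()
raw_id = store.put("collection", b"1,2,3")
sum_id = store.put("processing", b"6", parent_id=raw_id)
print(store.lineage(sum_id))  # ['collection', 'processing']
```

Because the parent identifier enters the digest, a processed result is bound to the exact data it was computed from, which is what makes later repurposing and secondary research traceable.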