Summary
This chapter discusses the challenges that big data imposes on the modern and
future e-science scientific data infrastructure (SDI). It examines the nature
and definition of big data, including such characteristics as volume,
velocity, variety, value, and veracity. It draws on different scientific
communities to define requirements for data management, access control, and
security. It introduces the scientific data life cycle management (SDLM)
model, which includes all the major stages and reflects the specifics of data
management in modern e-science. It proposes a generic SDI architectural model
that provides a basis for building interoperable data-centric or
project-centric SDI using modern technologies and best practices. Finally, the
chapter discusses how the proposed SDLM and SDI models can be naturally
implemented using modern cloud-based infrastructure services, analyzes
security and trust issues in cloud-based infrastructure, and summarizes the
requirements for access control and for an access control infrastructure that
should allow secure and trusted operation and use of the SDI.
2.1 Introduction
The emergence of data-intensive science results from the computerization of
modern science and the growing range of observations and experimental data
collected from specialist scientific instruments, sensors, and simulations in
every field of science. Modern science requires broad, cross-border research
collaboration. The e-science scientific data infrastructure (SDI) needs to
provide an environment capable of both dealing with ever-increasing
heterogeneous data production and providing a trusted collaborative
environment for distributed groups of researchers and scientists. In addition,
the SDI needs to provide access to existing scientific information, including
that in libraries, journals, data sets, and specialist scientific databases,
and to link experimental data with publications.
Industry is also undergoing broad and deep technology refactoring to become
data intensive and data powered. Cross-fertilization between emerging
data-intensive, data-driven e-science and industry will bring new
data-intensive technologies that will drive new data-intensive, data-powered
applications.
Further successful technology development will require the definition of
the SDI and of an overall architecture framework for data-intensive science.
This will provide a common vocabulary and allow concise technology evaluation
and planning for specific applications and collaborative projects or groups.
Big data technologies are becoming a current focus and a new “buzzword”
both in science and in industry. Emergence of big data or data-centric
 