2.3.2 Research Communities and Specific SDI Requirements
A short overview of some research infrastructures and communities,
particularly the ones defined for the ERA [21], allows a better understanding
of the specific requirements for future SDIs capable of addressing big data
challenges. Existing studies of European e-infrastructures have analyzed
scientific communities' practices and requirements; examples include those
from the SIENA Project [23], the EIROforum Federated Identity Management
Workshop [24], the European Grid Infrastructure (EGI) Strategy Report [25],
and the UK Future Internet Strategy Group Report [26].

The high-energy physics (HEP) community is characterized by a large number of
researchers, unique and expensive instruments, and a huge amount of data that
are generated and must be processed continuously. This community already has
the operational Worldwide LHC Computing Grid (WLCG) [11] infrastructure to
manage and access data, protect their integrity, and support the whole
scientific data life cycle. WLCG development was an important step in the
evolution of European e-infrastructures, which currently serve multiple
scientific communities in Europe and internationally. The EGI cooperation [27]
manages European and worldwide infrastructure for HEP and other communities.

Material science and analytical and low-energy physics (proton, neutron,
laser facilities) are characterized by short projects and experiments and,
consequently, a highly dynamic user community. These fields need a highly
dynamic supporting infrastructure and an advanced data management
infrastructure that allow wide data access and distributed processing.

The environmental and earth science community and its projects target
regional, national, and global problems. Huge amounts of data are collected
from land, sea, air, and space and require an ever-increasing amount of
storage and computing power. An SDI for this community requires reliable
fine-grained access control to huge data sets, enforcement of regional
constraints, and policy-based data filtering (data may contain national
security-related information) while tracking data use and maintaining data
integrity.

Biological and medical sciences (also defined as life sciences) have a
general focus on health, drug development, new species identification, and new
instrument development. They generate a massive amount of data and new
demands for computing power, storage capacity, and network performance
for distributed processing, data sharing, and collaboration. Biomedical data
(health care, clinical case data) are privacy sensitive and must be handled
according to the European policy on processing of personal data [27].

Biodiversity research [17] involves research data and research specialists
from at least biology and environmental research and may include data about
climate, weather, and satellite observation. This presents challenges not
only in integrating different sources of information with different data
models and in processing a huge amount of collected information, but also in
providing fast data processing in case of natural disasters. The projects
LifeWatch [28] and ENVRI (Common Operations of Environmental Research