8.6 Parallel R for High-Performance Analytics: Applications to Biology
8.6.1 Advanced Analytics for High-Throughput Quantitative Proteomics
8.6.1.1 Parallel Processing of Core Analysis Steps in ProRata
8.6.1.2 Estimation of Peptide Abundance Ratios and Scoring of their Variability and Bias
8.6.1.3 Protein Abundance Ratio Estimation with Confidence Interval Evaluation
8.7 Summary
Acknowledgment
References
8.1 Introduction
The analysis of data is a key part of any scientific endeavor, as it leads to a
better understanding of the world around us. With scientific data now being
measured in terabytes and petabytes, this analysis is becoming quite challenging.
The complexity of the data is also increasing, driven by factors such as
improved sensor technologies and increased computing power. This complexity
can take various forms: multisensor, multispectral, and multiresolution data;
spatio-temporal data; high-dimensional data; structured and unstructured mesh
data from simulations; data contaminated with different types of noise;
three-dimensional data; and so on.
Over the last decade, techniques from machine learning, image and signal
processing, and high-performance computing have gained acceptance as
viable approaches for finding useful information in scientific data. These
techniques complement the more established approaches from statistics and
pattern recognition to provide solutions to a diverse set of problems in a
variety of application domains.
This chapter is organized as follows. First, we describe a typical data flow
diagram used in scientific data analysis, which shows how one might start with
various forms of scientific data and process them iteratively to extract useful
information. This is followed by several specific examples of how this general
process is applied to problems in domains ranging from materials science to
biology and cheminformatics. Finally, we conclude with a brief summary of
some challenges in the analysis of scientific datasets.