Scientific Data Analysis - Scientific Data Management

8.6 Parallel R for High-Performance Analytics: Applications to Biology

8.6.1 Advanced Analytics for High-Throughput Quantitative Proteomics

8.6.1.1 Parallel Processing of Core Analysis Steps in ProRata

8.6.1.2 Estimation of Peptide Abundance Ratios and Scoring of their Variability and Bias

8.6.1.3 Protein Abundance Ratio Estimation with Confidence Interval Evaluation

8.7 Summary

Acknowledgment

References

8.1 Introduction

The analysis of data is a key part of any scientific endeavor, as it leads to a better understanding of the world around us. With scientific data now being measured in terabytes and petabytes, this analysis is becoming quite challenging. In addition, the complexity of the data is increasing as well due to several factors such as improved sensor technologies and increased computing power. This complexity can take various forms such as multisensor, multispectral, multiresolution data, spatio-temporal data, high-dimensional data, structured and unstructured mesh data from simulations, data contaminated with different types of noise, three-dimensional data, and so on.

Over the last decade, techniques from machine learning, image and signal processing, and high-performance computing have gained acceptance as viable approaches for finding useful information in scientific data. These techniques complement the more established approaches from statistics and pattern recognition to provide solutions to a diverse set of problems in a variety of application domains.

This chapter is organized as follows: first, we describe a typical data flow diagram used in scientific data analysis. It shows how one might start with various forms of scientific data and process them iteratively to extract useful information. This is followed by several specific examples of how this general process is applied to problems in domains ranging from materials science to biology and cheminformatics. Finally, we conclude with a brief summary of some challenges in the analysis of scientific datasets.
