11 Squeezing big data into a small organisation
Michael A. Burrell and Daniel MacLean
Abstract: The technological complexity involved in generating and
analysing high-throughput biomedical data sets means that we need
new tools and practices to enable us to manage and analyse our
data. In this chapter we provide a case study in setting up a
bioinformatics support service using free and open source tools and
software for a small research institute of approximately 80 scientists.
As far as possible, our support service aims to empower scientists to do
their own analyses. We describe the tools and systems we have found
useful, and the problems and pitfalls we have encountered along the
way.
Key words: genomics; NGS; institute; bioinformatics; infrastructure;
pipeline; sequencing data.
11.1 Introduction
To paraphrase Douglas Adams, biological and medical data sets can be
big, really big, I mean mind-bogglingly huge, and are only getting bigger.
Recent years have seen an intimidating increase in the quantity of data
collected in experiments performed by researchers across all fields of the
medical and life sciences. As geneticists, genomicists, molecular biologists
and biomedical scientists, we have seen the nature of our work
transformed by huge advances in DNA sequencing, high-throughput
drug discovery, high content screening and new technologies for
fluorescence microscopy, but this transformation has come at a price. We