11 Squeezing big data into a small organisation
Michael A. Burrell and Daniel MacLean
Abstract: The technological complexity involved in generating and
analysing high-throughput biomedical data sets means that we need
new tools and practices to enable us to manage and analyse our
data. In this chapter we provide a case study in setting up a
bioinformatics support service using free and open source tools and
software for a small research institute of approximately 80 scientists.
As far as possible, our support service aims to empower scientists to do
their own analyses. We describe the tools and systems we have found
useful, and the problems and pitfalls we have encountered along the
way.
Key words: genomics; NGS; institute; bioinformatics; infrastructure;
pipeline; sequencing data.
11.1 Introduction
To paraphrase Douglas Adams, biological and medical data sets can be
big, really big, I mean mind-bogglingly huge, and are only getting bigger.
Recent years have seen an intimidating increase in the quantity of data
collected in experiments performed by researchers across all fields of the
medical and life sciences. As geneticists, genomicists, molecular biologists
and biomedical scientists, we have seen the nature of our work
transformed by huge advances in DNA sequencing, high-throughput
drug discovery, high content screening and new technologies for
fluorescence microscopy, but this transformation has come at a price. We