Biomedical Engineering Reference
In-Depth Information
experience. As a more concrete yardstick for the quantities being produced, it
is interesting to note that the United States Library of Congress aims to
keep a copy of each written work in all languages. It holds about 15 TB
of data in this form. In comparison, a single DNA sequencer such as the
Illumina HiSeq 2000 can easily generate 55 000 000 000 nucleotides (55
gigabases) [2] of sequence in one day and would take only ten months to
exceed the total written output in the Library of Congress, essentially the
writings of the whole of humanity since the dawn of time. Given that
there are many thousands of DNA sequencers and other high-throughput
machines in active use across the world, the word 'deluge' does not
even begin to describe the current data situation.
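The comparison above can be checked with a rough calculation. The sketch below is only a back-of-the-envelope estimate: it assumes one byte of storage per nucleotide, a binary terabyte (2**40 bytes) and an average month of 30.44 days, none of which is stated in the text, and the function name is purely illustrative.

```python
# Rough check of the sequencer versus Library of Congress comparison.
# Assumptions (not from the text): 1 byte per nucleotide,
# 1 TB = 2**40 bytes, and an average month of 30.44 days.

LIBRARY_BYTES = 15 * 2**40        # about 15 TB of written works
BASES_PER_DAY = 55_000_000_000    # 55 gigabases per day from one sequencer
BYTES_PER_BASE = 1                # assumed storage cost of one nucleotide
DAYS_PER_MONTH = 30.44            # assumed average month length


def days_to_exceed(target_bytes, bases_per_day, bytes_per_base=1):
    """Return the number of days of sequencing needed to reach target_bytes."""
    return target_bytes / (bases_per_day * bytes_per_base)


days = days_to_exceed(LIBRARY_BYTES, BASES_PER_DAY, BYTES_PER_BASE)
months = days / DAYS_PER_MONTH
print(f"{days:.0f} days, about {months:.1f} months")
```

Under these assumptions the estimate comes out at roughly 300 days, which is consistent with the figure of about ten months quoted above. A more compact encoding (for example two bits per base) or the inclusion of per-base quality scores would shift the answer considerably.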
The human and monetary resources that bioinformatics departments
and core services can call on to deal with the data are being stretched ever
thinner as the number of projects that make use of these technologies in
their departments increases. To add to this difficulty, even if budgets
could be stretched to employ more people, there is currently a shortfall in
the number of qualified bioinformaticians in the job market who can
carry out the required analyses. This situation is unlikely to change
soon, so service providers and research groups must
find a way to prioritise the many challenges they face and apply
the resources that they have to 'work smarter'. The wide range and
flexibility of free and open source tools available now provide a great
opportunity for us to create environments and pipelines with which we
can tackle the data deluge. In this case study we shall describe how our
small core bioinformatics service has implemented a service model that is
more scalable and adaptable than earlier schemes by making
use of free and open source software.
11.2 Our service and its goals
The Sainsbury Laboratory (TSL) [3] is a part publicly, part privately
funded research institution that concentrates on cutting-edge basic and
translational research into plant and pathogen interactions. Conceived
by Lord Sainsbury, the former UK Science Minister, and funded by The
Gatsby Charitable Foundation [4], TSL is a focussed laboratory of about
80 scientists in five research groups. The bioinformatics group was
created in 2003 and had expanded to its current size of two full-time
members by 2006, when we took delivery of our first Illumina Genome
Analyzer (GA) Next Generation DNA sequencing (NGS) machine, since
upgraded to GAII and we have expanded with multiple mass spectrometers
 