Biomedical Engineering Reference
may connect from their local machines, usually Macintosh computers
and a small number of Windows machines. Double-clicking a simple shell
script mounts all the devices that are needed, including the NetApp
storage device holding our data. This trick works on Windows 7 but,
because of the age of the OS, not on our Windows XP machines, on which
WinSCP [24] is used instead. Users can interact with all data as if it
were on the local machine, and they have access to the high-performance
compute cluster through the automatically connected terminal. Jobs
requiring the cluster are submitted and managed with Platform's LSF, a
proprietary system. Within this framework
we allow all users full freedom in a public folder, into which they may
install and compile whatever software they need. This works well to
allow them to try out any solutions they like. Of course, this sort of setup
requires that the user has experience with the command-line interface for
the majority of analysis software, and, although biologists can be brought
quite quickly up to speed with this style of working, in most cases there
are now workflow and tool integration pipelines that take away much of
the pain a brand new user may feel.
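
To give a flavour of the command-line work described above, the following is a minimal sketch of how a cluster job submission might be assembled for LSF's `bsub` command. It is an illustration only, not a description of our own scripts: the helper name `build_bsub_command`, the queue name `normal`, the job name and the example BLAST command are all hypothetical. The sketch only builds and prints the command line, so it can be tried on a machine without LSF installed.

```python
import shlex


def build_bsub_command(command, job_name, queue, cores=1, log_path="job.%J.log"):
    """Return the argument list for an LSF `bsub` submission.

    Hypothetical helper for illustration; queue names and sensible core
    counts depend entirely on how the local cluster is configured.
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return [
        "bsub",
        "-J", job_name,    # name shown in the job listing
        "-q", queue,       # queue to submit to (site-specific)
        "-n", str(cores),  # number of job slots requested
        "-o", log_path,    # output file; LSF expands %J to the job ID
        *shlex.split(command),  # the program to run, with its arguments
    ]


# Example: a hypothetical 4-core BLAST job on an assumed "normal" queue.
argv = build_bsub_command(
    "blastn -query reads.fa -db nt",
    job_name="blast_test",
    queue="normal",
    cores=4,
)

# Print the command instead of running it; on a cluster login node the
# list could be passed to subprocess.run(argv, check=True).
print(shlex.join(argv))
```

Building the command as a list rather than a single string avoids quoting problems when file names contain spaces.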
The single most useful piece of software we have installed to this end
is Galaxy. Galaxy is a workflow-engineering environment in which a user
may easily combine data sets and command-line bioinformatics tools in
a graphical user interface to create and save analyses. The environment is
very powerful and intuitive to a biologist familiar with the point-and-
click paradigm of computer interfaces and they can immediately get
down to the work at hand. Other such environments exist, but in our
experience Galaxy is the simplest for the user and requires the least intervention
from us. A further advantage is the extensive training videos that the
Galaxy Team provide to teach users (not administrators) how to carry
out common analyses. An awful lot of software comes bundled with
Galaxy and we have yet to come across a command-line tool that cannot
be integrated. However, the bias in the software provided and in the
themes of the training videos is very much towards the analysis of next-
generation sequencing data.
Administering a Galaxy installation can be quite straightforward, but
we found there are a few points worthy of comment. The community is
large and a lot of problems will have been encountered and solved before.
When a newcomer tests Galaxy in a local install, it will typically be
installed on a single machine, where it will run fine and probably use
SQLite [25] for its job database. The most useful lesson we learned in
moving to a production environment was to make sure we switched to a
PostgreSQL [26] database. SQLite was able to cope for a few weeks