Biomedical Engineering Reference
In-Depth Information
functional significance of cis-regulatory regions identified by localized
measurements of DNA binding events across an entire genome. The
UCSC Genome Browser and IGB are web-based visualization platforms
that incorporate data from several public databases.
Although these tools for genome analysis are sophisticated, they are not
designed to efficiently process multiple large data sets, such as those
obtained from high-throughput sequencing, and they suffer from poor
memory management, as essentially all data needs to be loaded into
memory and/or sent over the network.
In an attempt to address some of these issues, Quinlan and Hall have
developed the BEDTools suite [19]. We predict that the BEDTools
initiative will spur competition among a new set of tools focused on
processing genomic data as streams. Such tools will provide the means
to efficiently handle large genomic data sets, and thus a computational
platform that facilitates the development of bioinformatics
applications. They may then be integrated with or incorporated into
other bioinformatics tools and environments, including those mentioned
above.
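As a rough illustration of what processing genomic data as streams can mean in practice, the following Python sketch reports overlaps between two interval files while holding only a small window of intervals in memory. It is not code from BEDTools or GenomicTools; the function names `parse_bed` and `stream_overlaps` are hypothetical. It assumes that both inputs are sorted by chromosome name (lexicographically) and then by start position, and that coordinates are half-open, [start, end).

```python
from typing import Iterable, Iterator, List, Tuple

# (chromosome, start, end), half-open coordinates: [start, end)
Interval = Tuple[str, int, int]


def parse_bed(lines: Iterable[str]) -> Iterator[Interval]:
    """Lazily yield intervals from tab-separated BED-like lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        chrom, start, end = line.split("\t")[:3]
        yield chrom, int(start), int(end)


def stream_overlaps(
    queries: Iterable[Interval], targets: Iterable[Interval]
) -> Iterator[Tuple[Interval, Interval]]:
    """Yield (query, target) pairs that overlap.

    Assumption: both inputs are sorted by (chromosome, start).
    Only targets that can still overlap a query are kept in memory.
    """
    target_iter = iter(targets)
    pending = next(target_iter, None)  # next target not yet in the window
    window: List[Interval] = []        # targets that may still overlap

    for q_chrom, q_start, q_end in queries:
        # Pull in every target that starts before this query ends.
        while pending is not None and (pending[0], pending[1]) < (q_chrom, q_end):
            window.append(pending)
            pending = next(target_iter, None)

        # Discard targets that end before this query starts or that lie
        # on an earlier chromosome; later queries cannot overlap them.
        window = [t for t in window if t[0] == q_chrom and t[2] > q_start]

        for t in window:
            if t[1] < q_end:  # target starts before the query ends
                yield (q_chrom, q_start, q_end), t
```

Because both functions are generators, they can be chained on open file handles, for example `stream_overlaps(parse_bed(open(a)), parse_bed(open(b)))`, so that neither file is read into memory in full. The window grows only with the number of target intervals spanning the current position.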
The motivation behind GenomicTools (first presented in [20] as an
'applications note') was to create a computational platform for developing
customized analytics for genomic data sets with minimal memory and
intermediate file requirements, in order to address the bottleneck caused
by the increasing influx of genome-wide data sets. GenomicTools is
available both as command-line tools for building applications in a
UNIX-like environment and as documented C++ classes for further
development. The open source aspect of GenomicTools is important, as it
allows users to easily incorporate their own analysis methods with the
published tools and to modify the tools to suit their specific data analysis
needs. Although similar in motivation to BEDTools, GenomicTools is in
several respects more general, and it addresses several issues that
BEDTools does not adequately address. We summarize the novelty of
GenomicTools below.
• Novel operations: in GenomicTools the focus is not simply on overlap
computations, as in BEDTools. GenomicTools is designed to perform a
variety of simple mathematical operations on sets of genomic intervals
(as a pre-processing step), after which a variety of more complex
operations can be performed, such as overlap, offset or scanning
computations (Figure 8.2). Together these form a superset of the
operations offered by BEDTools.
• Relaxed data set restrictions: GenomicTools allows several of its
operations to operate on sets of genomic regions rather than sets of
 