Biomedical Engineering Reference
In-Depth Information
Memory evaluation of the overlap operation presented
in Figure 8.10. Memory requirements for the IRanges
package of Bioconductor increase with input size, which
makes it impossible to handle big data sets (in this
particular hardware setup of 12 GB memory we could
only compute overlaps for up to 32 million reads).
BEDTools uses a fixed amount of memory, which
depends on the size of the reference input set (i.e. exons
and repeat elements). GenomicTools uses no significant
amount of memory, as all input files are read
sequentially, but, of course, it has to rely on sorted inputs.
Figure 8.10
and is therefore well suited for large-scale bioinformatics projects.
Additionally, GenomicTools makes virtually no use of memory, unlike
IRanges and BEDTools, both of which use a significant amount of memory
(Figure 8.10), thus limiting the number of such processes that can
run simultaneously on the same system.
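To illustrate why reading sorted inputs sequentially keeps memory use low, the following is a minimal Python sketch of a streaming overlap between two sorted interval lists. It is not the GenomicTools implementation; the function name stream_overlaps, the half-open (start, end) tuples, and the single-chromosome simplification are all assumptions made for illustration. Only the reference intervals near the current position are held in memory, so usage depends on local interval density rather than on total input size.

```python
from collections import deque
from typing import Iterable, Iterator, Tuple

Interval = Tuple[int, int]  # half-open [start, end); hypothetical representation


def stream_overlaps(
    queries: Iterable[Interval], refs: Iterable[Interval]
) -> Iterator[Tuple[Interval, Interval]]:
    """Yield (query, ref) pairs that overlap.

    Assumes both inputs are sorted by start coordinate and lie on the
    same chromosome. Each input is consumed exactly once, front to back.
    """
    ref_iter = iter(refs)
    next_ref = next(ref_iter, None)
    window: deque = deque()  # reference intervals that may still overlap

    for q_start, q_end in queries:
        # Pull in every reference interval that starts before this query ends.
        while next_ref is not None and next_ref[0] < q_end:
            window.append(next_ref)
            next_ref = next(ref_iter, None)

        # Drop references that end at or before this query's start: since
        # queries are sorted by start, they cannot overlap any later query.
        window = deque(r for r in window if r[1] > q_start)

        for r in window:
            # An earlier, longer query may have pulled in references that
            # start beyond the end of the current one, so check explicitly.
            if r[0] < q_end:
                yield (q_start, q_end), r


if __name__ == "__main__":
    reads = [(1, 5), (3, 4), (10, 12)]
    exons = [(0, 2), (4, 6), (11, 15)]
    for pair in stream_overlaps(reads, exons):
        print(pair)
```

In practice the two iterables would be generators that parse sorted files line by line, so neither input is ever loaded whole. The price is the one noted in the caption of Figure 8.10: the approach is only correct if the inputs are already sorted.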
8.7 Conclusion
It is becoming increasingly apparent that humanity in general, and science
in particular, can greatly benefit from properly applying the principles of
openness, transparency, and sharing of information. The free open source
initiatives of recent years have led to interesting new phenomena, such as
crowd-sourcing, which bring together entire communities to everyone's
benefit in a kind of collaborative competition. This notion of
 