                                           0.00e+00   12
translation                12   6.92e-03   0.00e+00   12
mitosis                    13   6.92e-03   0.00e+00   13
membrane_lipid_metabolism  15   6.92e-03   1.38e-05   14
lymphocyte_activation      26   6.92e-03   1.45e-05   22
ubiquitin_cycle            13   2.27e-02   6.19e-05   12
protein_catabolism         13   2.27e-02   6.19e-05   12
T_cell_activation          20   2.27e-02   8.46e-05   17
The tool ranks the gene categories by adjusted p-value (i.e. q-value), from
the most to the least significant. In this particular scenario, where the
input values correspond to binding sites in promoters as identified by our
ChIP-seq analysis, the results of the enrichment analysis indicate that the
assayed protein binds to genes that play a role in lymphocyte activation.
This kind of analysis can be very powerful for generating novel biological
hypotheses. Suppose that the assayed protein had never been shown to
participate in lymphocyte activation. Then, based on the evidence produced
by the enrichment analysis above, biologists can design further experiments
to confirm (or reject) this hypothesis.
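
The ranking step described above can be sketched in a few lines of Python. The text does not say which multiple-testing correction the tool applies, so this sketch assumes the Benjamini-Hochberg procedure purely for illustration. The function names and the example categories and p-values are hypothetical and are not taken from the tool or from the table above.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values), in input order.

    Assumption: BH is used here only as a common example of a p-value
    adjustment; the tool described in the text may use a different method.
    """
    m = len(p_values)
    # Indices of the p-values, from the smallest to the largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


def rank_categories(p_by_category):
    """Return (category, q-value) pairs, most significant first."""
    names = list(p_by_category)
    q_values = benjamini_hochberg([p_by_category[name] for name in names])
    return sorted(zip(names, q_values), key=lambda pair: pair[1])


if __name__ == "__main__":
    # Hypothetical raw p-values for four hypothetical gene categories.
    example = {
        "category_a": 0.01,
        "category_b": 0.04,
        "category_c": 0.03,
        "category_d": 0.005,
    }
    for name, q in rank_categories(example):
        print(f"{name}\t{q:.3g}")
```

Categories with tied q-values keep their input order, because Python's sort is stable.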
8.6 Performance
Typically, during the course of computational analyses of sequencing
data, computational biologists experiment with different pre-processing
and discovery algorithms. GenomicTools is designed to take advantage
of sorted inputs (the sort order is chromosome → strand → start position)
to create efficient pipelines that can handle numerous operations on
multiple data sets repeated several times under different parameters. In
the GenomicTools platform, the original data sets (e.g. the mapped reads
in BED format) need to be sorted only once at the beginning of the
project. As we show below, sorted inputs can lead to dramatic
improvements in performance.
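
To illustrate why sorted inputs help, the following Python sketch sorts two interval sets once by chromosome, strand and start position, and then finds overlapping pairs in a single forward pass instead of comparing every interval against every other. It is a minimal illustration under stated assumptions, not the GenomicTools implementation: the `Interval` type, the function names, the half-open coordinates and the strand-specific matching are all choices made for this example.

```python
from typing import List, NamedTuple, Tuple


class Interval(NamedTuple):
    chrom: str
    strand: str
    start: int
    end: int  # assumed half-open [start, end), as in BED


def sort_key(iv: Interval) -> Tuple[str, str, int]:
    """Sort order from the text: chromosome, then strand, then start."""
    return (iv.chrom, iv.strand, iv.start)


def overlaps_sorted(a: List[Interval], b: List[Interval]) -> List[Tuple[Interval, Interval]]:
    """Return overlapping (a, b) pairs; both lists must be sorted by sort_key.

    Assumption: only intervals on the same chromosome and strand are matched.
    """
    pairs = []
    lo = 0  # first interval in b that may still overlap a current or later a
    for x in a:
        group = (x.chrom, x.strand)
        # Skip b-intervals that end before x starts or belong to an earlier
        # group; because a is sorted, they cannot overlap any later x either.
        while lo < len(b) and (
            (b[lo].chrom, b[lo].strand) < group
            or ((b[lo].chrom, b[lo].strand) == group and b[lo].end <= x.start)
        ):
            lo += 1
        # Scan forward only while b-intervals start before x ends.
        j = lo
        while j < len(b) and (b[j].chrom, b[j].strand) == group and b[j].start < x.end:
            if b[j].end > x.start:
                pairs.append((x, b[j]))
            j += 1
    return pairs


if __name__ == "__main__":
    # Hypothetical reads and regions; each set is sorted once up front.
    reads = sorted(
        [Interval("chr1", "+", 100, 150), Interval("chr1", "+", 10, 50),
         Interval("chr2", "-", 5, 20)],
        key=sort_key,
    )
    regions = sorted(
        [Interval("chr1", "+", 40, 120), Interval("chr2", "-", 0, 5)],
        key=sort_key,
    )
    for read, region in overlaps_sorted(reads, regions):
        print(read, region)
```

Once a data set has been sorted, it can be reused across many such passes with different parameters, which is the situation the paragraph above describes.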
We evaluated the time and memory usage of GenomicTools and
compared its performance to (1) the IRanges Bioconductor package [14];
and (2) the BEDTools suite [19]. The evaluation was run on a RHEL5/x86-64
platform with 12 GB of memory on the IBM Research Cloud.