Figure 8.8
Example of window-based peaks in BED format
8.5.5 Identifying enriched Gene Ontology terms
In a given biological context, for example a tissue type or disease
state, certain proteins (transcription factors or modified histones)
tend to bind to genes of specific functional categories. Gene enrichment
analysis can identify these categories. In the GenomicTools
platform, enrichment analysis is performed using the permutation_test
tool, which performs row permutations of inputs comprising
measurements and annotations (see below). Using the ChIP-seq
peaks computed above, we first calculate their densities across gene
TSS regions (flanked by 10 kb) using the 'density' operation of the
genomic_overlaps tool:
$ cat peaks.bed | genomic_overlaps density -v -i TSS.10kb.bed > tss.val
$ head tss.val
ENSMUSG00000090025:ENSMUST00000160944 0.0000e+00
ENSMUSG00000064842:ENSMUST00000082908 0.0000e+00
ENSMUSG00000051951:ENSMUST00000159265 0.0000e+00
...
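As a quick sanity check on this output, one might count how many TSS regions received a non-zero value before moving on. The sketch below uses only standard awk. The function name summarize_density is a hypothetical helper and not part of GenomicTools, and the code assumes each row is a label followed by a whitespace-separated numeric value, as in the listing above.

```shell
# summarize_density: hypothetical helper, NOT part of GenomicTools.
# Assumes each input row looks like "<gene>:<transcript> <value>",
# i.e. the layout shown by `head tss.val`.
summarize_density() {
    awk '
        NF >= 2 {
            total++                       # rows carrying a label and a value
            if ($NF + 0 > 0) nonzero++    # "+ 0" forces numeric comparison
        }
        END {
            printf "%d of %d regions have non-zero density\n", nonzero, total
        }
    ' "$@"
}

# Run on tss.val only if it exists in the current directory.
if [ -f tss.val ]; then
    summarize_density tss.val
fi
```

With no file argument the function reads standard input, so it can also be placed at the end of a pipeline.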
Then, suppose we have a file containing gene annotations in a
TAB-separated format where the first column is a gene id and the
second column is a SPACE-separated list of annotations for the
corresponding gene. The file must be sorted by gene id. For our example,
we will use 'gene.go', which contains annotations from the Gene
Ontology [29]:
 