Biomedical Engineering Reference
In-Depth Information
merge-sort algorithm adapted to handle intervals. An intermediate buffer
keeps all the overlaps of index regions with the current query, as they
may also overlap with the next query.
INPUTS: query intervals Q and index intervals I.
INPUTS: query intervals Q and index intervals I.
PSEUDOCODE:
# Q and I are read sequentially as input streams
1.
PSEUDOCODE:
# Q and I are read sequentially as input streams
1.
q = next (Q)
q = next (Q)
# read fi rst query
# read fi rst query
2.
2.
i = next (I)
i = next (I)
# read fi rst interval
# read fi rst interval
3.
3.
B = { }
B = { }
# buffer for local overlaps
# buffer for local overlaps
4.
4.
while (q) {
while (q) {
5.
5.
if (q < i) q = next(Q)
if (q < i) q = next(Q)
# read next query
# read next query
6.
6.
else if (q > i) i = next (BUI)
else if (q > i) i = next (BUI)
# read next interval
# read next interval
7.
7.
else {
else {
8.
8.
B = { }
B = { }
9.
9.
while (overlap(q,i)) {
while (overlap(q,i)) {
10.
10.
print q,i
print q,i
# print overlaps
# print overlaps
11.
11.
B = BUI
B = BUI
# store interval in buffer
# store interval in buffer
12. i = next (I)
13. }
14. }
15. }
12. i = next (I)
13. }
14. }
15. }
UnsortedGenomicRegionSetOverlaps is used on unsorted region sets.
The algorithm used here is a modifi cation of the algorithm proposed by
Kent et al . [17], where we allow the number of levels and the number of
bins per level to be chosen arbitrarily.
The example below demonstrates the use of both derived classes (this
is taken from the source code fi le 'genomic_overlaps.cpp').
￿ ￿ ￿ ￿ ￿
# include 'core.h'
# include 'genomic_intervals.h'
# include 'core.h'
# include 'genomic_intervals.h'
// open region sets
char *REF_REG_FILE = 'exons.bed';
char *TEST_REG_FILE = 'rnaseq.reads.bed';
GenomicRegionSet RefRegSet(REF_REG_FILE,BUFFER_SIZE,VERBOSE,true);
GenomicRegionSet TestRegSet(TEST_REG_FILE,BUFFER_SIZE,VERBOSE,false);
// open region sets
char *REF_REG_FILE = 'exons.bed';
char *TEST_REG_FILE = 'rnaseq.reads.bed';
GenomicRegionSet RefRegSet(REF_REG_FILE,BUFFER_SIZE,VERBOSE,true);
GenomicRegionSet TestRegSet(TEST_REG_FILE,BUFFER_SIZE,VERBOSE,false);
// process overlaps
GenomicRegionSetOverlaps *overlaps;
if (IS_SORTED) overlaps = new SortedGenomicRegionSetOverlaps(&TestRegSet,
&RefRegSet,false);
else overlaps = new UnsortedGenomicRegionSetOverlaps(&TestRegSet,&RefReg
Set);
unsigned long int *coverage;
// process overlaps
GenomicRegionSetOverlaps *overlaps;
if (IS_SORTED) overlaps = new SortedGenomicRegionSetOverlaps(&TestRegSet,
&RefRegSet,false);
else overlaps = new UnsortedGenomicRegionSetOverlaps(&TestRegSet,&RefReg
Set);
unsigned long int *coverage;
 
Search WWH ::




Custom Search