Biomedical Engineering Reference
of this implementation of sliding window computations is that it is done
sequentially, without the need to store the entire set of input intervals in
memory. A practical use of this class is to develop customized window-based
peak discovery algorithms. As shown in the example below, this
class can be used to count the reads from the signal and control
region sets in sliding windows along the entire genome.
#include "core.h"
#include "genomic_intervals.h"
// parameters
long int w = 500; // window size
long int d = 25; // window distance
bool verbose = true;
bool load_in_memory = false;
// initialize: create input region sets and associated scanners
map<string,long int> *bounds = ReadBounds(genome_file);
GenomicRegionSet signalReg(signal_file,10000,verbose,load_in_memory);
GenomicRegionSet controlReg(control_file,10000,verbose,load_in_memory);
GenomicRegionSetScanner signal_scanner(&signalReg,bounds,d,w,false,false,'c');
GenomicRegionSetScanner control_scanner(&controlReg,bounds,d,w,false,false,'c');
// run: compute read counts in sliding windows for both files
while (true) {
  long int n = signal_scanner.Next();
  long int m = control_scanner.Next();
  if (n==-1) break;
  // ADD your statistical test HERE
}
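The placeholder for the statistical test in the loop can be filled in many ways. As one possible sketch, the code below flags a window as enriched when the signal count n is unexpectedly high under a Poisson model whose mean is derived from the control count m. The helper names PoissonUpperTail and IsEnrichedWindow, the scale factor, the pseudocount of 1 and the threshold alpha are all assumptions made for illustration; none of them is part of the library described here.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not part of the library): upper-tail probability
// P(X >= n) for X ~ Poisson(lambda). Direct summation is adequate for the
// small counts typical of short windows; it loses precision for very small
// tail probabilities and underflows for very large lambda.
double PoissonUpperTail(long int n, double lambda) {
  assert(lambda > 0.0);
  if (n <= 0) return 1.0;
  double term = std::exp(-lambda);  // P(X = 0)
  double cdf = term;                // running sum of P(X = k), k = 0..n-1
  for (long int k = 1; k < n; ++k) {
    term *= lambda / static_cast<double>(k);
    cdf += term;
  }
  double p = 1.0 - cdf;
  return p < 0.0 ? 0.0 : p;  // guard against rounding below zero
}

// Hypothetical window test: is the signal count n enriched relative to the
// control count m? 'scale' adjusts for different library sizes (assumed to
// be supplied by the caller); the pseudocount keeps lambda positive when
// the control window is empty.
bool IsEnrichedWindow(long int n, long int m, double scale, double alpha) {
  double lambda = scale * (static_cast<double>(m) + 1.0);
  return PoissonUpperTail(n, lambda) < alpha;
}
```

Inside the loop, the placeholder comment could then become a call such as IsEnrichedWindow(n, m, 1.0, 1e-5), placed after the check for n==-1 so that the end-of-scan value is never tested.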
8.4.4 The GenomicRegionSetOverlaps class and its extensions
This is an abstract class used for determining and manipulating
overlaps between two region sets. It is extended into two classes.
SortedGenomicRegionSetOverlaps is used on sorted region sets. The sort
order is first by chromosome, then (optionally) by strand, and finally by
start position. The algorithm used to compute overlaps in this class is a
generalization of the standard merge-sort algorithm, modified so as to
handle intervals. As before, the main advantage of this implementation is
that processing is done sequentially, without the need to store the entire
set of input intervals in memory. The algorithm operates on sorted inputs,
scans the files sequentially and computes all overlaps essentially using a
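The idea of a merge-style scan over sorted intervals can be sketched as follows. This is a minimal illustration, not the library's implementation: the Interval struct and the SortedOverlaps function are hypothetical names, strand is ignored, coordinates are assumed inclusive, and both inputs are held in vectors for brevity, whereas the class described above reads its inputs sequentially from files.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical interval type for illustration; not the library's own.
struct Interval {
  std::string chrom;
  long int start;  // inclusive
  long int stop;   // inclusive
};

// Returns index pairs (i, j) such that a[i] overlaps b[j].
// Both inputs must be sorted by chromosome, then by start position.
std::vector<std::pair<std::size_t, std::size_t>> SortedOverlaps(
    const std::vector<Interval>& a, const std::vector<Interval>& b) {
  std::vector<std::pair<std::size_t, std::size_t>> out;
  std::size_t j0 = 0;  // b intervals before j0 cannot overlap a[i] or later
  for (std::size_t i = 0; i < a.size(); ++i) {
    // Permanently skip b intervals that lie entirely before a[i]; since a
    // is sorted, they cannot overlap any later interval of a either.
    while (j0 < b.size() &&
           (b[j0].chrom < a[i].chrom ||
            (b[j0].chrom == a[i].chrom && b[j0].stop < a[i].start))) {
      ++j0;
    }
    // Scan forward while b intervals can still start inside a[i].
    for (std::size_t j = j0; j < b.size(); ++j) {
      if (b[j].chrom != a[i].chrom || b[j].start > a[i].stop) break;
      if (b[j].stop >= a[i].start) out.emplace_back(i, j);
    }
  }
  return out;
}
```

Because the skip pointer j0 only moves forward, intervals that have fallen behind are never revisited, which is what makes sequential, low-memory processing possible in this kind of scan.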
 