Biomedical Engineering Reference
In-Depth Information
Since every of the before mentioned conditional distributions needed for randomly
drawing one of the respective possible choices can be derived in linear time (during
the sampling process), any valid 3
O
( n ) . Thus, since
base pair can be sampled in time
n− min HL
2
any structure of size n can have at most
base pairs, a random candidate
structure for the given input sequence can be generated in
( n 2 ) time.
Thus, one straightforward approach for improving the performance of the overall
sampling algorithm in the worst-case is to reduce the
O
( n 3 ) time complexity required
for the preprocessing step at least to the quadratic time of the sampling strategy. To
us, this means we might be able to save a significant amount of time by replacing the
exact inside-outside calculations with a corresponding heuristic method yielding only
approximative inside-outside values for a given input sequence. To see if this might ac-
tually be successful, we next want to determine to which extend the sampling strategy
reacts to different types and degrees of disturbances in the inside and outside proba-
bilities in order the get evidence if it could actually be possible to find an appropriate
heuristic.
O
2.2
Considered Disturbance Types and Levels
We decided to disturb the exact inside and outside probabilities for a given input se-
quence r of length n in the following ways: For each X
∈I G s
and 1
i,j
n ,
redefine the corresponding inside value according to
α X ( i,j ) := max(min( α X ( i,j )+ α err , 1) , 0) ,
(3)
where α err is randomly chosen from the following interval or set:
[
max ErrPerc α A ( i,j ) , +max ErrPerc α A ( i,j )] or
{−
fix ErrPerc α A ( i,j ) , +fix ErrPerc α A ( i,j )
}
(relative errors), with max ErrPerc , fix ErrPerc
(0 , 1] defining percentages, or else,
[
max ErrVal , +max ErrVal ] or
{−
fix ErrVal , +fix ErrVal }
(absolute errors), with max ErrVal , fix ErrVal
(0 , 1] being fixed values. Random errors
on all outside values β X ( i,j ) , X
∈I G s
and 1
i,j
n , can be generated in the same
way.
The needed conditional sampling distributions (as considered by a particular strat-
egy) are then derived from the exact grammar parameters and the disturbed inside-
outside probabilities for the input sequence. This might create the need to (slightly)
modify a particularly employed sampling strategy for being capable of dealing with
these skewed distributions, as we will see in Sect. 4.1.
3
One may for example consider only the 6 different most stable canonical pairs as valid ones
(like usually done in physics-based approaches due to missing thermodynamics parameters for
non-canonical pairs). However, we decided to drop this restriction, considering all possible
non-crossing base pairings to be valid.
 
Search WWH ::




Custom Search