Biomedical Engineering Reference
In-Depth Information
: h kj _: s ki ^ : h kj _: s ki :
(7.1)
If a site g ij is 1, then the model requires, for k D 1;:::;r,
h kj _: s ki ^ h kj _: s ki :
(7.2)
Otherwise, one requires that the haplotypes explaining genotype g i
have opposite
values at site i . This is done by creating two variables, g ij
2f 0; 1 g and g ij
2f 0; 1 g ,
such that g ij
¤ g ij . In CNF, the model requires two clauses,
g ij
_ g ij
_: g ij :
: g ij
^
(7.3)
In addition, the model requires, for k D 1;:::;r,
h kj _: g ij
_: s ki ^ : h kj _ g ij
_: s ki
^ h kj _: g ij
_: s ki ^ : h kj _ g ij
_: s ki :
(7.4)
Clearly, for each value of i ,andfora and b, it is necessary that exactly one haplo-
type is used, and so exactly one selector variable be assigned value 1. This can be
captured with cardinality constraints,
r X
!
r X
!
s ki D 1
s ki D 1
^
:
(7.5)
k D 1
k D 1
The SHIPs model can also be described using a matrix formulation G D S a
H ˚ S b H ,whereG is a n m matrix describing the genotypes, H is a r m
matrix of haplotype variables, S a and S b are the n r matrices of selector variables
and ˚ is the explanation operation.
Theorem 7.1. (Space Complexity) If a solution is found with r f
haplotypes, then
the number of constraints of the SAT model is
O .r f nm / ,whichis
O .n 2 m/ .In
O .n 2 C nm / .
addition, the number of variables is
O . nm C r f m C r f n/ ,whichis
The core SHIPs model is not effective in practice. As a result, several key
improvements have been developed, which are essential for obtaining significant
performance gains over existing approaches.
A crucial technique is the utilization of a tight lower bound estimate, which
reduces the number of iterations of the algorithm, but also effectively prunes the
search space. The computation of lower bounds is discussed in Sect. 7.5.2 .
One additional key technique for pruning the search space is motivated by
observing the existence of symmetry in the problem formulation. Consider two
haplotypes h k 1 and h k 2 , and the selector variables s k 1 i
, s k 2 i
, s k 1 i
and s k 2 i
.Further-
more, consider Boolean valuations v x and v y to the sites of haplotypes h k 1 and h k 2 .
Then, h v x
k 1
and h v y
k 2
1001, corresponds to h v y
k 1
and h v x
k 2
, with s k 1 i s k 2 i s k 1 i s k 2 i
D
,
Search WWH ::




Custom Search