Biomedical Engineering Reference
In-Depth Information
:
h
kj
_:
s
ki
^
:
h
kj
_:
s
ki
:
(7.1)
If a site g
ij
is 1, then the model requires, for k
D
1;:::;r,
h
kj
_:
s
ki
^
h
kj
_:
s
ki
:
(7.2)
Otherwise, one requires that the haplotypes explaining genotype g
i
have opposite
values at site i . This is done by creating two variables, g
ij
2f
0; 1
g
and g
ij
2f
0; 1
g
,
such that g
ij
¤
g
ij
. In CNF, the model requires two clauses,
g
ij
_
g
ij
_:
g
ij
:
:
g
ij
^
(7.3)
In addition, the model requires, for k
D
1;:::;r,
h
kj
_:
g
ij
_:
s
ki
^
:
h
kj
_
g
ij
_:
s
ki
^
h
kj
_:
g
ij
_:
s
ki
^
:
h
kj
_
g
ij
_:
s
ki
:
(7.4)
Clearly, for each value of i ,andfora and b, it is necessary that exactly one haplo-
type is used, and so exactly one selector variable be assigned value 1. This can be
captured with cardinality constraints,
r
X
!
r
X
!
s
ki
D
1
s
ki
D
1
^
:
(7.5)
k
D
1
k
D
1
The SHIPs model can also be described using a matrix formulation G
D
S
a
H
˚
S
b
H ,whereG is a n
m matrix describing the genotypes, H is a r
m
matrix of haplotype variables, S
a
and S
b
are the n
r matrices of selector variables
and
˚
is the explanation operation.
Theorem 7.1.
(Space Complexity) If a solution is found with
r
f
haplotypes, then
the number of constraints of the SAT model is
O
.r
f
nm
/
,whichis
O
.n
2
m/
.In
O
.n
2
C
nm
/
.
addition, the number of variables is
O
.
nm
C
r
f
m
C
r
f
n/
,whichis
The core SHIPs model is not effective in practice. As a result, several key
improvements have been developed, which are essential for obtaining significant
performance gains over existing approaches.
A crucial technique is the utilization of a tight lower bound estimate, which
reduces the number of iterations of the algorithm, but also effectively prunes the
search space. The computation of lower bounds is discussed in Sect.
7.5.2
.
One additional key technique for pruning the search space is motivated by
observing the existence of symmetry in the problem formulation. Consider two
haplotypes h
k
1
and h
k
2
, and the selector variables s
k
1
i
, s
k
2
i
, s
k
1
i
and s
k
2
i
.Further-
more, consider Boolean valuations
v
x
and
v
y
to the sites of haplotypes h
k
1
and h
k
2
.
Then, h
v
x
k
1
and h
v
y
k
2
1001, corresponds to h
v
y
k
1
and h
v
x
k
2
, with s
k
1
i
s
k
2
i
s
k
1
i
s
k
2
i
D
,
Search WWH ::
Custom Search