Biomedical Engineering Reference
In-Depth Information
use of a PBO solver instead of an ILP solver; second, symmetry breaking and
reduction of the size of the model; finally, the integration of lower bounds and
cardinality constraints. This section describes these RPoly features and,
moreover, extends the RPoly model to include genotypes with missing sites.
The RPoly model associates two haplotypes ($h^a_i$, $h^b_i$) with each genotype
$g_i \in G$, and conditions are defined which capture when a different haplotype
is used for explaining a given genotype.
RPoly associates variables only with heterozygous sites. Note that homozygous
sites do not require variables because the value of the haplotypes explaining
homozygous sites is known beforehand and so can be implicitly assumed. Therefore,
a Boolean variable $t_{ij}$ is associated with each heterozygous site $g_{ij}$,
such that
\[
t_{ij} =
\begin{cases}
1 & \text{if } h^a_{ij} = 1 \wedge h^b_{ij} = 0 \\
0 & \text{if } h^a_{ij} = 0 \wedge h^b_{ij} = 1.
\end{cases}
\tag{7.13}
\]
This alternative definition of the variables associated with the sites of genotypes
reduces the number of variables by a factor of 2. In addition, the model only creates
variables for heterozygous sites, and therefore the number of variables associated
with sites equals the total number of heterozygous sites. It should be mentioned
that this definition of the variables associated with sites follows the SHIPs model
[29, 30].
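The variable definition above can be illustrated with a minimal sketch. It assumes genotypes are encoded as lists over {0, 1, 2}, with 2 marking a heterozygous site and 0 or 1 marking a homozygous one; the function names `site_variables` and `expand` are hypothetical and not part of RPoly itself. The sketch lists one variable per heterozygous site and rebuilds the haplotype pair of a genotype from given values of its $t_{ij}$ variables.

```python
from typing import Dict, List, Tuple

HET = 2  # assumed encoding: 2 = heterozygous, 0 and 1 = homozygous values


def site_variables(genotypes: List[List[int]]) -> List[Tuple[int, int]]:
    """Return one (i, j) index per heterozygous site, i.e. one t_ij each."""
    return [(i, j)
            for i, g in enumerate(genotypes)
            for j, site in enumerate(g)
            if site == HET]


def expand(genotype: List[int], t: Dict[int, int]) -> Tuple[List[int], List[int]]:
    """Build (h_a, h_b) for one genotype from its t values, keyed by site j."""
    h_a, h_b = [], []
    for j, site in enumerate(genotype):
        if site == HET:
            h_a.append(t[j])      # t_ij = 1 means h_a = 1 and h_b = 0
            h_b.append(1 - t[j])  # t_ij = 0 means h_a = 0 and h_b = 1
        else:
            h_a.append(site)      # homozygous: both haplotypes copy the site
            h_b.append(site)
    return h_a, h_b


# Illustrative data only (indices are zero-based here).
genotypes = [[0, 2, 1, 2], [2, 1, 1, 0]]
variables = site_variables(genotypes)
h_a, h_b = expand(genotypes[0], {1: 1, 3: 0})
```

In this toy instance there are three heterozygous sites and therefore three site variables, while the homozygous sites are filled in without any variable.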
Hence, the existing symmetry in haplotype pairs described in the SHIPs model is
broken by considering that $t_{ij} = 0$ for each first heterozygous site $g_{ij}$
of each genotype $g_i$,
\[
\left( g_{ij} = 2 \wedge \forall_{j' < j}\; g_{ij'} \neq 2 \right) \Rightarrow \neg t_{ij}.
\tag{7.14}
\]
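As a rough sketch of condition (7.14), the following code finds the first heterozygous site of each genotype and reports the variable that the symmetry-breaking rule fixes to 0. It assumes the same {0, 1, 2} list encoding with 2 for heterozygous sites; the helper names are hypothetical, and indices are zero-based rather than one-based as in the text.

```python
from typing import List, Optional, Tuple

HET = 2  # assumed encoding: 2 = heterozygous site


def first_heterozygous(genotype: List[int]) -> Optional[int]:
    """Index of the first heterozygous site, or None if there is none."""
    for j, site in enumerate(genotype):
        if site == HET:
            return j
    return None


def symmetry_breaking_units(genotypes: List[List[int]]) -> List[Tuple[int, int]]:
    """Return the (i, j) pairs whose variable t_ij is fixed to 0 by (7.14)."""
    units = []
    for i, g in enumerate(genotypes):
        j = first_heterozygous(g)
        if j is not None:  # fully homozygous genotypes have no t variables
            units.append((i, j))
    return units


# Illustrative data only; the third genotype has no heterozygous site.
genotypes = [[0, 2, 1, 2], [2, 1, 1, 0], [1, 1, 0, 0]]
fixed = symmetry_breaking_units(genotypes)
```

Each fixed variable removes the mirrored solution in which the two haplotypes of a genotype are swapped, so at most one unit constraint per genotype is produced.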
Candidate haplotypes for each genotype are related to candidate haplotypes for
other genotypes only if the two genotypes are compatible. Clearly, incompatible
genotypes cannot be explained by common haplotypes. In practice, for candidate
haplotypes $h^p_i$ and $h^q_k$ ($p, q \in \{a, b\}$ and $1 \le k < i \le n$), a
Boolean variable $x^{pq}_{ik}$ is defined, such that $x^{pq}_{ik}$ is 1 if
haplotype $h^p_i$ of genotype $g_i$ and haplotype $h^q_k$ of genotype $g_k$ are
different. Two incompatible genotypes are guaranteed not to be explained by the
same haplotype, and therefore, for the four possible combinations of $p$ and $q$,
$x^{pq}_{ik} = 1$. Moreover, two genotypes $g_i$ and $g_k$ are related only with
respect to sites for which either $g_i$ or $g_k$ is heterozygous at that site.
Therefore, the conditions on the $x^{pq}_{ik}$ variables are all of the following
form, for all $1 \le j \le m$,
\[
\neg (R \Leftrightarrow S) \Rightarrow x^{pq}_{ik},
\tag{7.15}
\]
where the propositions $R$ and $S$ depend on the values of the sites $g_{ij}$ and
$g_{kj}$, and also on the haplotype being considered, i.e., either $h^a$ or $h^b$.
Observe that $1 \le k < i \le n$, $1 \le j \le m$, and $p, q \in \{a, b\}$.
Accordingly, the $R$ and $S$ propositions are defined as follows:
If $g_{ij} \neq 2 \wedge g_{kj} = 2$, then
$R = (g_{ij} \Leftrightarrow (q \Leftrightarrow a))$ and $S = t_{kj}$.
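The case above, in which $g_{ij}$ is homozygous and $g_{kj}$ is heterozygous, can be checked by brute force. The sketch below assumes homozygous sites take the values 0 or 1 and reads $(q \Leftrightarrow a)$ as "q is the haplotype labelled a"; the function names are hypothetical. It compares the antecedent of (7.15) for this case with a direct test of whether the two haplotypes differ at site $j$, using definition (7.13) for the haplotypes of $g_k$.

```python
from itertools import product


def iff(x: bool, y: bool) -> bool:
    """Logical equivalence."""
    return x == y


def case_forces_x(g_ij: int, q: str, t_kj: int) -> bool:
    """True when not (R <=> S) holds, so (7.15) forces x to 1."""
    r = iff(bool(g_ij), q == "a")  # R = (g_ij <=> (q <=> a))
    s = bool(t_kj)                 # S = t_kj
    return not iff(r, s)


def haplotypes_differ(g_ij: int, q: str, t_kj: int) -> bool:
    """Direct check that h^p_i and h^q_k differ at site j."""
    # By (7.13), at a heterozygous site h^a_kj = t_kj and h^b_kj = 1 - t_kj.
    h_q_kj = t_kj if q == "a" else 1 - t_kj
    # g_ij is homozygous, so both haplotypes of g_i carry the value g_ij.
    return g_ij != h_q_kj


# Enumerate every combination of g_ij in {0, 1}, q in {a, b}, t_kj in {0, 1}.
agree = all(case_forces_x(g, q, t) == haplotypes_differ(g, q, t)
            for g, q, t in product((0, 1), ("a", "b"), (0, 1)))
```

Under these assumptions the two tests coincide on all eight combinations, which is the behaviour expected of a condition that sets $x^{pq}_{ik}$ to 1 whenever the haplotypes disagree at some site.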