Biomedical Engineering Reference
At the forefront of human variation at the genetic level are single nucleotide
polymorphisms (SNPs). An SNP is a single DNA position where a mutation has
occurred and one nucleotide was substituted with a different one. Moreover, the
least frequent nucleotide must be present in a significant percentage of the
population (e.g., 1%). SNPs are the most common form of genetic variation. The
human genome has millions of SNPs [42], which are cataloged in dbSNP (footnote 3),
the public repository for DNA variations [40].
A haplotype is the sequence of SNPs on a single chromosome that are inherited
together. Humans are diploid organisms, which means that our genome is organized
in pairs of homologous chromosomes, one maternal and one paternal. Therefore,
each individual has two haplotypes for a given stretch of the genome. A genotype
is the conflated data of the two homologous haplotypes.
Technological limitations prevent geneticists from experimentally acquiring the
data for a single chromosome, that is, the haplotypes. Instead, genotypes are
obtained. This means that at each DNA position it is possible to know whether the
individual has inherited the same nucleotide from both parents (homozygous
positions) or distinct nucleotides from each parent (heterozygous positions).
Nonetheless, in the latter case, it is generally technologically infeasible to
determine which nucleotide was inherited from which parent. The problem of
obtaining the haplotypes from the genotypes is known as haplotype inference.
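The relation between haplotypes and genotypes described above can be illustrated
with a minimal Python sketch. It rests on assumptions that are not stated in the
text: every SNP is taken to be biallelic, each haplotype is written as a string
over '0' and '1', and each genotype site is written as '0' or '1' when
homozygous and '2' when heterozygous. The function names are hypothetical and
chosen only for this illustration.

```python
from itertools import product


def conflate(h1, h2):
    """Combine two haplotypes into the genotype that would be observed.

    Assumed encoding (not from the source text): haplotypes are strings over
    '0'/'1'; a genotype site is '0' or '1' if homozygous, '2' if heterozygous.
    """
    if len(h1) != len(h2):
        raise ValueError("haplotypes must cover the same SNP sites")
    return "".join(a if a == b else "2" for a, b in zip(h1, h2))


def explaining_pairs(genotype):
    """Enumerate every unordered haplotype pair that conflates to `genotype`."""
    het_sites = [i for i, site in enumerate(genotype) if site == "2"]
    pairs = set()
    # Each heterozygous site can be phased in two ways; try all combinations.
    for bits in product("01", repeat=len(het_sites)):
        h1, h2 = list(genotype), list(genotype)
        for i, bit in zip(het_sites, bits):
            h1[i] = bit
            h2[i] = "1" if bit == "0" else "0"
        # Sort within the pair so (a, b) and (b, a) count only once.
        pairs.add(tuple(sorted(("".join(h1), "".join(h2)))))
    return sorted(pairs)


if __name__ == "__main__":
    print(conflate("0110", "0101"))
    for pair in explaining_pairs("0122"):
        print(pair)
```

Under this encoding, a genotype with k heterozygous sites (k >= 1) is explained
by 2^(k-1) distinct unordered haplotype pairs, which is why the genotype alone
does not determine the haplotypes.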
Information about human haplotypes is of significant importance in clinical
medicine [8]. Haplotypes are more informative than genotypes and, in some cases,
can better predict the severity of a disease or even be responsible for producing
a specific phenotype. In some cases of medical transplants, patients who closely
match the donor haplotypes are predicted to have better transplant outcomes [35].
Moreover, medical treatments could be customized based on a patient's genetic
information, because individual responses to drugs can be attributed to a specific
haplotype [15]. Furthermore, haplotypes can help infer population histories.
Besides being an important biological problem, haplotype inference has also
turned out to be a challenging mathematical problem and has therefore received
significant attention from the mathematical and computer science communities.
The mathematical approaches to haplotype inference can be statistical [4, 41] or
combinatorial [6, 18, 19]. Among the combinatorial methods, the haplotype
inference by pure parsimony (HIPP) approach [19] is noteworthy. The pure
parsimony approach aims at finding the haplotype inference solution that uses
the smallest number of distinct haplotypes. The HIPP problem is APX-hard [28].
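To make the pure parsimony objective concrete, the following Python sketch
solves tiny HIPP instances by exhaustive search. It is not the SAT-based method
discussed in this chapter, nor any published algorithm; it is a brute-force
illustration under assumed conventions: biallelic SNPs, haplotypes as strings
over '0' and '1', and genotype sites written as '0', '1' (homozygous) or '2'
(heterozygous). All function names are hypothetical.

```python
from itertools import combinations


def compatible(h, g):
    """True if haplotype h agrees with genotype g at every homozygous site."""
    return all(s == "2" or s == a for a, s in zip(h, g))


def explains(h1, h2, g):
    """True if the pair (h1, h2) conflates exactly to genotype g."""
    return all(
        (a == b == s) if s != "2" else (a != b)
        for a, b, s in zip(h1, h2, g)
    )


def hipp_bruteforce(genotypes):
    """Return a smallest haplotype set explaining all genotypes.

    Exhaustive search: exponential in the number of sites, so usable only on
    toy inputs. Ties between equally small sets are broken by search order.
    """
    if not genotypes:
        return []
    n_sites = len(genotypes[0])
    all_haps = [format(i, f"0{n_sites}b") for i in range(2 ** n_sites)]
    # Only haplotypes compatible with some genotype can appear in a solution.
    pool = [h for h in all_haps if any(compatible(h, g) for g in genotypes)]
    for k in range(1, len(pool) + 1):
        for subset in combinations(pool, k):
            # A genotype is explained if some pair from the subset
            # (possibly the same haplotype twice) conflates to it.
            if all(
                any(explains(a, b, g) for a in subset for b in subset)
                for g in genotypes
            ):
                return list(subset)
    return []


if __name__ == "__main__":
    print(hipp_bruteforce(["22", "02", "20"]))
```

In this toy instance, genotype "02" can only be explained by the pair (00, 01)
and genotype "20" only by (00, 10), so at least three haplotypes are required;
the same three also explain "22" through the pair (01, 10). Because the search
space grows exponentially, practical HIPP solvers rely on more sophisticated
formulations.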
Boolean satisfiability (SAT) has been successfully applied in a significant number
of different fields [33]. The application of SAT-based methodologies to haplotype
inference has been shown to produce very competitive results when compared to
alternative methods [17, 31]. SAT-based models currently represent the state of
the art for HIPP and are therefore the main focus of this chapter.
Footnote 3: http://www.ncbi.nlm.nih.gov/projects/SNP