Biomedical Engineering Reference
In-Depth Information
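Problem 8.1. The phylogenetic estimation problem (PEP)
\[
\begin{aligned}
\text{optimize} \quad & f(T) \\
\text{s.t.} \quad & g(\Gamma, T) = 0 \\
& T \in \mathcal{T},
\end{aligned}
\]
where $\Gamma$ is the set of molecular sequences from $n$ taxa, $T$ a phylogeny of $\Gamma$, $\mathcal{T}$ the set of $(2n-5)!! = 1 \cdot 3 \cdot 5 \cdot 7 \cdots (2n-5)$ phylogenies of $\Gamma$, $f : \mathcal{T} \to \mathbb{R}$ a function modeling the selected criterion of phylogenetic estimation, and $g : \Gamma \times \mathcal{T} \to \mathbb{R}$ a function correlating the set $\Gamma$ to a phylogeny $T$.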
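To give a feel for how quickly the search space in Problem 8.1 grows, the count $(2n-5)!!$ can be evaluated directly. The following is a minimal sketch; the function name `num_phylogenies` is hypothetical, and the restriction to $n \ge 3$ is an assumption made here so that the product is well defined.

```python
def num_phylogenies(n: int) -> int:
    """Return (2n-5)!! = 1 * 3 * 5 * ... * (2n-5) for n >= 3 taxa.

    The lower bound n >= 3 is an assumption of this sketch, chosen so
    that the largest factor 2n-5 is at least 1.
    """
    if n < 3:
        raise ValueError("this sketch assumes n >= 3 taxa")
    count = 1
    # Multiply the odd factors 3, 5, ..., 2n-5 (the factor 1 is implicit).
    for odd in range(3, 2 * n - 4, 2):
        count *= odd
    return count


if __name__ == "__main__":
    for n in (4, 5, 10, 20):
        print(n, num_phylogenies(n))
```

Even for a few dozen taxa the count is far too large to enumerate, which is why the optimization aspects of each paradigm matter.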
A specific optimization problem, or phylogenetic estimation paradigm, is completely characterized by defining the functions $f$ and $g$. The phylogeny $T^*$ that optimizes $f$ and satisfies $g$ is referred to as optimal, and if $T^*$ approaches the true phylogeny as the amount of molecular data from taxa increases, the corresponding criterion is said to be statistically consistent [32]. Statistical consistency is a desirable property in molecular phylogenetics because it measures the ability of a criterion to recover the true (and hopefully the real) phylogeny of the given molecular data. Later in this chapter, we will show that the consistency property changes from criterion to criterion and in some cases may even be absent.
Here, we provide a review of the main estimation criteria that occur in the literature on molecular phylogenetics. Particular emphasis is given to the comparative description of the hypotheses at the core of each criterion and to the optimization aspects related to the phylogenetic estimation paradigms. In Sect. 8.2, we discuss the problem of measuring the similarity among molecular sequences. In Sect. 8.3, we discuss the fundamental least-squares paradigm and formalize the concept of phylogeny. In Sect. 8.4, we present the minimum evolution paradigm, highlighting recent perspectives and computational advances. Finally, in Sect. 8.5 we present the likelihood and the Bayesian paradigms, briefly outlining their benefits and drawbacks.
8.2 Measuring Molecular Similarity
The degree of similarity between a pair of molecular sequences reflects the number of mutation events that have occurred since they split from their common ancestor. Quantifying such similarity constitutes the first step in the phylogenetic estimation process [11]. The task involves investigating and modeling the mutation process over time, i.e., the process by which errors occur in molecular data and are inherited between generations.
Different types of mutation may occur in the genome structure, most of which are point mutations, i.e., changes that involve the replacement, or substitution, of one nucleotide for another in the DNA sequence. Point mutations can be classified into two categories: transitions and transversions. The transitions occur
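As a rough illustration of what quantifying similarity can mean in practice, the sketch below computes the proportion of differing sites between two aligned sequences of equal length. This simple measure (often called the p-distance) is an assumption introduced here for illustration, not the measure developed in this section; it ignores multiple substitutions at the same site, and the function name `p_distance` is hypothetical.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of sites at which two aligned sequences differ.

    Assumes both sequences are already aligned and have the same,
    non-zero length; gaps and ambiguity codes are not treated specially.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    # Count positions where the two sequences carry different characters.
    mismatches = sum(a != b for a, b in zip(seq_a, seq_b))
    return mismatches / len(seq_a)


if __name__ == "__main__":
    # Two of the eight aligned sites differ (positions 5 and 8).
    print(p_distance("ACGTACGT", "ACGTTCGA"))
```

A raw mismatch proportion tends to underestimate the number of mutation events for distantly related sequences, which is one reason the mutation process is modeled explicitly over time.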