Biomedical Engineering Reference
In-Depth Information
8.3.2
Drawbacks of the Least-Squares Paradigms
of Phylogenetic Estimation
Although the least-squares paradigm is a milestone in molecular phylogenetics,
it is characterized by a number of drawbacks. For example, Cavalli-Sforza and
Edwards' paradigm returns a
tree metric
, i.e., a phylogeny whose edge weights
are non-negative [
73
,
80
], whenever the distance matrix
D
satisfies the
ultrametric
property
d
rs
max
fd
rq
;d
qs
g
r; s; q
2
W r
¤ s ¤ q
or the
additive property
d
rs
C d
hk
max
fd
rh
C d
sk
;d
rk
C d
sh
g
r; s; h; k
2
W r
¤ s ¤ h ¤ k:
Specifically, when
D
is ultrametric or additive, the solution of Problem
8.2
is unique
and obtainable in polynomial time through the UPGMA greedy algorithm [
74
]or
the sequential algorithm [
80
], respectively.
Unfortunately, when
D
is generic (e.g., when it is obtained by means of the THM
model, see Sect.
8.2
), the least-squares paradigm may lead to the occurrence of neg-
ative entries in the vector
w
, i.e., to a phylogeny that is not a tree metric [
32
,
47
].
Negative edge weights are infeasible both from a conceptual point of view (a dis-
tance, being an expected number of mutation events over time, cannot be negative
[
45
]) and from a biological point of view (evolution cannot proceed backwards
[
57
,
77
]). For the latter reason at least, non-tree metric phylogenies are generally
not accepted in molecular phylogenetics [
35
].
In response, some authors investigated the consequences of adding or guarantee-
ing the positivity constraint of edge weights in the least-squares paradigm.
Gascuel and Levy [
33
] observed that the presence of the positivity constraint
transforms any least-square model into a non-negative linear regression problem
which involves projecting the distance matrix
D
onto the positive cone defined by
the set of tree metrics (see also [
5
, p. 187]). Thus, the authors designed an iterative
polynomial time algorithm able to generate a sequence of least-squares projec-
tions of
D
onto such a set until an additive distance matrix (and the corresponding
phylogeny) is obtained.
Farach et al. [
26
] proposed an alternative approach to impose the positivity con-
straint. Specifically, the authors proposed to find the minimal perturbation of the
distance matrix
D
that guarantees the satisfaction of the additive or the ultrametric
property. Farach et al. [
26
] proposed the
L
1
-norm to constraint the
entries of
D
to satisfy the additive (ultrametric) property, and proved that such a
problem can be solved in polynomial time when
D
is required to be ultrametric
under the
L
1
-norm and
L
1
-norm. By contrast, the authors proved that their approaches be-
come hard when an ultrametric or an additive distance matrix is required under
the
L
1
-norm.
Search WWH ::
Custom Search