Biomedical Engineering Reference
In-Depth Information
8.3.2
Drawbacks of the Least-Squares Paradigms
of Phylogenetic Estimation
Although the least-squares paradigm is a milestone in molecular phylogenetics,
it is characterized by a number of drawbacks. For example, Cavalli-Sforza and
Edwards' paradigm returns a tree metric , i.e., a phylogeny whose edge weights
are non-negative [ 73 , 80 ], whenever the distance matrix D satisfies the ultrametric
property
d rs max fd rq ;d qs g
r; s; q
2
W r
¤ s ¤ q
or the additive property
d rs C d hk
max fd rh C d sk ;d rk C d sh g
r; s; h; k
2
W r
¤ s ¤ h ¤ k:
Specifically, when D is ultrametric or additive, the solution of Problem 8.2 is unique
and obtainable in polynomial time through the UPGMA greedy algorithm [ 74 ]or
the sequential algorithm [ 80 ], respectively.
Unfortunately, when D is generic (e.g., when it is obtained by means of the THM
model, see Sect. 8.2 ), the least-squares paradigm may lead to the occurrence of neg-
ative entries in the vector w , i.e., to a phylogeny that is not a tree metric [ 32 , 47 ].
Negative edge weights are infeasible both from a conceptual point of view (a dis-
tance, being an expected number of mutation events over time, cannot be negative
[ 45 ]) and from a biological point of view (evolution cannot proceed backwards
[ 57 , 77 ]). For the latter reason at least, non-tree metric phylogenies are generally
not accepted in molecular phylogenetics [ 35 ].
In response, some authors investigated the consequences of adding or guarantee-
ing the positivity constraint of edge weights in the least-squares paradigm.
Gascuel and Levy [ 33 ] observed that the presence of the positivity constraint
transforms any least-square model into a non-negative linear regression problem
which involves projecting the distance matrix D onto the positive cone defined by
the set of tree metrics (see also [ 5 , p. 187]). Thus, the authors designed an iterative
polynomial time algorithm able to generate a sequence of least-squares projec-
tions of D onto such a set until an additive distance matrix (and the corresponding
phylogeny) is obtained.
Farach et al. [ 26 ] proposed an alternative approach to impose the positivity con-
straint. Specifically, the authors proposed to find the minimal perturbation of the
distance matrix D that guarantees the satisfaction of the additive or the ultrametric
property. Farach et al. [ 26 ] proposed the
L 1 -norm to constraint the
entries of D to satisfy the additive (ultrametric) property, and proved that such a
problem can be solved in polynomial time when D is required to be ultrametric
under the
L 1 -norm and
L 1 -norm. By contrast, the authors proved that their approaches be-
come hard when an ultrametric or an additive distance matrix is required under
the
L 1 -norm.
Search WWH ::




Custom Search