Biomedical Engineering Reference
In-Depth Information
8.2.2 Estimating Evolutionary Distances from Molecular Data
Two molecular sequences $S_1$ and $S_2$ that began evolving at time $t_0$ from a common
ancestor may have accumulated, by time $t$, different numbers of substitution events,
some of which are not directly observable. Hence, if we sampled the sequences at time
$t$ and measured their similarity, or evolutionary distance, as the number of observed
differences, we could underestimate the total number of substitution events that have
occurred since $S_1$ and $S_2$ split from their common ancestor. A number of authors
suggested that time-homogeneous Markov (THM) models could overcome the
underestimation problem whenever the hypotheses at the core of the model properly
describe the real evolutionary process of the analyzed sequences [29]. Moreover, in
order to compare the evolutionary distances of different pairs of molecular sequences,
the authors also proposed to express evolutionary distances in terms of the expected
number of substitution events per site rather than the time necessary to transform one
sequence into another [29]. In this section, we present the most general formula
currently known in the literature to compute the evolutionary distance between a pair
of molecular sequences. To this end, we now investigate the dynamics of the THM model.
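The underestimation problem can be made concrete with the simplest THM model, the Jukes-Cantor model, in which all substitutions occur at the same rate. This model is used here only as an illustration; it is not the general formula discussed in this section, and the function names below are hypothetical. Under Jukes-Cantor, a distance of d expected substitutions per site yields an expected fraction p = 3/4 (1 - e^{-4d/3}) of observably different sites, so p is always smaller than d.

```python
import math


def expected_observed_difference(d):
    """Expected fraction of differing sites under Jukes-Cantor,
    given d expected substitutions per site."""
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))


def jc_distance(p):
    """Invert the relation above: recover substitutions per site
    from the observed fraction p of differing sites (p < 0.75)."""
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# The observed fraction falls further behind the true distance as d grows.
for d in (0.1, 0.5, 1.0, 2.0):
    p = expected_observed_difference(d)
    print(f"d = {d:.1f}  observed p = {p:.3f}  recovered d = {jc_distance(p):.3f}")
```

The observed fraction saturates at 3/4 no matter how many substitutions occur, which is why raw difference counts cannot serve as a distance for strongly diverged sequences.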
As shown in Zadeh and Desoer [84], (8.4) can also be expressed in closed form as:

$$P(t) = e^{Rt} = \Omega \, e^{\Lambda t} \, \Omega^{-1}, \tag{8.5}$$

where $\Omega$ is the eigenvector matrix of $R$, and $\Lambda$ is the diagonal matrix of the
eigenvalues of $R$. This fact suggests that the spectrum of $P(t)$ is the exponential
spectrum of $R$, i.e., the dynamics of $P(t)$ is uniquely determined by the spectrum
of $R$ [84].
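As a minimal sketch of the closed form (8.5), the following Python snippet builds P(t) from the eigendecomposition of a rate matrix. The matrix R, its numeric rates, and the helper name transition_matrix are hypothetical and chosen only for illustration; the sketch also assumes that R is diagonalizable.

```python
import numpy as np

# Hypothetical 4x4 rate matrix over the states (A, C, G, T).
# Off-diagonal entries are made-up substitution rates; each diagonal
# entry is then set so that its row sums to zero.
R = np.array([
    [0.0, 0.2, 0.5, 0.1],
    [0.3, 0.0, 0.1, 0.4],
    [0.6, 0.1, 0.0, 0.2],
    [0.1, 0.5, 0.3, 0.0],
])
np.fill_diagonal(R, -R.sum(axis=1))


def transition_matrix(R, t):
    """Return P(t) = Omega exp(Lambda t) Omega^{-1} (assumes R is diagonalizable)."""
    eigvals, omega = np.linalg.eig(R)
    P = omega @ np.diag(np.exp(eigvals * t)) @ np.linalg.inv(omega)
    # For a real R the product is real; any imaginary part is rounding noise.
    return P.real


P1 = transition_matrix(R, 1.0)
P2 = transition_matrix(R, 2.0)

print(P1.sum(axis=1))           # each row of P(t) sums to 1
print(np.allclose(P1 @ P1, P2))  # P(1) P(1) equals P(2)
```

Because P(t) depends on t only through the factor exp(Lambda t), the eigendecomposition can be computed once and reused for every value of t.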
It is worth noting that the Markov conservative hypothesis implies that the determinant
of matrix $R$ is equal to zero, i.e., at least one of its eigenvalues is identically
zero. Moreover, since any $k$-leading principal sub-matrix of $R$, $k < 4$, has negative
determinant, by one of the Sylvester corollaries (see [6, p. 409]) all the remaining
eigenvalues are negative. Thus, as the spectrum of $P(t)$ is the exponential spectrum
of $R$, matrix $P(t)$ has at least one eigenvalue equal to 1, called the maximal Lyapunov
exponent, and three eigenvalues lying in the interval $[0, 1]$. The maximal Lyapunov
exponent prevents the presence of chaotic attractors and guarantees that, as $t$ goes
to infinity, the generic entry $p_{ij}(t)$ is non-zero and independent of the starting
state $i \in \{A, C, G, T\}$. In other words, the maximal Lyapunov exponent guarantees the
existence of four positive values $\pi_A$, $\pi_C$, $\pi_G$, and $\pi_T$, called equilibrium
frequencies, such that

$$\lim_{t \to \infty} p_{ij}(t) = \pi_j \qquad \forall \, i, j \in \{A, C, G, T\}.$$
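The spectral argument above can be checked numerically. The sketch below is an illustration under stated assumptions: the rate matrix is built from made-up equilibrium frequencies (target_pi) and made-up symmetric exchangeabilities (S), a construction chosen only because it guarantees real eigenvalues and a stationary distribution known in advance. None of these names or values come from the text.

```python
import numpy as np

# Hypothetical equilibrium frequencies for (A, C, G, T) and hypothetical
# symmetric exchangeabilities, used only to build a test rate matrix.
target_pi = np.array([0.1, 0.2, 0.3, 0.4])
S = np.array([
    [0.0, 1.0, 2.0, 0.5],
    [1.0, 0.0, 0.8, 1.5],
    [2.0, 0.8, 0.0, 1.2],
    [0.5, 1.5, 1.2, 0.0],
])
R = S * target_pi                     # r_ij = s_ij * pi_j for i != j
np.fill_diagonal(R, -R.sum(axis=1))   # rows sum to zero

# Spectrum of R: one zero eigenvalue, the remaining three negative.
eigvals, omega = np.linalg.eig(R)
lam = np.sort(eigvals.real)[::-1]     # descending order, zero first

# Spectrum of P(t) is exp(lam * t): one eigenvalue equal to 1, the rest below 1.
p_spectrum = np.exp(lam * 1.0)

# For large t every row of P(t) approaches the same vector of frequencies.
P_large = (omega @ np.diag(np.exp(eigvals * 100.0)) @ np.linalg.inv(omega)).real
pi = P_large[0]

print(lam)
print(p_spectrum)
print(pi)
```

Each row of P_large agrees with target_pi to numerical precision, so the limit no longer depends on the starting state.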
The values $\pi_j$ constitute a stationary distribution and turn out to be useful to
measure the evolutionary distance between $S_1$ and $S_2$. In fact, denote $O(t)$ as a
matrix whose generic entry $o_{ij}(t)$, $i, j \in \{A, C, G, T\}$, represents the
probability that at a