Biology Reference
In-Depth Information
[Figure 1, panel (A): pairwise alignments]

(a, b)        (a, c)          (b, c)
a T-AGTG      a TA---GTG      b T--GAGTT
b TGAGTT      c TACGCG-G      c TACGCG-G

Fig. 1 Maximal consistency and maximal weight traces. (A) Pairwise alignments between three sequences
a = TAGTG, b = TGAGTT, and c = TACGCGG. (B) Traces. (C) Bipartite graphs. (D) Maximal weight traces.
(E) Consistently aligned regions (shadowed area). The consistent traces and edges are indicated by bold lines
in (D) and (E), respectively. The broken line in (D) indicates a trace that is omitted in the final MSA
definition is slightly more general than the trace-based consistency.
However, the latter is often used in consistency transformation, as
described below.
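As a rough illustration of trace-based consistency, the sketch below converts the three pairwise alignments of Fig. 1(A) into traces (sets of aligned residue index pairs) and checks which pairs a_i ~ b_j are also supported indirectly through a residue c_k. The function names (trace, indirect_pairs, extended_weights) and the unit bonus are assumptions made for this example only; they are not the weighting scheme of any particular program.

```python
def trace(row_x, row_y):
    """Return the set of 0-based residue pairs (i, j) aligned in a gapped
    pairwise alignment given as two equal-length rows ('-' marks a gap)."""
    pairs, i, j = set(), 0, 0
    for cx, cy in zip(row_x, row_y):
        if cx != "-" and cy != "-":
            pairs.add((i, j))
        if cx != "-":
            i += 1
        if cy != "-":
            j += 1
    return pairs


def indirect_pairs(xz, yz):
    """Pairs (i, j) such that x_i ~ z_k and y_j ~ z_k for some residue z_k."""
    y_by_k = {}
    for j, k in yz:
        y_by_k.setdefault(k, []).append(j)
    return {(i, j) for i, k in xz for j in y_by_k.get(k, [])}


def extended_weights(direct, indirect_sets, bonus=1.0):
    """Hypothetical scoring: weight 1 for each directly aligned pair, plus a
    fixed bonus for every intermediate sequence that supports the pair."""
    weights = {pair: 1.0 for pair in direct}
    for supported in indirect_sets:
        for pair in supported:
            weights[pair] = weights.get(pair, 0.0) + bonus
    return weights


# Pairwise alignments from Fig. 1(A): a = TAGTG, b = TGAGTT, c = TACGCGG
ab = trace("T-AGTG", "TGAGTT")
ac = trace("TA---GTG", "TACGCG-G")
bc = trace("T--GAGTT", "TACGCG-G")

via_c = indirect_pairs(ac, bc)      # a_i ~ b_j implied through c_k
consistent = ab & via_c             # pairs present directly and indirectly
weights = extended_weights(ab, [via_c])

print(sorted(consistent))           # [(0, 0), (2, 3), (4, 5)]
```

With more than three sequences, the same call would take one indirect set per intermediate sequence, so a pair supported by many intermediates accumulates a larger weight.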
The idea of consistency transformation was first proposed by
Notredame et al. [23] to score an aligned residue pair. For the
above example, a bonus (weight) is added to S(a_i, b_j) if a_i and b_j are
indirectly paired through c_k, i.e., if a_i ~ c_k and b_j ~ c_k are present in
the pairwise alignments A(a, c) and A(b, c), respectively, irrespective
of the presence of a_i ~ b_j in A(a, b). When we are aligning N
sequences, the number of “intermediate” sequences like c
amounts to N − 2, and the indirect pairing is examined with all of
them. While this type of consistency-enhanced scoring system was
first used as the objective function to be optimized by a stochastic
algorithm [23], the same group soon came up with a more efficient
MSA program named T-Coffee that adopts a progressive method
(see next subsection) [24]. Though named “consistency objective
function,” the scoring system of T-Coffee is more closely related to
the maximum weight trace problem studied by Kececioglu [25]
than to the concept of consistency discussed in the previous
paragraph. T-Coffee assigns a reasonable but somewhat ad hoc weight
to an indirectly aligned pair. Do et al. [19] developed a theoretically
sounder approach in their MSA program ProbCons, in which
the posterior probability p_{i,j} defined by Eq. 3 is used as the weight.
Hence, their method is called probabilistic consistency
transformation. Pairwise alignment between the N input sequences and