Table 1
The advantages and risks of the four approaches to MSA benchmarking. Examples are given of relevant software packages, benchmark databases and tests.

Approach: Simulation-based
  Advantages:
    Solvability: "true" homology is known
    Evolving: different scenarios can be modelled
    Scalability: new data can be generated ad libitum
  Risks:
    Relevance: simulated data might strongly differ from real biological data
    Independence: MSA parameters might resemble those used in simulation
  Examples (References):
    Rose [12]
    DAWG [13]
    EvolveAGene3 [14]
    iSGv2.0 [48]
    INDELible [15]
    PhyloSim [16]
    ALF [18]

Approach: Consistency-based
  Advantages:
    Scalability: not constrained to a particular reference set
    Accessibility: tests are easy and quick
  Risks:
    Relevance: consistent MSA methods may be collectively biased
    Independence: similar scores might be used in MSA inference
  Examples (References):
    MUMSA [26, 49]
    HoT [27]

Approach: Structure-based
  Advantages:
    Relevance: closely matches a major biological objective of MSA
    Independence: empirical data is used as input
  Risks:
    Relevance: limited to structurally conserved regions; biological objective of MSA may vary
    Scalability: only applicable to small subset of protein sequences
  Examples (References):
    HOMSTRAD [10, 30]
    OXBench [40]
    PREFAB [33]
    SABMARK [32]
    BAliBASE 3.0 [11, 31]
    STRIKE [50]

Approach: Phylogeny-based
  Advantages:
    Relevance: closely matches a major biological objective of MSA
    Independence: empirical data is used as input
    Scalability: broad array of sequence data can be used as input
  Risks:
    Relevance: biological objective of MSA may vary from phylogenetic reconstruction
  Examples (References):
    Species-tree discordance test [44]
    Minimum duplication test [44]

are not proper metrics (they do not satisfy the conditions of symmetry or triangle inequality), which has motivated the recent development of better-founded alternatives [20].
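
The beginning of the sentence above is cut off in this excerpt, so the scores it refers to are not named here. As an illustration only, the sketch below assumes a sum-of-pairs style score: the fraction of residue pairs aligned in a reference alignment that are also aligned in a test alignment. The function names (aligned_pairs, sp_score) and the two toy alignments are hypothetical and are not taken from the text or from any cited package. The example shows how such a score can fail the symmetry condition, because the denominator depends on which alignment is treated as the reference.

```python
from itertools import combinations


def aligned_pairs(alignment):
    """Return the set of residue pairs that share a column.

    `alignment` is a list of equal-length gapped strings ('-' marks a gap).
    A residue is identified by (sequence index, position in the ungapped
    sequence), so the same residue is recognised across alignments.
    """
    pairs = set()
    next_pos = [0] * len(alignment)  # next ungapped position per sequence
    for column in zip(*alignment):
        residues = []
        for seq_idx, char in enumerate(column):
            if char != "-":
                residues.append((seq_idx, next_pos[seq_idx]))
                next_pos[seq_idx] += 1
        # every pair of residues in this column counts as aligned
        pairs.update(combinations(residues, 2))
    return pairs


def sp_score(test, reference):
    """Fraction of the reference's aligned pairs recovered by `test`."""
    ref_pairs = aligned_pairs(reference)
    if not ref_pairs:
        raise ValueError("reference alignment has no aligned residue pairs")
    return len(aligned_pairs(test) & ref_pairs) / len(ref_pairs)


# Two toy alignments of the same two sequences ("AC" and "AC").
alignment_a = ["AC-", "A-C"]  # only the A residues share a column
alignment_b = ["AC", "AC"]    # both residue pairs share a column

print(sp_score(alignment_a, alignment_b))  # a scored against reference b
print(sp_score(alignment_b, alignment_a))  # b scored against reference a
```

The two calls return different values (0.5 and 1.0) for the same pair of alignments, so the score depends on the argument order and cannot serve as a distance in the metric sense.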
Besides the advantage of knowing the true alignment, the fact that the parameters for simulated sequence evolution are user-defined directly translates into great flexibility to address specific questions or to investigate the effect of individual factors in