from present data under the (explicit or implicit) assumption of
a model of sequence evolution.
In practice, most MSA methods blur the distinction among homology-, structure-, and function-motivated alignment by employing strategies that mix these objectives. Indeed, almost all well-established aligners assume and exploit evolutionary relationships among the sequences (e.g., by constructing the alignment using an explicitly phylogenetic guide tree and alignment scores derived from models of sequence evolution). Yet many simultaneously incorporate structural criteria into their parameters or heuristics, for example by training their parameters on structure-derived reference alignments [10, 11]. The intricacies of the strategies that different aligners employ can, however, be separated from the measurement of their success, and we make no assumption that an aligner employing one strategy necessarily performs better when assessed according to criteria consistent with its internal methods. In the present context of alignment benchmarking, we therefore treat aligners as "black boxes" and refer the reader interested in the specifics of alignment methods to later chapters.
1.2 Aims and Desirable Properties of Alignment Benchmarks

As mentioned in the introduction, benchmarks provide ways of evaluating the performance of different MSA packages on standardized input. The output produced by the different programs is compared to the "correct" solution, the so-called gold standard, which is defined by the benchmark. The extent of similarity between the two then defines the quality of the aligner's performance. Proper benchmarking is advantageous to both the user and the developer community: the former obtains standardized measures of performance that can be consulted in order to pick the most appropriate MSA tools for a particular alignment problem, and the latter gains important insight into aspects of the software that need improvement, or new features to be implemented, thus promoting advancement of the field [2].
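To make the comparison step concrete, the sketch below computes one widely used agreement measure, the sum-of-pairs (SP) score: the fraction of residue pairs aligned together in the reference alignment that are also aligned together in the test alignment. It is a minimal illustration only, assuming alignments are given as equal-length gapped strings; the function names and toy sequences are ours and are not drawn from any particular benchmark suite.

```python
# Minimal sketch of how a benchmark might score a test alignment against
# a gold-standard reference. The scoring scheme shown (sum-of-pairs over
# matched residue pairs) is one common choice; real suites differ in detail.

def residue_pairs(alignment):
    """Return the set of aligned residue pairs ((seq_i, pos_i), (seq_j, pos_j))
    implied by an alignment given as a list of equal-length gapped strings."""
    counters = [0] * len(alignment)  # ungapped residue index per sequence
    pairs = set()
    for col in range(len(alignment[0])):
        residues = []
        for s, seq in enumerate(alignment):
            if seq[col] != "-":
                residues.append((s, counters[s]))
                counters[s] += 1
        # Every two residues sharing a column count as an aligned pair.
        for a in range(len(residues)):
            for b in range(a + 1, len(residues)):
                pairs.add((residues[a], residues[b]))
    return pairs

def sum_of_pairs_score(test, reference):
    """Fraction of reference residue pairs recovered by the test alignment
    (assumes the reference implies at least one aligned pair)."""
    ref_pairs = residue_pairs(reference)
    return len(ref_pairs & residue_pairs(test)) / len(ref_pairs)

# Toy example: the test alignment recovers some, but not all,
# of the residue pairings defined by the reference.
reference = ["GARFIELD-", "GARF-IELD"]
test      = ["GARFIELD-", "GAR-FIELD"]
print(f"SP score: {sum_of_pairs_score(test, reference):.2f}")  # 0.86
```

A complementary measure, the total-column score, credits only columns reproduced in their entirety and is accordingly stricter, especially on larger sequence sets.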
Which characteristics do benchmarks and the gold-standard reference dataset need to satisfy in order to be useful to the user and developer community? Benchmarks can be critically examined by looking at their ability to yield performance measures that reflect the actual biological accuracy of the MSA method, whether defined in terms of shared evolutionary history or of structural or functional similarity of the aligned sequence data. This can most easily be done by defining a set of predetermined criteria for good benchmarking practice. We follow Aniba et al. [2] in their list of desirable properties, according to which a benchmark should be:
• Relevant, in that a benchmark should reflect actual MSA applications, i.e., tasks carried out by MSA in practice rather than in an artificial or hypothetical setting.