Chapter 4
Who Watches the Watchmen? An Appraisal
of Benchmarks for Multiple Sequence Alignment
Stefano Iantorno, Kevin Gori, Nick Goldman, Manuel Gil,
and Christophe Dessimoz
Abstract
Multiple sequence alignment (MSA) is a fundamental and ubiquitous technique in bioinformatics used to
infer related residues among biological sequences. Thus alignment accuracy is crucial to a vast range of
analyses, often in ways difficult to assess in those analyses. To compare the performance of different aligners
and help detect systematic errors in alignments, a number of benchmarking strategies have been pursued.
Here we present an overview of the main strategies—based on simulation, consistency, protein structure,
and phylogeny—and discuss their different advantages and associated risks. We outline a set of desirable
characteristics for effective benchmarking, and evaluate each strategy in light of them. We conclude that
there is currently no universally applicable means of benchmarking MSA, and that developers and users
of alignment tools should choose a benchmark according to the context of application, with
a keen awareness of the assumptions underlying each benchmarking strategy.
Key words: Multiple sequence alignment, Benchmarking, Phylogenetic, Protein structure, Sequence
evolution, Consistency, Homology
1 Introduction
Multiple sequence alignment (MSA) has become a common first
step in the analysis of sequence data for downstream applications
such as comparative genomics, functional analysis and phylogenetic
reconstruction. Given their importance, MSA methods need to be
objectively validated in order to ensure their output is both accurate
and reproducible. Benchmarking is a crucial tool in the assessment
of sequence alignment programs, as it allows their developers and
users to compare the performance of different aligners objectively,
identify strengths and weaknesses and help detect systematic
errors in alignments. In recent years, there has been a growing
Stefano Iantorno and Kevin Gori contributed equally to this work.
 