Biology Reference
In-Depth Information
Chapter 3
Objective Functions
Haluk Do ˘ an and Hasan H. Otu
Abstract
Multiple sequence alignment involves alignment of more than two sequences and is an NP-complete
problem. Therefore, heuristic algorithms that use different criteria to find an approximation to the optimal
solution are employed. At the heart of these approaches lie the scoring and objective functions that a given
algorithm uses to compare competing solutions in constructing a multiple sequence alignment. These
objective functions are often motivated by the biological paradigms that govern functional similarities and
evolutionary relations. Most existing approaches utilize a progressive process where the final alignment is
constructed sequentially by adding new sequences into an existing multiple sequence alignment matrix,
which is dynamically updated. In doing this, the core scoring function to assess accuracies of pairwise
alignments generally remains the same, while the objective functions used in intermediary steps differ.
Nevertheless, the overall assessment of the final multiple sequence alignment is generally calculated by an
extension of pairwise scorings. In this chapter, we explore different scoring and objective functions used in
calculating the accuracy and optimization of a multiple sequence alignment and provide utilization of these
criteria in popularly used multiple sequence alignment algorithms.
Key word Scoring-functions, Scoring-matrices, Sum-of-pairs, Progressive-alignment, Consistency-
based-scoring
1
Introduction
Multiple sequence alignment plays a major role in various areas of
bioinformatics such as function and structure prediction, finding
highly conserved regions, and resolving evolutionary relationship
between a set of taxa. Alignment of biological sequences (DNA,
RNA, or protein) reveals significant similarities, characterizes con-
served regions, construes genetic diseases, and discovers functional
motifs. Multiple sequence alignment, abbreviated from now on as
MSA, is an extension of pairwise alignment where at least three
sequences are aligned and the problem can be defined as follows:
Definition : Let S be a set of sequences s 1
;
s 2
; ...;
s n of
lengths
l 1
;
l 2
; ...;
l n defined in some alphabet
Σ
(e.g.
Σ DNA ¼
{ A , C , G , T }).
Search WWH ::




Custom Search