PicXAA: A Probabilistic Scheme for Finding the Maximum Expected Accuracy Alignment of Multiple Biological Sequences - Multiple Sequence Alignment Methods

Chapter 13

PicXAA: A Probabilistic Scheme for Finding the Maximum Expected Accuracy Alignment of Multiple Biological Sequences

Sayed Mohammad Ebrahim Sahraeian and Byung-Jun Yoon

Abstract

PicXAA is a probabilistic nonprogressive alignment algorithm that finds protein (or DNA) multiple sequence alignments with maximum expected accuracy. PicXAA greedily builds up the alignment from sequence regions with high local similarity, thereby yielding an accurate global alignment that effectively captures the local similarities across sequences. PicXAA consistently yields accurate alignment results on a wide range of reference sets that have different characteristics, with especially remarkable improvements over other leading algorithms on sequence sets with high local similarities. In this chapter, we describe the overall alignment strategy used in PicXAA and discuss several important considerations for effective deployment of the algorithm.

Key words: Multiple sequence alignment, Nonprogressive alignment, Maximum expected accuracy (MEA), Probabilistic consistency transformation, PicXAA

1 Introduction

Multiple sequence alignment (MSA) is an indispensable tool in comparative studies of biological sequences, and it plays a prominent role in many applications such as phylogenetic analysis, structure prediction, function prediction, motif discovery, and modeling sequence homology [1-7]. The mathematically optimal MSA can be found using dynamic programming. However, the cost of the dynamic programming approach grows exponentially with the number of sequences, which renders it impractical for aligning more than a few sequences. For this reason, the progressive alignment scheme, which successively aligns pairs of sequences (or sequence profiles) along a phylogenetic tree of the given sequences, has gained popularity as a practical alternative [8-16]. In fact, the progressive alignment technique is surprisingly effective for closely related sequences and it yields
