In this chapter, we outline MSAProbs [2], a progressive
alignment-based algorithm for multiple protein sequence alignment,
and give practical guidance on how to use the software.
MSAProbs combines a pair hidden Markov model (pair-HMM) [3]
with a partition function [4] to calculate pairwise posterior
probabilities for sequence pairs. In addition, a weighted
probabilistic consistency transformation, weighted profile-profile
alignment, and random-split iterative refinement are incorporated
into the progressive alignment to further improve alignment
accuracy. The algorithm has also been parallelized using
multi-threading to exploit the compute power of multi-core CPUs.
Evaluated on popular benchmarks such as BAliBASE [5] and
PREFAB [6], MSAProbs has been shown to be one of the most
accurate MSA algorithms in recent studies [7-9]. Alongside its
high alignment accuracy, the algorithm has demonstrated competitive
execution speed compared with other top-performing MSA algorithms.
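The pairwise stage lends itself to multi-threading because each sequence pair can be processed independently of every other pair. The following is a minimal sketch of that task decomposition only, not of MSAProbs itself: the scoring function `toy_pair_score`, the helper `all_pairs`, and the example sequences are hypothetical stand-ins, and Python threads are used here in place of the OpenMP threads of the C++ program.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def toy_pair_score(x: str, y: str) -> float:
    """Hypothetical stand-in for one pairwise computation (NOT the MSAProbs model):
    fraction of identical residues over the shorter sequence, compared ungapped."""
    n = min(len(x), len(y))
    if n == 0:
        return 0.0
    return sum(a == b for a, b in zip(x, y)) / n


def all_pairs(seqs, score=toy_pair_score, num_threads=4):
    """Evaluate every unordered pair (i, j) with i < j on a thread pool."""
    pairs = list(combinations(range(len(seqs)), 2))
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # Each pair is an independent task, so no locking is required.
        values = pool.map(lambda p: score(seqs[p[0]], seqs[p[1]]), pairs)
        return dict(zip(pairs, values))


# Made-up example sequences, for illustration only.
seqs = ["MKTAYIAK", "MKTAYLAK", "MQTAYIA", "GKTAY"]
scores = all_pairs(seqs)
```

For N sequences there are N(N-1)/2 such tasks, so four sequences give six pairs. In CPython, threads generally do not speed up CPU-bound pure-Python code, so the sketch shows how the work divides rather than the speedup a compiled multi-threaded program can obtain.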
2 Materials
2.1 Hardware
Standard personal computers and workstations based on multi-core
CPUs.
2.2 Software
1. Program name: MSAProbs.
2. Home page: http://msaprobs.sourceforge.net.
3. Operating system: Linux or Windows.
4. Programming language: C++.
5. Parallelization: Multi-threaded using OpenMP.
3 Methods
3.1 Program Workflow
MSAProbs follows the progressive alignment pipeline, which
typically consists of three stages: computation of a pairwise
sequence distance matrix, guide-tree construction, and
profile-profile progressive alignment. MSAProbs introduces
several extensions to this basic pipeline and works in five major
stages: (1) calculating all pairwise posterior probability matrices
using both a pair-HMM and a partition function; (2) calculating a
pairwise sequence distance matrix from the pairwise posterior
probability matrices; (3) constructing a guide tree from the
pairwise sequence distance matrix; (4) performing a weighted
probabilistic consistency transformation of all pairwise posterior
probability matrices; and (5) computing a profile-profile
progressive global alignment along the guide tree using the
transformed posterior probability matrices. In addition, an optional
iterative
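Stage (4) of the workflow can be illustrated with a small sketch. The exact weighting used by MSAProbs is not given in this section, so the form below is an assumption: for sequences x and y, the transformed matrix is taken as P'_xy = [(w_x + w_y) P_xy + sum over z other than x, y of w_z P_xz P_zy] / (sum of all weights). The names `consistency_transform`, `oriented`, and `matmul`, the weights, and the toy matrices are all hypothetical.

```python
def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def oriented(post, a, b):
    """Posterior matrix with rows indexed by sequence a and columns by sequence b."""
    if a < b:
        return post[(a, b)]
    return [list(col) for col in zip(*post[(b, a)])]  # transpose the stored matrix


def consistency_transform(post, weights):
    """One round of a weighted consistency transformation (assumed form, see text)."""
    n = len(weights)
    total = sum(weights)
    new_post = {}
    for (x, y), pxy in post.items():
        # Direct evidence: the pair's own matrix, weighted by w_x + w_y.
        acc = [[(weights[x] + weights[y]) * v for v in row] for row in pxy]
        # Indirect evidence: residue i ~ k in z and k ~ j supports i ~ j.
        for z in range(n):
            if z in (x, y):
                continue
            via_z = matmul(oriented(post, x, z), oriented(post, z, y))
            for i, row in enumerate(via_z):
                for j, v in enumerate(row):
                    acc[i][j] += weights[z] * v
        new_post[(x, y)] = [[v / total for v in row] for row in acc]
    return new_post


# Toy posterior matrices for three sequences of length 2 (made-up values).
post = {
    (0, 1): [[0.9, 0.0], [0.0, 0.8]],
    (0, 2): [[0.8, 0.1], [0.0, 0.9]],
    (1, 2): [[0.7, 0.0], [0.1, 0.6]],
}
weights = [1.0, 1.0, 1.0]  # hypothetical sequence weights
transformed = consistency_transform(post, weights)
```

With equal weights this reduces to an unweighted average over intermediate sequences. The design point is that a residue pair gains support when a third sequence aligns to both residues, which is the information later used by the profile-profile alignment in stage (5).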
 