Multiple Protein Sequence Alignment with MSAProbs - Multiple Sequence Alignment Methods


sparse matrices, the memory footprint is still large for large-scale datasets. Hence, we recommend using a computer with a large amount of shared memory to compute MSAs of large-scale datasets. In the future, we plan to design an out-of-core pairwise probability matrix computation that stores all matrices on disk.

3. In our program, the profile-profile progressive alignment stage has not yet been parallelized. Hence, for large-scale data, this stage might become a scalability bottleneck when using multiple threads. In MSA-CUDA [15], a dynamic scheduling scheme has been proposed to parallelize the profile-profile progressive alignment stage of ClustalW [16] on graphics processing units. This method is also suitable for multi-threaded parallelization on CPUs. Hence, our future work also includes the parallelization of this stage.
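The out-of-core storage planned in the note on memory footprint could look roughly like the sketch below. It is only an illustration under stated assumptions, not the MSAProbs design: the class name `DiskMatrixStore`, the one-file-per-pair layout, and the dictionary representation of a sparse matrix are all hypothetical choices made here for brevity.

```python
import os
import pickle
import tempfile


class DiskMatrixStore:
    """Hypothetical helper: keeps one sparse pairwise matrix per file on disk."""

    def __init__(self, directory):
        self.directory = directory
        os.makedirs(directory, exist_ok=True)

    def _path(self, i, j):
        # Each unordered sequence pair is stored once, so require i < j.
        if not i < j:
            raise ValueError("expected sequence indices with i < j")
        return os.path.join(self.directory, f"pair_{i}_{j}.pkl")

    def save(self, i, j, sparse_matrix):
        # sparse_matrix maps (row, col) -> posterior probability;
        # cells that were pruned as negligible are simply absent.
        with open(self._path(i, j), "wb") as handle:
            pickle.dump(sparse_matrix, handle, protocol=pickle.HIGHEST_PROTOCOL)

    def load(self, i, j):
        # Only the requested matrix is read back into memory.
        with open(self._path(i, j), "rb") as handle:
            return pickle.load(handle)


def demo():
    # Write two toy matrices to a temporary directory and read them back.
    with tempfile.TemporaryDirectory() as tmp:
        store = DiskMatrixStore(tmp)
        store.save(0, 1, {(0, 0): 0.9, (1, 2): 0.4})
        store.save(0, 2, {(3, 1): 0.7})
        return store.load(0, 1), store.load(0, 2), sorted(os.listdir(tmp))
```

With a layout like this, peak memory is bounded by the matrices actively in use rather than by all pairs at once, at the cost of extra disk I/O whenever a matrix is needed again.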
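To make the idea of dynamically scheduling the progressive alignment stage more concrete, here is a minimal sketch of a dependency-driven scheduler over a guide tree. It is a generic illustration, not the scheme used in MSA-CUDA or any code from MSAProbs: the guide tree is assumed to be a nested tuple of sequence names, and `align_profiles` is a hypothetical placeholder that merely concatenates member lists where a real profile-profile alignment would go.

```python
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED


def align_profiles(left, right):
    # Placeholder for a real profile-profile alignment (assumption).
    return left + right


def flatten(tree):
    """Return nodes in post-order as (node_id, left_id, right_id, leaf_name)."""
    nodes = []

    def visit(node):
        if isinstance(node, str):
            nodes.append((len(nodes), None, None, node))
        else:
            left_id = visit(node[0])
            right_id = visit(node[1])
            nodes.append((len(nodes), left_id, right_id, None))
        return nodes[-1][0]

    visit(tree)
    return nodes


def progressive_align(tree, workers=4):
    nodes = flatten(tree)
    profiles = {}   # node_id -> finished profile
    children = {}   # internal node_id -> (left_id, right_id)
    parent_of = {}  # child_id -> parent_id
    waiting = {}    # internal node_id -> number of unfinished children
    for node_id, left_id, right_id, leaf in nodes:
        if leaf is not None:
            profiles[node_id] = [leaf]
        else:
            children[node_id] = (left_id, right_id)
            parent_of[left_id] = node_id
            parent_of[right_id] = node_id
            # Leaves are already finished; only internal children are pending.
            waiting[node_id] = sum(c not in profiles for c in (left_id, right_id))

    with ThreadPoolExecutor(max_workers=workers) as pool:
        def submit(node_id):
            left_id, right_id = children[node_id]
            return pool.submit(align_profiles, profiles[left_id], profiles[right_id])

        # Start every node whose two children are both leaves.
        running = {submit(n): n for n, count in waiting.items() if count == 0}
        while running:
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in done:
                node_id = running.pop(future)
                profiles[node_id] = future.result()
                parent = parent_of.get(node_id)
                if parent is not None:
                    waiting[parent] -= 1
                    if waiting[parent] == 0:
                        # Both children are done, so the parent becomes ready.
                        running[submit(parent)] = parent
    return profiles[nodes[-1][0]]
```

For example, `progressive_align((("A", "B"), ("C", "D")))` can align the `("A", "B")` and `("C", "D")` subtrees concurrently before merging them at the root. In CPython, threads would not speed up CPU-bound pure-Python work, so this sketch shows only the scheduling logic; the available parallelism is in any case limited by the shape of the guide tree.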

References

1. Feng DF, Doolittle RF (1987) Progressive sequence alignment as a prerequisite to correct phylogenetic trees. J Mol Evol 25:351-361

2. Liu Y, Schmidt B, Maskell DL (2010) MSAProbs: multiple sequence alignment based on pair hidden Markov models and partition function posterior probabilities. Bioinformatics 26:1958-1964

3. Durbin R, Eddy SR, Krogh A, Mitchison G (1998) Biological sequence analysis: probabilistic models of proteins and nucleic acids. Cambridge University Press, Cambridge

4. Miyazawa S (1995) A reliable sequence alignment method based on probabilities of residue correspondences. Protein Eng 8:999-1009

5. Thompson JD, Koehl P, Ripp R, Poch O (2005) BAliBASE 3.0: latest developments of the multiple sequence alignment benchmark. Proteins 61:127-136

6. Edgar RC (2004) MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 32:1792-1797

7. Sievers F, Wilm A, Dineen D et al (2011) Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol Syst Biol 7:539

8. Chang JM, Di Tommaso P, Taly JF et al (2012) Accurate multiple sequence alignment of transmembrane proteins with PSI-Coffee. BMC Bioinformatics 13:S1

9. Deng X, Cheng J (2011) MSACompro: protein multiple sequence alignment using predicted secondary structure, solvent accessibility, and residue-residue contacts. BMC Bioinformatics 12:472

10. Vingron M, Argos P (1989) A fast and sensitive multiple sequence alignment algorithm. Comput Appl Biosci 5:115-121

11. Gotoh O (1990) Consistency of optimal sequence alignments. Bull Math Biol 52:509-525

12. Notredame C, Holm L, Higgins DG (1998) COFFEE: an objective function for multiple sequence alignments. Bioinformatics 14:407-422

13. Notredame C, Higgins DG, Heringa J (2000) T-coffee: a novel method for fast and accurate multiple sequence alignment. J Mol Biol 302:205-217

14. Do CB, Mahabhashyam MS, Brudno M et al (2005) ProbCons: probabilistic consistency-based multiple sequence alignment. Genome Res 15:330-340

15. Liu Y, Schmidt B, Maskell DL (2009) MSA-CUDA: multiple sequence alignment on graphics processing units with CUDA. 20th IEEE international conference on application-specific systems, architectures and processors, pp 121-128

16. Thompson JD, Higgins DG, Gibson TJ (1994) CLUSTALW: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 22:4673-4680
