sparse matrices, the memory footprint is still large for large-scale
datasets. Hence, we recommend using a computer with
a large amount of shared memory to compute MSAs of large-scale
data. In the future, we plan to design an out-of-core
pairwise probability matrix computation that stores all matrices
on disk.
3. In our program, the profile-profile progressive alignment stage
has not yet been parallelized. Hence, for large-scale data,
this stage might become a scalability bottleneck when
using multiple threads. In MSA-CUDA [15], a dynamic
scheduling scheme was proposed to parallelize the
profile-profile progressive alignment stage of ClustalW [16] on
graphics processing units. This method is also suitable for
multi-threaded parallelization on CPUs. Hence,
our future work also includes the parallelization of this stage.
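The dynamic scheduling idea in note 3 can be illustrated with a minimal sketch. It is not the MSA-CUDA or MSAProbs implementation. It assumes the guide tree is given as nested 2-tuples with string leaves, and it uses a hypothetical placeholder, align_profiles, in place of a real profile-profile alignment. An internal node of the guide tree is submitted to the thread pool as soon as both of its children are finished, so independent subtrees are aligned concurrently.

```python
from concurrent.futures import ThreadPoolExecutor, FIRST_COMPLETED, wait


def align_profiles(left, right):
    # Hypothetical placeholder for a real profile-profile alignment:
    # a "profile" here is just a tuple of sequence names.
    return left + right


def flatten(tree):
    """Number the nodes of a nested-tuple guide tree (assumed input format)."""
    children, parent, profile = {}, {}, {}
    counter = [0]

    def visit(node):
        node_id = counter[0]
        counter[0] += 1
        if isinstance(node, str):
            profile[node_id] = (node,)  # a leaf is a one-sequence profile
        else:
            left_id = visit(node[0])
            right_id = visit(node[1])
            children[node_id] = (left_id, right_id)
            parent[left_id] = parent[right_id] = node_id
        return node_id

    root = visit(tree)
    return children, parent, profile, root


def progressive_align(tree, num_threads=4):
    children, parent, profile, root = flatten(tree)
    if root in profile:  # the tree is a single sequence
        return profile[root]
    # For each internal node, count the children that are not yet finished.
    pending = {n: sum(c not in profile for c in kids)
               for n, kids in children.items()}
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        running = {}  # future -> node id

        def submit(node_id):
            left_id, right_id = children[node_id]
            future = pool.submit(align_profiles,
                                 profile[left_id], profile[right_id])
            running[future] = node_id

        # Nodes whose children are both leaves are ready immediately.
        for node_id, count in pending.items():
            if count == 0:
                submit(node_id)
        # Each finished node may make its parent ready.
        while running:
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in done:
                node_id = running.pop(future)
                profile[node_id] = future.result()
                up = parent.get(node_id)
                if up is not None:
                    pending[up] -= 1
                    if pending[up] == 0:
                        submit(up)
    return profile[root]


if __name__ == "__main__":
    print(progressive_align((("a", "b"), ("c", "d"))))
```

The sketch shows only the scheduling pattern. In CPython, threads do not speed up CPU-bound pure-Python work, so a real implementation would run the alignment kernel in native code or in a language with true multi-threading.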
References
1. Feng DF, Doolittle RF (1987) Progressive sequence alignment as a prerequisite to correct phylogenetic trees. J Mol Evol 25:351-361
2. Liu Y, Schmidt B, Maskell DL (2010) MSAProbs: multiple sequence alignment based on pair hidden Markov models and partition function posterior probabilities. Bioinformatics 26:1958-1964
3. Durbin R, Eddy SR, Krogh A, Mitchison G (1998) Biological sequence analysis: probabilistic models of proteins and nucleic acids. Cambridge University Press, Cambridge
4. Miyazawa S (1995) A reliable sequence alignment method based on probabilities of residue correspondences. Protein Eng 8:999-1009
5. Thompson JD, Koehl P, Ripp R, Poch O (2005) BAliBASE 3.0: latest developments of the multiple sequence alignment benchmark. Proteins 61:127-136
6. Edgar RC (2004) MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 32:1792-1797
7. Sievers F, Wilm A, Dineen D et al (2011) Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol Syst Biol 7:539
8. Chang JM, Di Tommaso P, Taly JF et al (2012) Accurate multiple sequence alignment of transmembrane proteins with PSI-Coffee. BMC Bioinformatics 13:S1
9. Deng X, Cheng J (2011) MSACompro: protein multiple sequence alignment using predicted secondary structure, solvent accessibility, and residue-residue contacts. BMC Bioinformatics 12:472
10. Vingron M, Argos P (1989) A fast and sensitive multiple sequence alignment algorithm. Comput Appl Biosci 5:115-121
11. Gotoh O (1990) Consistency of optimal sequence alignments. Bull Math Biol 52:509-525
12. Notredame C, Holm L, Higgins DG (1998) COFFEE: an objective function for multiple sequence alignments. Bioinformatics 14:407-422
13. Notredame C, Higgins DG, Heringa J (2000) T-coffee: a novel method for fast and accurate multiple sequence alignment. J Mol Biol 302:205-217
14. Do CB, Mahabhashyam MS, Brudno M et al (2005) ProbCons: probabilistic consistency-based multiple sequence alignment. Genome Res 15:330-340
15. Liu Y, Schmidt B, Maskell DL (2009) MSA-CUDA: multiple sequence alignment on graphics processing units with CUDA. 20th IEEE international conference on application-specific systems, architectures and processors, pp 121-128
16. Thompson JD, Higgins DG, Gibson TJ (1994) CLUSTALW: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 22:4673-4680