t_coffee -other_pg strike -aln sh3.aln -template_file sh3.template
3. Multiple structure accuracy evaluation: iRMSD.
The most accurate scoring system provided by T-Coffee is the
iRMSD, which delivers a quantifiable and comparable score
using 3D structural information. The iRMSD [18, 19] is an
RMSD-like measure that does not depend on structure
superposition and is therefore not biased by the superposition
procedure. It is calculated from intramolecular distance
matrices (one for each sequence with an available 3D
structure), with distances measured between residues that the
MSA defines as equivalent. Its limitation is the need for a
closely related structure for each sequence in the dataset;
however, the user is free to choose the identity threshold used
to identify homologous templates. The iRMSD is run using the
following command:
t_coffee -other_pg irmsd -aln sh3.aln -template_file sh3.template
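The idea of comparing intramolecular distances instead of superposed coordinates can be illustrated with a short Python sketch. It is a simplified, dRMSD-style calculation for two structures only, not the T-Coffee implementation: the function name, the input format (one coordinate per residue plus a list of aligned index pairs), and the choice to include every pair of aligned residues are all assumptions made for illustration, and the published iRMSD may select and weight residue pairs differently.

```python
import math
from itertools import combinations


def distance_matrix_rmsd(coords_a, coords_b, equivalences):
    """RMS difference between intramolecular distances of two structures.

    coords_a, coords_b: lists of (x, y, z) tuples, one per residue.
    equivalences: list of (i, j) pairs meaning residue i of structure A
    is aligned to residue j of structure B (hypothetical input format).
    """
    squared_diffs = []
    # Each pair of aligned columns yields one distance inside A and one inside B.
    for (i, j), (k, l) in combinations(equivalences, 2):
        d_a = math.dist(coords_a[i], coords_a[k])
        d_b = math.dist(coords_b[j], coords_b[l])
        squared_diffs.append((d_a - d_b) ** 2)
    if not squared_diffs:
        raise ValueError("need at least two aligned residue pairs")
    return math.sqrt(sum(squared_diffs) / len(squared_diffs))


# Structure B is structure A rotated by 90 degrees and translated.
structure_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)]
structure_b = [(10.0, 5.0, 1.0), (10.0, 8.8, 1.0), (6.2, 8.8, 1.0)]
aligned = [(0, 0), (1, 1), (2, 2)]

# No superposition is performed, yet the score is very close to 0.
print(distance_matrix_rmsd(structure_a, structure_b, aligned))
```

Because only distances within each structure are compared, rotating or translating one structure leaves the score unchanged, which is why no superposition step is needed.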
The template file explicitly associates each query sequence
with a structure file. It is, for instance, generated
automatically when running Expresso. The iRMSD output gives
two scores: the iRMSD score described above and the NiRMSD,
which is the iRMSD score normalized by the length of the
sequences.
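Since the evaluation needs a structure for every sequence, it can help to check the template file before running it. The Python sketch below assumes that each template line has the form ">sequence_name _P_ structure_id"; this layout, the function names, and the example identifiers are assumptions for illustration and should be checked against a template file actually produced by Expresso.

```python
def parse_template_file(text):
    """Map each sequence name to its structure identifier.

    Assumes lines shaped like '>seq_name _P_ structure_id' (an assumed
    layout); lines that do not match are skipped.
    """
    templates = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0].startswith(">") and fields[1] == "_P_":
            templates[fields[0][1:]] = fields[2]
    return templates


def sequences_without_template(sequence_names, templates):
    """Return the sequence names that have no associated structure."""
    return [name for name in sequence_names if name not in templates]


# Hypothetical template content and sequence names.
example = ">seqA _P_ 1abcA\n>seqB _P_ 2xyzB\n"
templates = parse_template_file(example)
print(sequences_without_template(["seqA", "seqB", "seqC"], templates))
```

Any name reported by the second function is a sequence for which no structure has been associated in the template file.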
4 Notes
1. Installation of the T-Coffee package has been heavily tested
on Mac OS X and on different Linux distributions; however,
installation problems can still occur, especially with beta
versions of T-Coffee. In such cases, do not hesitate to contact
the T-Coffee developers (tcoffee@googlegroups.com).
2. The T-Coffee package is designed to address biological
problems and, because of the versatility of biological data,
problems or limitations can occur. For this reason, the T-Coffee
developers can always be contacted about any problem you
encounter while using T-Coffee (tcoffee@googlegroups.com).
3. T-Coffee alignments depend on the T-Coffee mode and on the
complexity and size of your dataset, and can thus be quite
expensive in terms of memory and computation. No T-Coffee mode
should be used for datasets containing more than 1,000
sequences, and structural modes should not be used with more
than 200 sequences. These limits are of course empirical and
are no more than an indication.
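The size guideline in the note above can be checked before launching an alignment. The following Python sketch assumes the input sequences are in FASTA format (one ">" header line per sequence); the function and constant names are hypothetical, and the limits are simply the indicative values quoted in the note.

```python
MAX_SEQUENCES = 1000            # indicative limit for any T-Coffee mode
MAX_SEQUENCES_STRUCTURAL = 200  # indicative limit for structural modes


def count_fasta_sequences(fasta_text):
    """Count sequences by counting FASTA header lines (assumes FASTA input)."""
    return sum(1 for line in fasta_text.splitlines() if line.startswith(">"))


def within_guideline(n_sequences, structural=False):
    """Return True if the dataset size respects the indicative limit."""
    limit = MAX_SEQUENCES_STRUCTURAL if structural else MAX_SEQUENCES
    return n_sequences <= limit


# Hypothetical two-sequence dataset.
fasta = ">seqA\nMKVL\n>seqB\nMKIL\n"
n = count_fasta_sequences(fasta)
print(n, within_guideline(n), within_guideline(n, structural=True))
```

Since the limits are empirical, a failed check is better treated as a warning about likely memory and run-time cost than as a hard error.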