empirically: for example, penalties of 11 (open) and 1 (extend)
are recommended for BLOSUM62, whereas the suggested
values for PAM250 are 10 (open) and 1 (extend).
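To make the open/extend figures concrete, the sketch below computes the cost of a single gap under an affine scheme. It assumes the convention in which the opening penalty is charged once and the extension penalty is charged for every gapped residue, including the first. Some programs instead charge the extension only from the second residue onward, which lowers every value by one extension penalty. The function name and the example gap lengths are illustrative choices, not taken from the text.

```python
def affine_gap_cost(length, gap_open, gap_extend):
    """Cost of one contiguous gap of `length` residues.

    Assumed convention: cost = gap_open + gap_extend * length.
    (Other tools use gap_open + gap_extend * (length - 1).)
    """
    if length < 0:
        raise ValueError("gap length must be non-negative")
    if length == 0:
        return 0
    return gap_open + gap_extend * length


# Penalty pairs quoted in the text, keyed by substitution matrix.
PENALTIES = {"BLOSUM62": (11, 1), "PAM250": (10, 1)}

if __name__ == "__main__":
    # 150 is an arbitrary stand-in for a domain-sized gap.
    for matrix, (gap_open, gap_extend) in PENALTIES.items():
        for length in (1, 10, 150):
            cost = affine_gap_cost(length, gap_open, gap_extend)
            print(f"{matrix}: gap of {length} residues costs {cost}")
```

Under this convention, the cost grows linearly with gap length, so a gap long enough to span an entire domain is far more expensive than a short indel.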
2. Multi-domain proteins. Proteins with multiple domains can be
a particular challenge for multiple alignment methods. Global
alignment methods run into serious problems whenever the
domain order has changed during evolution in some of the query
sequences, or when domains have been inserted or deleted in
some of them. Such methods cannot handle permuted domain
orders, and their gap penalty regimes normally make it difficult
to insert long gaps spanning the length of one or more protein
domains. It is therefore advisable to align multi-domain
proteins using local multiple alignment methods. MSA tools that
are (partly) based on local alignment methods (for example
T-COFFEE [6]) are good alternatives in this kind of situation.
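The advantage of local alignment for sequences that share a domain at different positions can be illustrated with a minimal pairwise Smith-Waterman sketch. This is not the T-COFFEE procedure, and it is pairwise rather than multiple. It also assumes toy match/mismatch scores and a linear gap penalty instead of a substitution matrix with affine gaps. The sequences and the name `DOMAIN` are made up for the example.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best score, aligned part of a, aligned part of b)."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the matrix; scores are floored at 0 so an alignment can
    # start anywhere in either sequence.
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + s,   # substitution
                          H[i - 1][j] + gap,     # gap in b
                          H[i][j - 1] + gap)     # gap in a
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until a zero score is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        s = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


# Hypothetical shared domain: C-terminal in one sequence,
# N-terminal in the other, with unrelated flanking residues.
DOMAIN = "HEAGAWGHEE"
seq_a = "PPPP" + DOMAIN
seq_b = DOMAIN + "KKKK"

if __name__ == "__main__":
    score, part_a, part_b = smith_waterman(seq_a, seq_b)
    print(score)
    print(part_a)
    print(part_b)
```

Because the local score never drops below zero, the unrelated flanks are simply left out of the alignment. A global method would have to pay gap penalties to accommodate them.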
3. Repeats in protein sequences. The occurrence of repeats in many
sequences can significantly reduce the accuracy of MSA methods,
mostly because the methods are not able to deal with different
repeat copy numbers. Sammeth and Heringa have developed an MSA
method that is able to perform global MSA on protein sequences
under the constraints of a given repeat analysis [48]. This
method requires the individual repeats to be specified, which
can be done by first running one of the available repeat
detection algorithms; a repeat-aware MSA is then produced.
Although this method can markedly improve the alignment, the
result is sensitive to the accuracy of the repeat information
provided.
4. Preconceived knowledge. In a number of cases, some knowledge
about the final alignment is available beforehand. For example,
consider a protein family containing a disulfide bond between
two specific cysteine (Cys) residues. Given the structural
importance of a disulfide bond, Cys residues that form
disulfide bonds are generally conserved, so it is important that
the final MSA matches such Cys residues correctly. However,
depending on conservation patterns and the overall evolutionary
distances of the sequences, the alignment method sometimes
needs special guidance in order to match the Cys residues
correctly. The main hurdle in this type of alignment is marking
the positions of the amino acids that have to be correctly
aligned and assigning specific parameters to keep them
consistently aligned. The following suggestions are therefore
offered for (partially) resolving this type of problem:
(a) Chopping alignments. Instead of aligning whole sequences,
one can decide to chop the alignment into different parts.