Biomedical Engineering Reference
Table 1 Some representative methods for the different steps involved in the construction of a
comparative structural model

Procedure | Server
Identify homologous sequences | BLAST [22], PSI-BLAST [23]
Protein family classifications | Pfam [21], InterPro [20]
Profile-based | HMMER [25], HHSearch [27], SAM [26]
Threading + profile-based | FUGUE (based on structure profile created by HOMSTRAD) [32], PROSPECT [33], SPARKS2 [34] and SP3 [35]
Profile-based + secondary structure prediction | PPA [76]
Meta servers | TASSER [77], I-TASSER [40], Bioinfobank [39]
Stereochemical quality control | Gaia [62], WHAT IF [61], PROCHECK [65], MolProbity [64]
Estimating model quality | Qmean [55], QmeanClust [60]

machine learning approaches can be used to delineate domain boundaries in a
protein sequence and even identify the potential function of the identified domains.
InterProScan [20] and Pfam [21] are two online resources that one can use to find
the domains present in the query sequence. For some multidomain query sequences,
one may be able to find structural templates with a similar domain architecture,
which is the ideal scenario; for others, one may have to model the individual
domains separately and look for experimental constraints to model the
domain-domain orientations.
2.2 Direct Sequence Homology: BLAST and PSI-BLAST
BLAST (Basic Local Alignment Search Tool) [22] is a powerful and efficient tool
for discovering the evolutionary connections of a given protein sequence. Given a
protein sequence of interest, a researcher will typically first use BLAST to search
for homologs in the available sequence databases to uncover the functional and
evolutionary details of that sequence. In the context of comparative modeling,
BLAST helps identify the structural template on which to base the structural model
for a given sequence. When using protein BLAST
( http://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE=Proteins ), one can specify the
sequence databases to be searched; for comparative modeling, one usually chooses
the PDB. In the context of BLAST, the PDB sequence database contains all the
sequences that have an associated experimental structure. The match between a
BLAST "hit" and a given sequence is described by three parameters: similarity,
coverage, and expect value (E-value). All three parameters are important in
selecting the best template for a given sequence. A minimum of 30% similarity
between query and template is essential for an unambiguous alignment that can be
used for generating a homology model. For each domain, at least 70% sequence coverage
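
The template-selection criteria described in this section can be illustrated with a minimal sketch. It assumes the BLAST hits have already been parsed into simple records. The `Hit` class, its field names, the example template identifiers, and the E-value cutoff of 1e-3 are all hypothetical choices made for illustration and are not part of BLAST or of the text. Only the 30% similarity and 70% coverage thresholds come from the paragraph above.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One parsed BLAST hit against the PDB (hypothetical record, not a BLAST type)."""
    template_id: str
    percent_similarity: float  # percentage of similar residues in the alignment
    query_start: int           # first aligned query residue (1-based, inclusive)
    query_end: int             # last aligned query residue (1-based, inclusive)
    e_value: float


def coverage(hit, query_length):
    """Percentage of the query (or domain) spanned by the aligned region."""
    return 100.0 * (hit.query_end - hit.query_start + 1) / query_length


def select_templates(hits, query_length,
                     min_similarity=30.0,  # threshold quoted in the text
                     min_coverage=70.0,    # threshold quoted in the text
                     max_e_value=1e-3):    # assumed cutoff, not from the text
    """Keep hits that pass all three criteria, most significant (lowest E-value) first."""
    passing = [
        h for h in hits
        if h.percent_similarity >= min_similarity
        and coverage(h, query_length) >= min_coverage
        and h.e_value <= max_e_value
    ]
    return sorted(passing, key=lambda h: h.e_value)


# Made-up hits for a 200-residue query domain.
hits = [
    Hit("tmplA", 45.0, 1, 180, 1e-40),   # passes all three criteria
    Hit("tmplB", 28.0, 1, 200, 1e-10),   # fails: similarity below 30%
    Hit("tmplC", 60.0, 50, 120, 1e-20),  # fails: coverage is only 35.5%
    Hit("tmplD", 35.0, 10, 190, 1e-5),   # passes all three criteria
]

print([h.template_id for h in select_templates(hits, 200)])
```

Ranking the surviving hits by E-value is only one possible ordering. In practice one might also weigh structure resolution or bound ligands, which this sketch does not attempt.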
 