Homology Modeling: Generating Structural Models to Understand Protein Function and Mechanism - Computational Modeling of Biological Systems


Table 1 Some representative methods for the different steps involved in the construction of a comparative structural model

Procedure                                          Server
Identify homologous sequences                      BLAST [22], PSI-BLAST [23]
Protein family classifications                     Pfam [21], InterPro [20]
Profile-based                                      HMMER [25], HHSearch [27], SAM [26]
Threading + profile-based                          FUGUE (based on structure profile created by HOMSTRAD) [32], PROSPECT [33], SPARKS2 [34], and SP3 [35]
Profile-based + secondary structure prediction     PPA [76]
Meta servers                                       TASSER [77], I-TASSER [40], Bioinfobank [39]
Stereochemical quality control                     Gaia [62], WHAT IF [61], PROCHECK [65], MolProbity [64]
Estimating model quality                           Qmean [55], QmeanClust [60]

machine learning approaches can be used to delineate domain boundaries in a protein sequence and even identify the potential function of the identified domains. InterProScan [20] and Pfam [21] are two online resources that one can use to find the domains present in the query sequence. For some multidomain query sequences, one may be able to find structural templates with a similar domain architecture, which is the ideal scenario; for others, one may have to model individual domains separately and look for experimental constraints to model domain-domain orientations.
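As a minimal sketch of the "model individual domains separately" case, the snippet below cuts a query sequence into per-domain segments that could each be submitted to a template search on their own. The `Domain` record, the `split_by_domains` helper, and the example boundaries are hypothetical illustrations; they are not output formats of Pfam or InterProScan, and real boundaries would come from one of those resources.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    """Hypothetical domain annotation with 1-based, inclusive residue numbers."""
    name: str
    start: int
    end: int


def split_by_domains(sequence, domains):
    """Return a dict mapping each domain name to its sub-sequence.

    Boundaries are assumed to be supplied by the user (for example, read
    off a domain annotation); nothing here predicts them.
    """
    segments = {}
    for dom in sorted(domains, key=lambda d: d.start):
        if not (1 <= dom.start <= dom.end <= len(sequence)):
            raise ValueError(f"domain {dom.name!r} lies outside the sequence")
        # Convert 1-based inclusive coordinates to a Python slice.
        segments[dom.name] = sequence[dom.start - 1:dom.end]
    return segments


if __name__ == "__main__":
    # Toy 20-residue sequence with two made-up domains.
    query = "MKTAYIAKQRQISFVKSHFS"
    annotated = [Domain("N_term", 1, 8), Domain("C_term", 11, 20)]
    for name, segment in split_by_domains(query, annotated).items():
        print(name, segment)
```

Each segment can then be treated as an independent query; the relative orientation of the resulting domain models still has to come from a multidomain template or from experimental constraints, as noted above.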

2.2 Direct Sequence Homology: BLAST and PSI-BLAST

BLAST (Basic Local Alignment Search Tool) [22] is a powerful and efficient tool for discovering the evolutionary connections of a given protein sequence. Given a protein sequence of interest, a researcher will typically first use BLAST to search for homologs in the available sequence databases to uncover functional and evolutionary details of the protein. In the context of comparative modeling, BLAST helps identify the structural template on which to base the structural model for a given sequence. When using protein BLAST (http://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE=Proteins), one can specify the sequence databases to be searched; for comparative modeling, one usually chooses the PDB. In the context of BLAST, the PDB sequence database contains all the sequences that have an associated experimental structure. The match between a BLAST “hit” and a given sequence is described by three parameters: similarity, coverage, and expect value (E-value). All three parameters are important in selecting the best template for a given sequence. A minimum of 30% similarity between query and template is essential for an unambiguous alignment that can be used for generating a homology model. For each domain, at least 70% sequence coverage
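The three selection parameters named in this section can be combined into a simple filter, sketched below. The `Hit` record, `percent_coverage`, and `select_templates` are hypothetical helpers, not part of BLAST or any library. The 30% similarity and 70% coverage defaults are the figures quoted in the text; the E-value cutoff of 1e-5 and the ranking order are assumptions chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """Hypothetical summary of one BLAST hit against the PDB."""
    template_id: str
    similarity: float  # percent similarity over the aligned region
    coverage: float    # percent of the query (or domain) that is aligned
    evalue: float      # expect value reported for the hit


def percent_coverage(query_start, query_end, query_length):
    """Percent of the query spanned by the alignment (1-based, inclusive)."""
    return 100.0 * (query_end - query_start + 1) / query_length


def select_templates(hits, min_similarity=30.0, min_coverage=70.0,
                     max_evalue=1e-5):
    """Keep hits that pass all three thresholds, best candidates first.

    max_evalue is an assumed cutoff, not a value taken from the text.
    """
    kept = [
        h for h in hits
        if h.similarity >= min_similarity
        and h.coverage >= min_coverage
        and h.evalue <= max_evalue
    ]
    # Rank by lowest E-value, then highest similarity, then highest coverage.
    return sorted(kept, key=lambda h: (h.evalue, -h.similarity, -h.coverage))


if __name__ == "__main__":
    # Made-up hits; the identifiers are placeholders, not real PDB entries.
    hits = [
        Hit("tmplA", 45.0, percent_coverage(5, 190, 200), 1e-40),
        Hit("tmplB", 28.0, 95.0, 1e-12),  # fails the similarity threshold
        Hit("tmplC", 60.0, 40.0, 1e-30),  # fails the coverage threshold
    ]
    for hit in select_templates(hits):
        print(hit.template_id, hit.similarity, hit.coverage, hit.evalue)
```

Applying all three thresholds together reflects the point that no single parameter is sufficient: a highly similar hit that covers only a small part of the domain, or a long alignment with weak similarity, is rejected by this sketch.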
