Macromolecular Structure Databases
Eric W. SAYERS and Stephen H. BRYANT
National Center for Biotechnology Information, National Library of Medicine
National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894 USA
Abstract. The resources provided by NCBI for studying the three-dimensional
(3D) structures of proteins center around two databases: the Molecular Modeling
Database (MMDB), which provides structural information about individual
proteins; and the Conserved Domain Database (CDD), which provides a directory
of sequence and structure alignments representing conserved functional domains
(CDs) within proteins. Together, these two databases allow scientists to retrieve
and view structures, find proteins structurally similar to a protein of interest, and
identify conserved functional sites. To enable scientists to accomplish these
tasks, NCBI has integrated MMDB and CDD into the Entrez retrieval system. In
addition, structures can be found by BLAST, because sequences derived from
MMDB structures have been included in the BLAST databases. Once a protein
structure has been identified, the domains within the protein, as well as domain
“neighbors” (i.e., those with similar structure), can be found. For novel data not
yet included in Entrez, there are separate search services available. Protein
structures can be visualized using Cn3D, an interactive 3D graphic modeling tool.
Details of the structure, such as ligand-binding sites, can be scrutinized and
highlighted. Cn3D can also display multiple sequence alignments based on
sequence and/or structural similarity among related sequences, 3D domains, or
members of a CDD family. Cn3D images and alignments can be manipulated
easily and exported to other applications for presentation or further analysis.
1. Overview
The Structure homepage¹ (Figure 1) contains links to the more specialized pages for each
of the main tools and databases, introduced below, as well as search facilities for the
Molecular Modeling Database (MMDB) [1]. MMDB² is based on the structures within
the Protein Data Bank (PDB) and can be queried using the Entrez search engine, as well
as via the more direct but less flexible Structure Summary search (see Figure 1). Once
found, any structure of interest can be viewed using Cn3D³ [2], a piece of software that
can be freely downloaded for Mac, PC, and UNIX platforms.
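As a rough illustration of querying MMDB through Entrez programmatically, the sketch below builds a search request against the Entrez "structure" database and extracts the matching record identifiers from the reply. The text does not name a programmatic interface; the example assumes NCBI's public E-utilities ESearch endpoint and its JSON output format, and the function names (build_structure_query, parse_ids, search_structures) are hypothetical helpers introduced only for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the public E-utilities ESearch endpoint (not named in the text).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_structure_query(term, retmax=5):
    """Return an ESearch URL that searches the Entrez 'structure' database."""
    params = {
        "db": "structure",   # Entrez database backed by MMDB
        "term": term,        # free-text Entrez query
        "retmax": retmax,    # maximum number of identifiers to return
        "retmode": "json",   # request a JSON reply instead of XML
    }
    return ESEARCH_URL + "?" + urlencode(params)


def parse_ids(payload):
    """Extract the list of record identifiers from an ESearch JSON reply."""
    return json.loads(payload)["esearchresult"]["idlist"]


def search_structures(term, retmax=5):
    """Send the query over the network and return the matching identifiers."""
    with urlopen(build_structure_query(term, retmax)) as response:
        return parse_ids(response.read().decode("utf-8"))
```

Calling search_structures("hemoglobin") would return a short list of identifier strings, which could then be used to retrieve the corresponding structure summaries. URL construction and reply parsing are kept in separate functions so that each can be checked without network access.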
Often used in conjunction with Cn3D is the Vector Alignment Search Tool
(VAST) [3, 4]. VAST⁴ is used to precompute “structure neighbors” or structures similar
1 [http://www.ncbi.nlm.nih.gov/Structure]
2 [http://www.ncbi.nlm.nih.gov/Structure/MMDB/mmdb.shtml]
3 [http://www.ncbi.nlm.nih.gov/Structure/CN3D/cn3d.shtml]
4 [http://www.ncbi.nlm.nih.gov/Structure/VAST/vast.shtml]