The Search Process
Pursuing a solution to a molecular biology problem with bioinformatics methods invariably involves
significant backtracking, stepping, and jumping around from one database to the next. In support of
this typical work process, integrated information-retrieval systems have been created to provide a
mesh of "hard" or pre-computed links between the key online molecular biology databases. By far
the most popular of these integrated systems is Entrez, from the National Center for Biotechnology
Information (NCBI), which includes many of the key molecular biology databases, listed in Table 4-1.
Table 4-1. Databases Included in the Entrez System.

| Database | Description |
| --- | --- |
| PubMed | Biomedical literature. |
| Protein | Protein sequences from the Protein Information Resource (PIR), SWISS-PROT, Protein Research Foundation (PRF), and Protein Data Bank (PDB), and from the translated coding regions of DNA sequences in GenBank, the European Molecular Biology Laboratory (EMBL), and the DNA Database of Japan (DDBJ). |
| Nucleotide | Nucleotide sequence data from GenBank, EMBL, DDBJ, and the Genome Sequence Data Base (GSDB), and patent sequences from the U.S. Patent and Trademark Office (USPTO) and other international patent offices. |
| Structure | Experimental data from crystallographic and NMR structure determinations obtained from the Protein Data Bank (PDB). |
| Genome | Views of genomes, chromosomes, contiged sequence maps, and integrated genetic and physical maps. |
| PopSet | Aligned nucleotide and protein sequence data submitted as a set resulting from a population, phylogenetic, or mutation study. |
| OMIM | Human genes and genetic disorders. |
| Taxonomy | Names of all organisms represented in NCBI's genetic databases. |
| Books | A collection of biomedical topics. |
| ProbeSet | Gene expression and hybridization array data from the Gene Expression Omnibus (GEO). |
| 3D Domains | Protein domains from NCBI's Conserved Domain Database. |

The Entrez system supports both inter- and intra-database linking. For example, there are links
between PubMed and the Nucleotide database, and between protein sequences and the nucleotide
sequences from which they were translated (see Figure 4-2). In addition, BLAST-computed links
connect similar sequences within the Nucleotide database.
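
One way to picture this mesh of pre-computed links is as a lookup table keyed by record. The following is a minimal sketch of that idea, not a description of how Entrez is implemented: the `LinkMesh` class and the record identifiers (`PM1`, `N1`, `N2`, `P1`) are hypothetical and invented for illustration only.

```python
from collections import defaultdict


class LinkMesh:
    """Toy store of pre-computed ("hard") links between database records."""

    def __init__(self):
        # Maps (database, record_id) -> set of linked (database, record_id) pairs.
        self._links = defaultdict(set)

    def add_link(self, source, target):
        # Store the link in both directions so it can be followed either way.
        self._links[source].add(target)
        self._links[target].add(source)

    def neighbors(self, record, database=None):
        """Return linked records, optionally restricted to one target database."""
        linked = self._links.get(record, set())
        if database is not None:
            linked = {r for r in linked if r[0] == database}
        return sorted(linked)


mesh = LinkMesh()
# Hypothetical identifiers; these are not real accession numbers.
# Inter-database links: literature <-> nucleotide, nucleotide <-> protein.
mesh.add_link(("PubMed", "PM1"), ("Nucleotide", "N1"))
mesh.add_link(("Nucleotide", "N1"), ("Protein", "P1"))
# Intra-database link: two similar sequences within the Nucleotide database.
mesh.add_link(("Nucleotide", "N1"), ("Nucleotide", "N2"))

# All records reachable in one step from N1, across databases.
print(mesh.neighbors(("Nucleotide", "N1")))
# Only the intra-database (similar-sequence) links for N1.
print(mesh.neighbors(("Nucleotide", "N1"), database="Nucleotide"))
```

Because every link is stored ahead of time, following one is a dictionary lookup rather than a fresh search, which is the practical appeal of "hard" links when jumping from one database to the next.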
Figure 4-2. Entrez Database Integration. Entrez is a link-integrated search
system for accessing a growing number of linked molecular biology
databases. In addition to the major databases shown here, Entrez includes
PopSet, ProbeSet, and 3D Domains.
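
The links between Entrez databases can also be requested programmatically. The sketch below assumes NCBI's E-utilities web interface, which this text does not itself describe, and only builds the request URL for its ELink service without sending it. The helper name `elink_url` and the identifier `12345` are placeholders chosen for illustration; `nuccore` is the E-utilities name for the Nucleotide database.

```python
from urllib.parse import urlencode

# Base address of NCBI's E-utilities (an assumption: not mentioned in the text above).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def elink_url(dbfrom, db, record_id):
    """Build an ELink URL asking for records in `db` linked to `record_id` in `dbfrom`."""
    params = {"dbfrom": dbfrom, "db": db, "id": record_id}
    return f"{EUTILS_BASE}/elink.fcgi?{urlencode(params)}"


# "12345" is a placeholder identifier, not a real record.
# Inter-database request: nucleotide records linked to a protein record.
print(elink_url("protein", "nuccore", "12345"))
```

Fetching the resulting URL (for example with `urllib.request.urlopen`) should return the linked record identifiers; real use should follow NCBI's usage guidelines for request rates.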
 
 