Biomedical Engineering Reference
data must be treated with caution as there are no quality control processes in
place and numerous scientists have commented regarding the quality of the
data within PubChem [6-8]. Screening data are less rigorous than those in
peer-reviewed articles and contain many false positives [9]. Deposited data are
not curated, and so mistakes in structures, identifier units, and other
characteristics can and do occur. The author of this chapter has frequently
questioned the accuracy of some of the identifiers associated with the PubChem
compounds [10-12], and an example will be given later in this chapter. The
problems arise from the quality of submissions from the various data sources.
There are thousands of errors in the structure-identifier associations due to
this contamination, and these can lead to the retrieval of incorrect chemical
structures. It is also common to have multiple representations of a single
structure because stereochemistry is incomplete or entirely missing for a
molecule [13].
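The two problems described above (one name attached to several structures, and one structure deposited several times with differing stereochemistry) can be screened for with a simple grouping step. The following is a minimal sketch, not a description of how PubChem works internally. It assumes each deposited record carries a standard InChIKey, whose first 14-character block hashes the connectivity of the molecule while the second block covers stereochemistry and isotopic layers. The record list, record ids, names, and keys are all made-up placeholders, and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical deposited records: (record_id, name, InChIKey).
# The keys are made-up placeholders, not real compounds.
RECORDS = [
    ("rec1", "compound A", "AAAAAAAAAAAAAA-BBBBBBBBSA-N"),
    ("rec2", "compound A", "AAAAAAAAAAAAAA-UHFFFAOYSA-N"),
    ("rec3", "compound A", "CCCCCCCCCCCCCC-UHFFFAOYSA-N"),
    ("rec4", "compound B", "DDDDDDDDDDDDDD-EEEEEEEESA-N"),
]


def skeleton(inchikey):
    """Return the first InChIKey block (connectivity hash only)."""
    return inchikey.split("-")[0]


def stereo_variants(records):
    """Map each skeleton to the record ids that share it but carry
    different full keys (stereo or isotopic layers differ)."""
    groups = defaultdict(set)
    for rec_id, _name, key in records:
        groups[skeleton(key)].add((rec_id, key))
    return {
        sk: sorted(rec_id for rec_id, _ in members)
        for sk, members in groups.items()
        if len({key for _, key in members}) > 1
    }


def conflicting_names(records):
    """Map each name to its skeletons when it points at more than one:
    these are candidate structure-identifier association errors."""
    by_name = defaultdict(set)
    for _rec_id, name, key in records:
        by_name[name].add(skeleton(key))
    return {n: sorted(s) for n, s in by_name.items() if len(s) > 1}


variants = stereo_variants(RECORDS)
conflicts = conflicting_names(RECORDS)
```

Here `variants` flags rec1 and rec2 as two representations of the same skeleton, and `conflicts` flags "compound A" as a name linked to two different skeletons. A flagged entry is only a candidate for manual review; the grouping cannot tell which of the conflicting structures is correct.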
22.2.2 DrugBank
DrugBank [14] blends both bioinformatics and cheminformatics data and
combines detailed drug (i.e., chemical) data with comprehensive drug target
(i.e., protein) information. The database contains >4800 drug entries and >2500
protein or drug target sequences that are linked to these drug entries. Each
DrugCard entry contains almost 100 data fields, with half of the information
being devoted to drug/chemical data and the other half devoted to drug target
or protein data. The database is fully searchable, supporting extensive text,
sequence, chemical structure, and relational query searches. DrugBank has
been used to facilitate in silico drug target discovery, drug design, drug docking
or screening, drug metabolism prediction, drug interaction prediction, and
general pharmaceutical education.
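The split of a DrugCard into drug data and linked target data can be pictured as a small relational structure. The sketch below is illustrative only and does not reproduce the DrugBank schema or any DrugBank interface: the class names, the handful of fields, the sample entries, and the query function are all assumptions chosen to show how a drug-to-target link supports a relational query.

```python
from dataclasses import dataclass, field


@dataclass
class Target:
    """Hypothetical protein-side record."""
    name: str
    sequence: str  # one-letter amino acid codes (placeholder values below)


@dataclass
class DrugCard:
    """Hypothetical drug-side record with links to its targets."""
    drug_name: str
    smiles: str  # placeholder structure strings below
    targets: list = field(default_factory=list)


def drugs_for_target(cards, target_name):
    """Relational query: names of all drugs linked to the named target."""
    return sorted(
        card.drug_name
        for card in cards
        if any(t.name == target_name for t in card.targets)
    )


# Made-up entries, not taken from DrugBank.
protein_p = Target(name="protein P", sequence="MKT")
protein_q = Target(name="protein Q", sequence="MAL")
CARDS = [
    DrugCard("drug X", "CCO", [protein_p]),
    DrugCard("drug Y", "CCN", [protein_p, protein_q]),
    DrugCard("drug Z", "CCC", [protein_q]),
]

hits = drugs_for_target(CARDS, "protein P")
```

With these placeholder entries, `hits` contains "drug X" and "drug Y". Keeping targets as separate records that several cards can reference is what allows the same query to run in either direction, from drug to target or from target to drug.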
The group hosting DrugBank also hosts a series of other curated databases:
the Human Metabolome Database [15] contains detailed information about
small-molecule metabolites found in the human body and is used by scientists
working in the areas of metabolomics, clinical chemistry, and biomarker dis-
covery; FoodDB [16] is a comprehensive database providing information on
over 1900 food components, the list being taken from the U.S. Food and
Drug Administration (FDA) list of everything added to food in the United
States. The author of this chapter has reviewed the data within DrugBank,
and while efforts have been made to curate the data, there are numerous
examples of inaccurate chemical structures associated with particular compounds
and a distinct lack of expected stereochemistry for many of the chemical
structures [13].
22.2.3 SureChem
SureChem [17] provides chemically intelligent searching of a patent database
containing millions of U.S., European, and World patents. Using extraction
heuristics to identify chemical and trade names and conversion of the extracted