neighborhood relationships in chemical space [40]. One of the strategies proposed to
minimize such dependence is to use multiple representations [43-45].
Several multidimensional data mining tools and visualization techniques are used
in drug discovery projects to analyze SAR, perform structural analysis, and support
other applications. Examples include principal moments of inertia plots [46] and
multifusion similarity maps [41], which have been used widely [27,31,47,48]. Additional approaches
are hierarchical clustering, decision trees, multidimensional scaling, genetic algo-
rithms, neural networks, and support vector machines. These and other techniques
are reviewed elsewhere [27,42,49-51].
Two commonly used approaches to represent chemical spaces are self-organizing
maps [52] and principal components analysis (PCA) [53]. Figure 10.1 illustrates an
application of PCA to generate a visual representation of the ADME-related chemical
space or “ADMET space” of six compound collections: namely, approved drugs, nat-
ural products from TCM and ZINC [54], a large collection of in-house combinatorial
libraries, commercial vendor compounds, and a generally diverse collection obtained
from the National Cancer Institute database [55]. To generate the plots in Figure 10.1,
24 ADME-related properties computed using the program QikProp (QikProp, version
3.4, Schrödinger LLC, New York, 2011) were subjected to PCA. The first two principal
components account for 74.9% of the variance. Figure 10.1a shows the ADME
space of the combinatorial libraries; the other collections are in the background for
reference. Figure 10.1b compares the ADME space of the combinatorial libraries
and drugs, showing that some of the combinatorial libraries occupy the same space as
the drugs, while other compounds cover neglected regions of the drug-ADME space.
In contrast, diverse compounds in the National Cancer Institute database, commercial
vendor molecules, and natural products from ZINC occupy the same area as
drugs (Figure 10.1c to e). This observation is not surprising, since compounds in
these last three collections are typically selected to be “drug-like,” or at least similar
to currently known drugs. Figure 10.1f compares the ADME space of the
combinatorial library compounds, drugs, and natural products from TCM. Interestingly,
TCM covers a vast region of this property space, including areas not explored
by drugs or the combinatorial libraries. In general, the combinatorial libraries cover
a large area shared with drugs and TCM.
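
The kind of projection behind a plot such as Figure 10.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used for the figure: the property table is synthetic random data standing in for a compounds-by-24-properties matrix, the function name pca_scores is hypothetical, and autoscaling each property before PCA is an assumed preprocessing choice that the text does not specify.

```python
import numpy as np


def pca_scores(X, n_components=2):
    """Autoscale the columns of X and project onto the leading principal components.

    Returns (scores, explained_variance_ratio) for the first n_components.
    """
    X = np.asarray(X, dtype=float)
    # Autoscale (assumed preprocessing): zero mean and unit variance per property,
    # so that properties measured on different scales contribute comparably.
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # constant columns simply stay at zero after centering
    Z = (X - X.mean(axis=0)) / std

    # SVD of the scaled matrix: the rows of Vt are the principal axes, and the
    # squared singular values are proportional to the variance along each axis.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    scores = Z @ Vt[:n_components].T
    return scores, explained[:n_components]


# Synthetic stand-in for a (compounds x 24 properties) table; NOT real ADME data.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 3))
mixing = rng.normal(size=(3, 24))
properties = latent @ mixing + 0.3 * rng.normal(size=(500, 24))

scores, ratio = pca_scores(properties)
print(scores.shape)
print(f"variance captured by PC1 + PC2: {ratio.sum():.1%}")
```

To compare several collections in one property space, as in the figure, one plausible approach is to pool all compounds, fit the components once, and then plot each collection's scores in the shared PC1/PC2 plane with a different color.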
10.4 CHEMOINFORMATIC-BASED ANALYSIS OF LIBRARIES USING
DIFFERENT REPRESENTATIONS
Molecular representation is at the core of chemoinformatic applications. There are two
major types of representation: graphs and descriptor vectors [37,56]. Graph methods
are used to perform structural and substructural analyses. These approaches are
straightforward to interpret and enable easy communication with medicinal chemists
and biologists, as illustrated elsewhere in the chapter. Representation using descriptor
vectors is commonly used in chemoinformatics for database processing, clustering,
similarity searching, and developing descriptive and predictive models of SAR: for
example, QSPR/QSAR models and activity landscape models [37]. Currently, more