Biomedical Engineering Reference
determined systematically on the basis of the matched molecular pair (MMP)
formalism [27]. An MMP is defined as a pair of compounds that are distinguished only by
the exchange of a single substructure (which might, for example, be an R-group or a
ring). Following this new landscape idea, all compounds in a data set that only differ
by a single substructure at a specific site are classified as a matching molecular series
(MMS). The BMMSG network is bipartite, that is, it contains two different types
of nodes. There are molecule nodes representing individual compounds (colored by
potency in analogy to NSGs) and set nodes (white) representing the invariant
substructure of an MMS. Molecule nodes are connected to a set node by an edge if they
belong to the MMS represented by the set node. Arrays of molecule nodes connected
to a set node can be represented as “supernodes” such that individual compounds
are indicated as colored squares within a large set node square (Figure 16.1), which
reduces the complexity of the graph. Characteristic node patterns emerging from a
BMMSG include, for example, SAR hotspots, which are supernodes displaying a
clear potency progression. Molecule and set nodes in BMMSGs are associated with
compound and invariant fragment structures, respectively, and edges with exchanged
substructures, which further aids in the chemical interpretation of the network. A
characteristic feature of BMMSGs is that compound data sets typically yield series
of disjoint subgraphs containing compound subsets with substructure relationships
that are distinct from others. Given its focus on structural interpretability, the BMMSG
data structure has also been adopted to rationalize the structural basis of mechanism
hopping in analogy to molecular mechanism-based NSGs [28].
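
The bipartite organization described above can be illustrated with a minimal Python sketch. It is not the published BMMSG implementation: the input is assumed to be already fragmented into hypothetical (compound, invariant core, exchanged substituent, potency) records, and all names (`records`, `build_bmmsg`, `supernode`, `components`) are introduced here for illustration only. A compound may appear in several records if it belongs to more than one series.

```python
from collections import defaultdict

# Hypothetical, pre-fragmented input (assumption): one record per
# (compound id, invariant core, exchanged substituent, potency value).
records = [
    ("cpd1", "coreA", "H", 5.2),
    ("cpd2", "coreA", "CH3", 6.1),
    ("cpd3", "coreA", "Cl", 7.4),
    ("cpd4", "coreB", "OH", 6.0),
    ("cpd5", "coreB", "F", 6.3),
    ("cpd6", "coreC", "H", 5.0),      # no series partner, so no MMS
    ("cpd3", "coreD", "NH2", 7.4),    # cpd3 also belongs to a second series
    ("cpd7", "coreD", "OCH3", 6.8),
]

def build_bmmsg(records):
    """Return (set_nodes, edges) of a bipartite molecule/set-node graph."""
    by_core = defaultdict(list)
    for cid, core, sub, potency in records:
        by_core[core].append((cid, sub, potency))
    # A series requires at least two compounds sharing the invariant core.
    set_nodes = {c: m for c, m in by_core.items() if len(m) >= 2}
    # Each edge joins a molecule node to a set node and carries the
    # exchanged substructure as its label.
    edges = [(cid, core, sub)
             for core, members in set_nodes.items()
             for cid, sub, _ in members]
    return set_nodes, edges

def supernode(members):
    """List the compounds of one series in order of increasing potency."""
    return [cid for cid, _, _ in sorted(members, key=lambda m: m[2])]

def components(edges):
    """Split the bipartite graph into its disjoint subgraphs."""
    adj = defaultdict(set)
    for cid, core, _ in edges:
        mol, sset = ("mol", cid), ("set", core)
        adj[mol].add(sset)
        adj[sset].add(mol)
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(adj[node] - comp)
        seen |= comp
        comps.append(comp)
    return comps

set_nodes, edges = build_bmmsg(records)
for core, members in set_nodes.items():
    print(core, supernode(members))
print(len(components(edges)), "disjoint subgraphs")
```

In this toy data set, the series built on coreA and coreD fall into the same subgraph because they share cpd3, whereas the coreB series forms a separate subgraph. Sorting the members of a set node by potency mimics how a supernode makes a potency progression visible.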
16.4.4 Compound-Centric Activity Landscape Views
The activity landscapes discussed thus far represent complete data sets and provide
global or global vs. local views of SAR features. However, this is not a prerequisite for
landscape representation. It is also possible (and meaningful for many applications)
to generate local landscape views that are focused on individual compounds and their
chemical neighborhood or, alternatively, on series of analogs.
16.4.4.1 Similarity-Potency Tree
The similarity-potency tree (SPT) data structure
[29] (Figure 16.1) captures the immediate SAR environment of a chosen reference
compound, which is used as the root node of a tree structure in which edges connect
nearest neighbors. This means that an edge is drawn between a compound and the
compound to which it is most similar in the data set. Compounds are represented
as nodes and colored by potency. The radius of the structural environment around
the reference compound is determined by a predefined Tanimoto similarity threshold
value, and the structural similarity of compounds relative to the root decreases along
the tree. Horizontal and vertical node patterns with well-defined potency progression
indicate the presence of interpretable SAR information. SPTs are generated
systematically for all data set compounds (i.e., a data set comprising n compounds yields
n trees) and ranked by quantifying available interpretable patterns [29]. If a data set
contains SAR information, it can usually be appreciated by inspecting a limited
number of highly ranked trees, which often capture overlapping local SAR environments
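
The construction of a single SPT can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the published algorithm [29]: fingerprints are represented as plain sets of feature identifiers, and neighbors are assumed to be added in order of decreasing similarity to the root, each attached to the most similar compound already in the tree. The names `tanimoto`, `build_spt`, and the toy data are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of feature ids."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def build_spt(root, fingerprints, threshold):
    """Return {child: parent} for a similarity-potency tree rooted at `root`."""
    root_fp = fingerprints[root]
    # Neighborhood radius: keep compounds whose similarity to the root
    # meets the predefined Tanimoto threshold.
    neighbors = [c for c in fingerprints
                 if c != root and tanimoto(root_fp, fingerprints[c]) >= threshold]
    # Assumed ordering rule: process neighbors by decreasing similarity
    # to the root, so similarity to the root decreases along the tree.
    neighbors.sort(key=lambda c: tanimoto(root_fp, fingerprints[c]), reverse=True)
    parent, in_tree = {}, [root]
    for c in neighbors:
        # Attach each compound to its nearest neighbor among the nodes
        # already placed in the tree.
        parent[c] = max(in_tree,
                        key=lambda t: tanimoto(fingerprints[c], fingerprints[t]))
        in_tree.append(c)
    return parent

# Toy fingerprints and potency values (hypothetical data).
fps = {
    "ref": {1, 2, 3, 4},
    "a": {1, 2, 3, 4, 5},
    "b": {1, 2, 3, 5, 6},
    "c": {7, 8, 9},          # outside the similarity radius of "ref"
}
potency = {"ref": 6.0, "a": 6.9, "b": 7.8, "c": 5.1}

tree = build_spt("ref", fps, threshold=0.4)
for child, par in tree.items():
    print(f"{par} ({potency[par]}) -> {child} ({potency[child]})")
```

Generating one tree per data set compound then amounts to calling `build_spt` with each compound in turn as the root; the ranking of trees by interpretable patterns is not covered by this sketch.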