Table 8.2 Some fragment pairs that occurred much more and much less often together than
expected. a

Fragment 1                   Fragment 2                   Expected     Real         Multiple   z-Value
                                                          occurrence   occurrence
tetrahydrofuran              (C)-OH                       122          2292         19         206
[structure not recovered]    (C) [structure incomplete]   88           206          2.3        117
CH3-(C)                      CF3-(C)                      544          139          0.26       -19
O [structure incomplete]     [structure not recovered]    2653         270          0.10       -67

a The first row, consisting of the tetrahydrofuran and the -CH2OH group, would be expected to occur 122 times
together, but the pair appears in 2292 molecules, leading to a multiple of 19 (see also text; data taken from Lameijer
et al. [27]).
phenyl-containing compounds. The tetrahydrofuran ring often stems from the ribose moi-
ety of nucleosides, either natural or chemically modified, whereas the phenyl ring is often
found in industrial chemicals.
The authors suggest that the derived fragment and co-occurrence lists are useful in
creating new chemistry. For instance, these listings show which side-chains and ring
systems are most popular and therefore most commonly used in synthesis. Rarer fragments
also emerge from these lists, indicating less explored parts of chemical space.
Finally, fragments that do not occur together point to new chemical space that can be
explored. The co-occurrences may also be used to find a replacement for a structural feature.
Examples of fragment pairs that are replacements of one another are chlorine and bromine,
or naphthalene and benzene. [27] These fragment pairs rarely occur together, [27] possibly
because of their comparable physicochemical properties.
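
The 'multiple' reported in Table 8.2 is the ratio of real to expected co-occurrence, and it can be illustrated with a minimal Python sketch. The function names, the independence-based estimate of the expected count, and the cut-off of 2.0 are assumptions made for this illustration only; the source does not state how Lameijer et al. [27] computed the expected occurrence or the z-value, so no z-value is computed here.

```python
def expected_cooccurrence(n_a: int, n_b: int, n_total: int) -> float:
    """Expected number of molecules containing both fragments.

    ASSUMPTION of this sketch: the two fragments occur independently,
    so the expected count is n_total * (n_a / n_total) * (n_b / n_total).
    """
    return n_a * n_b / n_total


def multiple(observed: float, expected: float) -> float:
    """Ratio of observed to expected co-occurrence (the 'Multiple' column)."""
    return observed / expected


def classify(observed: float, expected: float, threshold: float = 2.0) -> str:
    """Label a fragment pair by its multiple.

    The threshold is an arbitrary, hypothetical cut-off chosen for illustration.
    """
    m = multiple(observed, expected)
    if m >= threshold:
        return "over-represented"
    if m <= 1.0 / threshold:
        return "under-represented"
    return "as expected"


# (observed, expected) counts for two rows of Table 8.2.
pairs = {
    "tetrahydrofuran / CH2OH": (2292, 122),
    "CH3 / CF3": (139, 544),
}

for name, (observed, expected) in pairs.items():
    m = multiple(observed, expected)
    print(f"{name}: multiple = {m:.2f} ({classify(observed, expected)})")
```

Under these assumptions, pairs labelled under-represented would be the ones to inspect as possible replacements for one another, in line with the chlorine/bromine example above.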
8.3.2 Analysis of Multiple Databases
To facilitate the design of libraries for high-throughput screening, Xue and Bajorath
extracted scaffolds and side-chains and analyzed their distributions. [31] A 'scaffold' was
defined as a molecular fragment without side-chains, essentially identical to the definition
of frameworks (Figure 8.6). A 'side-chain' was defined as any acyclic chain or functional
group with a single connection point to the rest of the molecule. As a source, the authors
used Optiverse (OV), [32] a combinatorial screening library designed for diversity, and the
Maybridge collection (MB), [33] a library of compounds used in medicinal chemistry. Acyclic
structures were removed prior to screening (1214 from OV and 1060 from MB). The remaining
sets were 116 762 (OV) and 58 239 (MB) compounds in size. To isolate scaffolds and
side-chains, ring structures were detected first. Starting from these rings, all connected