Chemistry Reference
In-Depth Information
fragments were inspected. Acyclic fragments were removed from the structure and stored
as side-chains. The remaining structure was stored as a scaffold. Using this algorithm, the
authors extracted 52 529 unique scaffolds and 4486 side-chains from OV and 15 690 scaffolds
and 2851 side-chains from MB. Only a minor overlap was observed: 2945 scaffolds
and 407 side-chains occurred in both sets.
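
The split into scaffold and side-chains can be illustrated on a plain molecular graph. The following is a minimal sketch, not the authors' implementation: it assumes a hydrogen-suppressed graph given as a list of bonds, and it assumes that atoms linking two rings stay with the scaffold. The function name split_scaffold and the atom labels are hypothetical.

```python
from collections import defaultdict

def split_scaffold(bonds):
    """Split a molecular graph into scaffold atoms and side-chain fragments.

    bonds: list of (atom, atom) pairs for a hydrogen-suppressed graph.
    Assumption: terminal atoms are pruned repeatedly, so ring atoms and
    atoms linking rings remain in the scaffold.
    """
    adj = defaultdict(set)
    for a, b in bonds:
        adj[a].add(b)
        adj[b].add(a)

    degree = {atom: len(nbs) for atom, nbs in adj.items()}
    leaves = [atom for atom, d in degree.items() if d == 1]
    removed = set()
    while leaves:
        atom = leaves.pop()
        if atom in removed:
            continue
        removed.add(atom)
        for nb in adj[atom]:
            if nb not in removed:
                degree[nb] -= 1
                if degree[nb] == 1:  # neighbour has become terminal
                    leaves.append(nb)
    scaffold = set(adj) - removed

    # Group the removed atoms into connected side-chain fragments.
    side_chains, seen = [], set()
    for start in sorted(removed):
        if start in seen:
            continue
        stack, fragment = [start], set()
        while stack:
            cur = stack.pop()
            if cur in seen:
                continue
            seen.add(cur)
            fragment.add(cur)
            stack.extend(n for n in adj[cur] if n in removed and n not in seen)
        side_chains.append(fragment)
    return scaffold, side_chains

# Toy example: a six-membered ring carrying a methyl and an ethyl group.
ring = [("c1", "c2"), ("c2", "c3"), ("c3", "c4"),
        ("c4", "c5"), ("c5", "c6"), ("c6", "c1")]
substituents = [("c1", "C7"), ("c4", "C8"), ("C8", "C9")]
scaffold, side_chains = split_scaffold(ring + substituents)
```

In the toy example the six ring atoms form the scaffold, and the methyl and ethyl groups are returned as two separate side-chain fragments.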
The ratios between the number of unique scaffolds and database size suggest that on
average one scaffold is found in 2.2 (OV) and 3.7 (MB) molecules, respectively. However,
the authors observed an unequal distribution of scaffolds: 8% (OV) and 7% (MB) of scaffolds
occurred in 50% of the molecules. Moreover, more than 90% of the scaffolds occurred
only once or twice. Aromatic structures and heterocycles were the most frequent. The distribution
of side-chains was similarly imbalanced. The 10 most frequent side-chains accounted for
almost 75% of occurrences, whereas the majority occurred only once. Among the top 10
were classic substituents such as halogens, the nitro group, the hydroxy group and organic
functional groups such as the methoxy group. The methyl group accounted for 25% (OV)
and 20% (MB) of occurrences, respectively.
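
The statistics quoted above (average molecules per scaffold, the share of scaffolds needed to cover half of the molecules, and the share of scaffolds seen only once or twice) can be computed from a list that assigns one scaffold to each molecule. The sketch below uses made-up toy data, not the OV or MB sets, and the function name scaffold_stats is hypothetical.

```python
from collections import Counter

def scaffold_stats(scaffold_of_molecule, coverage=0.5):
    """Summarise how unevenly scaffolds are spread over a compound set.

    scaffold_of_molecule: one scaffold identifier per molecule.
    """
    counts = Counter(scaffold_of_molecule)
    n_molecules = len(scaffold_of_molecule)
    n_scaffolds = len(counts)

    # Walk through scaffolds from most to least frequent until the
    # requested share of molecules is covered.
    covered = needed = 0
    for _scaffold, n in counts.most_common():
        covered += n
        needed += 1
        if covered >= coverage * n_molecules:
            break

    rare = sum(1 for n in counts.values() if n <= 2)
    return {
        "molecules_per_scaffold": n_molecules / n_scaffolds,
        "fraction_covering": needed / n_scaffolds,
        "fraction_rare": rare / n_scaffolds,
    }

# Toy data: 12 molecules spread over 6 scaffolds.
toy = ["A"] * 6 + ["B"] * 2 + ["C", "D", "E", "F"]
stats = scaffold_stats(toy)
```

For the toy list, a single scaffold out of six covers half of the molecules, while five of the six scaffolds occur at most twice. This is the same kind of imbalance as reported for the real sets, on a much smaller scale.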
Xu [34] derived molecular scaffolds to evaluate chemical compound libraries in terms of
diversity, distribution in chemical space and differences/similarities with respect to existing
drugs. The author used a Scaffold-based Classification Approach (SCA) that groups
compounds into the same class if they share the same topological scaffold or so-called
class center. The rationale behind this approach was that medicinal chemists intuitively
group compounds based on scaffolds and functional groups and not so much on structural
descriptors that most classification algorithms use. Scaffolds were derived similarly to Xue
and Bajorath [31] and Bemis and Murcko [28]. However, unsaturated bonds connected to a ring
were considered part of the scaffold, since they change the chemical behavior of the ring
system. Normally, scaffold analysis overlooks aliphatic compounds, since scaffolds are
defined to consist of at least one ring. To overcome this, an extended definition of scaffold
was adopted that also covered the aliphatic compounds. Double and triple bonds of acyclic
compounds were treated as ring bonds, hence part of the scaffold. For saturated acyclic
compounds, the scaffold consisted of the heteroatoms and carbon atoms that connect them.
In all other cases, the carbon backbone formed the scaffold. Although the purpose of this
extended definition is to extract scaffolds from all possible compound classes, some compounds
from the same class may appear unrelated. For instance, amino acids that possess a
cyclic side-chain are separated from those with an aliphatic chain. The structural scaffold
derived will be the ring system in the first case and the characteristic amino/carboxyl group
core in the second case.
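
The rule that an unsaturated bond attached to a ring stays with the scaffold can be sketched as follows. This is an illustrative simplification, not Xu's implementation: it covers only ring atoms and atoms joined to them by a double or triple bond, and leaves out the linker and aliphatic cases. The function names and the bond format (atom, atom, bond order) are assumptions.

```python
from collections import defaultdict

def build_adjacency(bonds):
    adj = defaultdict(set)
    for a, b, _order in bonds:
        adj[a].add(b)
        adj[b].add(a)
    return adj

def in_ring(adj, u, v):
    """True if bond u-v lies on a cycle, i.e. v is reachable from u without it."""
    stack, seen = [u], {u}
    while stack:
        cur = stack.pop()
        for nb in adj[cur]:
            if cur == u and nb == v:
                continue  # skip the bond under test
            if nb == v:
                return True
            if nb not in seen:
                seen.add(nb)
                stack.append(nb)
    return False

def ring_scaffold_with_unsaturation(bonds):
    """Ring atoms plus atoms attached to a ring by a double or triple bond."""
    adj = build_adjacency(bonds)
    ring_atoms = set()
    for a, b, _order in bonds:
        if in_ring(adj, a, b):
            ring_atoms.update((a, b))

    scaffold = set(ring_atoms)
    for a, b, order in bonds:
        if order >= 2:
            if a in ring_atoms and b not in ring_atoms:
                scaffold.add(b)
            elif b in ring_atoms and a not in ring_atoms:
                scaffold.add(a)
    return scaffold

# Toy example: 4-methylcyclohexanone (hydrogens suppressed).
ketone = [("C1", "C2", 1), ("C2", "C3", 1), ("C3", "C4", 1),
          ("C4", "C5", 1), ("C5", "C6", 1), ("C6", "C1", 1),
          ("C1", "O7", 2),   # exocyclic C=O, kept with the scaffold
          ("C4", "C8", 1)]   # methyl group, treated as a side-chain
extended = ring_scaffold_with_unsaturation(ketone)
```

Here the carbonyl oxygen is retained because it is double-bonded to a ring atom, whereas the singly bonded methyl carbon is not.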
First, a list of unique scaffolds was derived and sorted by complexity. The complexity
was calculated from four structural descriptors, namely number of rings in the smallest set
of smallest rings, number of heavy atoms, number of bonds and the sum of heavy atomic
numbers in the scaffold. Each scaffold or class center in the list was assigned an ID that
corresponded to its position in the list. How much a molecule resembled its class center
was determined by the number of side-chains attached to the scaffold. Fewer side-chains
gave a closer resemblance to the class center. The similarity of a drug to the class
center was reflected in the membership value. The membership value was based on the sum
of heavy atomic numbers, the number of rotatable bonds, the number of one and two nodes
and the number of double and triple bonds in a molecule compared with its scaffold. Since
the membership value indicated the contribution of rings in the class center for a certain
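
The complexity ordering and ID assignment described above can be sketched in a few lines. The text names the four descriptors but not how they are combined, so the lexicographic ordering used here, the ascending direction and the IDs starting at 1 are all assumptions. The ring count relies on the fact that, for a connected graph, the smallest set of smallest rings contains bonds minus atoms plus one rings. All names are hypothetical.

```python
ATOMIC_NUMBER = {"C": 6, "N": 7}

def ring_bonds(n):
    """Bonds of a simple n-membered ring, atoms numbered 0..n-1."""
    return [(i, (i + 1) % n) for i in range(n)]

# Toy scaffolds as (heavy-atom element list, bond list).
SCAFFOLDS = {
    "naphthalene": (["C"] * 10, ring_bonds(10) + [(0, 5)]),
    "pyridine": (["N"] + ["C"] * 5, ring_bonds(6)),
    "benzene": (["C"] * 6, ring_bonds(6)),
    "cyclopentane": (["C"] * 5, ring_bonds(5)),
}

def complexity_key(atoms, bonds):
    """Four descriptors, compared in order (assumed lexicographic)."""
    n_rings = len(bonds) - len(atoms) + 1  # valid for a connected scaffold
    z_sum = sum(ATOMIC_NUMBER[element] for element in atoms)
    return (n_rings, len(atoms), len(bonds), z_sum)

# Sort by complexity and use the list position as the class-center ID.
ranked = sorted(SCAFFOLDS, key=lambda name: complexity_key(*SCAFFOLDS[name]))
class_ids = {name: position for position, name in enumerate(ranked, start=1)}
```

With this ordering, benzene and pyridine tie on rings, atoms and bonds, and the sum of heavy atomic numbers separates them.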