Biomedical Engineering Reference
In-Depth Information
active/inactive boundary of 1 µM. The models using the relevant descriptors and
the P_VSA descriptors achieved classification accuracies of 96% and 97.5% on the
training set and of 74% and 81% on the test set, respectively. In detail, the models
correctly classified 94% of the strong blockers, while the precision for the
weak blockers decreased to 63% for the relevant-descriptors model and to
74% for the P_VSA model. Notably, the misclassified molecules mainly
showed an IC50 between 1 and 10 µM.
In a study performed by Ekins et al. [51], recursive partitioning, Sammon
nonlinear mapping and Kohonen self-organizing maps were investigated to
analyze the performance of these techniques individually and in a consensus
approach. The recursive partitioning model was built using a training set of 99
compounds and gave an r² of 0.90. Interestingly, the prediction for the test set of
35 compounds improved (r² rose from 0.33 to 0.83) when the Tanimoto
index was used to filter the molecules according to their similarity to those
in the training set. The Sammon nonlinear mapping and Kohonen
self-organizing map models were generated using a dataset of 93 compounds and
8 descriptors selected by PCA from more than 150 descriptors.
The eight descriptors selected are the Wiener index (a measure of molecular
branching), the topological Balaban index (which provides information on the
connectivity and branching of the molecule and is related to its hydrophobic
interactions), the number of H-bond donors, the hydrophilicity index and
electrotopological state indices (CH2, CH and N, which provide information on the
topology, polarity and hydrogen bonding capabilities of the compound). These
descriptors suggest that the topology of the molecule plays an important role in
hERG inhibition. The 93 molecules of the training set were divided into three classes
based on their activity: class 0 (IC50 < 1 µM), class 1 (1 µM < IC50 < 10 µM), and
class 2 (IC50 > 10 µM). The analysis of the nonlinear map generated with the
Sammon nonlinear mapping technique revealed that the compounds of classes
0 and 2 were mapped in two different areas. The compounds of class 1 were mapped
in a wide area of the map overlapping the areas occupied by classes 0 and 2, resulting
in poor predictive ability for this class. The model correctly predicted 86% and 100%
of the compounds in classes 0 and 2, respectively, giving an overall classification
accuracy of 95%.
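The three-class scheme described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure of Ekins et al.: the function names, the example IC50 values and the predictions are hypothetical, and the handling of values exactly at 1 or 10 µM is an assumption, since the text does not specify it.

```python
from collections import Counter


def ic50_class(ic50_um: float) -> int:
    """Map an IC50 value (in µM) to activity class 0, 1 or 2."""
    if ic50_um < 1.0:
        return 0
    # Assumption: values exactly at a boundary (1 or 10 µM) fall into the
    # higher class; the source text leaves the boundaries open.
    if ic50_um < 10.0:
        return 1
    return 2


def per_class_accuracy(true_classes, predicted_classes):
    """Fraction of correctly predicted compounds within each true class."""
    totals = Counter(true_classes)
    hits = Counter(t for t, p in zip(true_classes, predicted_classes) if t == p)
    return {c: hits[c] / totals[c] for c in sorted(totals)}


# Hypothetical IC50 values (µM) and model predictions, for illustration only.
ic50_values = [0.05, 0.8, 3.0, 7.5, 12.0, 40.0]
true_classes = [ic50_class(v) for v in ic50_values]
predicted = [0, 1, 1, 2, 2, 2]

print(true_classes)
print(per_class_accuracy(true_classes, predicted))
```

Reporting accuracy per class rather than as a single figure makes it visible when an intermediate class, such as class 1 here, is the one that is poorly separated.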
As with the Sammon nonlinear mapping, in the map generated with
the Kohonen self-organizing map the molecules belonging to classes 0 and 2 were
mapped in distinct areas, while the area occupied by compounds belonging to class
1 overlapped those of the other two classes. The method correctly classified 86%
and 79% of the compounds belonging to classes 0 and 2, respectively. A consensus
analysis performed using the three methods resulted in 86% of the compounds
in classes 0 and 2 being correctly classified. The consensus approach thus did not
improve on the results obtained with the individual methods.
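The Tanimoto-based filtering mentioned above for the recursive partitioning test set can be illustrated with a minimal sketch. It assumes each molecule is represented by the set of its "on" fingerprint bits. The fingerprints, compound names and the 0.5 cutoff are hypothetical, as the text does not state the fingerprint type or the threshold used in the study.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto index of two fingerprints given as sets of 'on' bits."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def filter_by_similarity(test_fps, train_fps, threshold):
    """Keep test compounds whose most similar training compound
    reaches the given Tanimoto threshold."""
    kept = []
    for name, fp in test_fps.items():
        best = max(tanimoto(fp, t) for t in train_fps)
        if best >= threshold:
            kept.append(name)
    return kept


# Hypothetical fingerprints, for illustration only.
train = [frozenset({1, 2, 3, 4}), frozenset({2, 3, 5, 8})]
test = {
    "cmpd_A": frozenset({1, 2, 3, 9}),    # shares three bits with the first training compound
    "cmpd_B": frozenset({10, 11, 12}),    # shares no bits with the training set
}

print(filter_by_similarity(test, train, threshold=0.5))
```

Restricting predictions to compounds that resemble the training set is one way of limiting a model to its applicability domain, which is consistent with the improvement in test-set r² reported above.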
Doddareddy et al. [67] designed 24 binary models using linear discriminant
analysis (LDA) and SVMs to classify 2,644 compounds. Four molecular fingerprint
descriptors belonging to the extended connectivity fingerprints (ECFPs) and to
the functional class fingerprints (FCFPs) were chosen. Four representative models
out of 24 were selected for further validation. The four classification models yielded