Biomedical Engineering Reference
flat amide (representing H-bond donor) and TIP probe (representing molecular
shape descriptors). The model obtained using a threshold of 40 μM showed the
best performance. It correctly classified 283 out of 343 nonblockers and 83 out of
152 blockers, for an overall accuracy of 74%. For the external test set composed of
66 compounds from the WOMBAT-PK database, the model achieved an overall
accuracy of 72%, correctly predicting 85% of blockers and 36% of
nonblockers. In an additional test using 1,877 compounds from the
PubChem database, the model correctly classified 107 out of 187 inhibitors and
1,271 out of 1,690 inactives.
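The reported percentages follow directly from the counts quoted above. As a minimal sketch in Python (the function name `rates` is illustrative, and treating blockers or inhibitors as the positive class is an assumption, not something the text specifies), the figures can be recomputed as follows:

```python
def rates(tp, n_pos, tn, n_neg):
    """Return (sensitivity, specificity, accuracy) from raw counts.

    tp / n_pos: correctly classified positives (blockers) and their total.
    tn / n_neg: correctly classified negatives (nonblockers) and their total.
    Assumption: blockers are taken as the positive class.
    """
    sensitivity = tp / n_pos
    specificity = tn / n_neg
    accuracy = (tp + tn) / (n_pos + n_neg)
    return sensitivity, specificity, accuracy


# Counts quoted in the text for the 40 uM model.
train = rates(tp=83, n_pos=152, tn=283, n_neg=343)

# Counts quoted in the text for the PubChem set (1,877 compounds).
pubchem = rates(tp=107, n_pos=187, tn=1271, n_neg=1690)

for name, (sens, spec, acc) in {"training": train, "PubChem": pubchem}.items():
    print(f"{name}: sensitivity={sens:.2f} specificity={spec:.2f} accuracy={acc:.2f}")
```

Separating sensitivity from specificity shows that an overall accuracy near 74% can coexist with a much lower recovery of blockers than of nonblockers.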
Thai et al. [65] developed two Binary QSAR models for the prediction of hERG
blockers using two sets of descriptors: 32 P_VSA descriptors, and 11 relevant 2D
descriptors such as hydrophobic descriptors (SlogP, a_hyd, SlogP_VSA7,
Q_VSA_HYD, PEOE_VSA_HYD), diameter, atom counts (a_heavy), bond counts
(opr_nrot), subdivided surface areas (SMR_VSA5), and Kier and Hall
connectivity indices (chi1v_C, chi0_C). A dataset of 313 compounds collected
from the literature was divided into three classes based on the IC50 value: class 1
with IC50 < 1 μM (low IC50), class 2 with IC50 > 10 μM (high IC50), and class 3
with IC50 in the range 1-10 μM. To generate the training and test sets, 184 2D
descriptors were calculated for the 313 molecules of the dataset and combined with
the pIC50 values to perform a diverse subset selection, which resulted in 240 compounds
for the training set and 73 structures for the test set. A second dataset was generated
by removing the compounds containing carboxylic moieties. The best Binary QSAR
model with a cutoff at 1 μM (MODEL I) was obtained using 11 relevant 2D
descriptors and removing the compounds with carboxylic groups from the training
and test sets. The model showed a total accuracy of 0.85 for the training set and 0.94
for the test set. The best Binary QSAR model with a threshold at 10 μM (MODEL
II) was also based on the dataset without compounds containing carboxylic groups.
The model achieved a total accuracy of 0.83 for the training set and 0.75 for the
test set. Because compounds with IC50 values in the range of 1-10 μM were difficult
to classify correctly, new training and test sets were generated omitting
the molecules that belong to this class. The Binary QSAR model based on 11 relevant 2D
descriptors (MODEL III) showed a total accuracy of 0.87 for the training set and of
0.93 for the test set. All three models were further tested with an external test set
of 58 compounds taken from the literature and performed well, with
total accuracies of 0.84, 0.78, and 0.86 for MODEL I, MODEL II and MODEL III,
respectively.
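The labeling behind these three models can be sketched in Python. This is an illustration only: the function names, the example IC50 values, and the treatment of compounds lying exactly on a cutoff are assumptions, and the pIC50 conversion assumes IC50 is expressed in μM.

```python
import math


def ic50_class(ic50_um):
    """Three-class scheme: 1 = low IC50 (< 1 uM), 2 = high IC50 (> 10 uM),
    3 = intermediate (1-10 uM). Boundary handling is an assumption."""
    if ic50_um < 1.0:
        return 1
    if ic50_um > 10.0:
        return 2
    return 3


def is_blocker(ic50_um, cutoff_um):
    """Binary label for a given cutoff (1 uM as in MODEL I, 10 uM as in MODEL II)."""
    return ic50_um < cutoff_um


def pic50(ic50_um):
    """pIC50 = -log10(IC50 in mol/L) = 6 - log10(IC50 in uM)."""
    return 6.0 - math.log10(ic50_um)


# Hypothetical compounds with made-up IC50 values in uM.
compounds = {"cmpd_A": 0.3, "cmpd_B": 4.0, "cmpd_C": 25.0}

classes = {name: ic50_class(v) for name, v in compounds.items()}
labels_1um = {name: is_blocker(v, 1.0) for name, v in compounds.items()}
labels_10um = {name: is_blocker(v, 10.0) for name, v in compounds.items()}

# MODEL III-style dataset: drop the intermediate class before training.
without_class3 = {name: v for name, v in compounds.items() if classes[name] != 3}

print(classes, labels_1um, labels_10um, sorted(without_class3))
```

The intermediate compound changes label depending on the cutoff, which is consistent with the text's observation that the 1-10 μM class is the hardest to classify.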
A counter-propagation neural network (CPG-NN) was used by Thai et al. [64] to
develop classification models using 285 compounds collected from the literature
and two sets of 2D descriptors, one based on 32 P_VSA descriptors and the other on
11 relevant descriptors. Based on the IC50 values, the compounds were divided into
three classes: class 1 (IC50 ≥ 10 μM), class 2 (1 μM ≤ IC50 < 10 μM) and class 3
(IC50 < 1 μM). The dataset was split into training and test sets by random division
(80:20 and 50:50), or by diverse subset selection (80:20 and 50:50). The best CPG-NN
classification performance, obtained with a 3D output layer combined with 11
selected 2D descriptors, reached a total accuracy of 0.93-0.95 for the training set