The raw continuity and discontinuity scores are converted into Z-scores, which
are then mapped onto the value range from 0 to 1 by calculating their cumulative
probabilities with respect to an assumed normal value distribution. The resulting
normalized continuity and discontinuity scores are combined to yield the SARI score
that balances their contributions:
$$\mathrm{SARI}(A) = \frac{1}{2}\left\{ \mathrm{cont}_{\mathrm{norm}}(A) + \left[ 1 - \mathrm{disc}_{\mathrm{norm}}(A) \right] \right\} \tag{16.4}$$
This scoring scheme was originally designed to globally classify SAR phenotypes
of compound data sets into three categories: continuous (high SARI scores),
discontinuous (low scores), or heterogeneous SARs (intermediate scores around 0.5).
Global SAR heterogeneity might result from combinations of locally continuous and
discontinuous SARs (represented by different compound subsets/series) or from the
presence of SAR continuity near activity cliffs. The latter situation occurs when
structural modifications of active compounds are possible without significant changes
in potency, as long as critically important R-groups that satisfy essential binding
interactions are conserved (e.g., a functional group in a series of inhibitors that
complexes an ion in the active site of an enzyme).
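The normalization and combination steps behind Eq. (16.4) can be sketched in a few lines of Python. This is a minimal illustration, not the reference implementation: the function names, the reference mean and standard deviation used for the Z-scores, and the example numbers are all assumptions introduced here.

```python
import math


def normal_cdf(z: float) -> float:
    """Cumulative probability of the standard normal distribution at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normalize_score(raw: float, ref_mean: float, ref_std: float) -> float:
    """Convert a raw score to a Z-score and map it onto [0, 1].

    ref_mean and ref_std are hypothetical reference statistics; the text
    does not say which reference distribution they come from.
    """
    z = (raw - ref_mean) / ref_std
    return normal_cdf(z)


def sari(cont_norm: float, disc_norm: float) -> float:
    """Eq. (16.4): mean of normalized continuity and inverted discontinuity."""
    return 0.5 * (cont_norm + (1.0 - disc_norm))


# Made-up raw scores and reference statistics, for illustration only.
cont = normalize_score(raw=0.8, ref_mean=0.5, ref_std=0.2)
disc = normalize_score(raw=0.3, ref_mean=0.5, ref_std=0.2)
print(f"SARI = {sari(cont, disc):.3f}")
```

Because the discontinuity term enters as 1 − disc_norm, a data set with high continuity and low discontinuity moves toward 1, whereas equal normalized scores yield the intermediate value 0.5.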
16.2.3 Per-Compound Discontinuity Score
A variant of the SARI discontinuity score has been introduced to capture the
contributions of individual compounds to SAR discontinuity [10]. A compound
introduces significant SAR discontinuity if its potency differs dramatically from
that of its immediate structural neighbors. The per-compound discontinuity score
is defined as follows:
$$\mathrm{disc}(i) = \frac{\sum_{\{j \mid \mathrm{sim}(i,j) > 0.65,\ i \neq j\}} \mathrm{potdiff}(i,j) \times \mathrm{sim}(i,j)}{\left|\{j \mid \mathrm{sim}(i,j) > 0.65,\ i \neq j\}\right|} \tag{16.5}$$
Here potdiff is the potency difference between a pair of compounds and sim is their
calculated similarity. The raw score is obtained by comparing a compound to all
other molecules whose similarity to it exceeds the predefined threshold T (here a
fingerprint Tanimoto similarity greater than 0.65). These raw scores are normalized
by Z-score calculation over the individual scores of all compounds in the data set
[10]. According to this formalism, compounds that have structurally very similar
neighbors with large differences in potency obtain a per-compound score close to 1
and mark prominent activity cliffs.
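The per-compound score described above can be sketched in Python as follows. Several details are assumptions made for illustration: potdiff is taken as the absolute potency difference, a compound without any neighbor above the threshold receives a raw score of 0, and the normalized score is obtained by passing the Z-score through the normal cumulative distribution, in analogy to the SARI normalization. The function names and the example data are hypothetical.

```python
import math
from statistics import mean, pstdev


def raw_disc(i, potency, sim, threshold=0.65):
    """Raw discontinuity of compound i: mean of potdiff * sim over neighbors.

    potency: list of potency values (e.g., pKi); sim: square similarity matrix.
    Neighbors are all j != i with sim[i][j] above the threshold.
    """
    neighbors = [j for j in range(len(potency))
                 if j != i and sim[i][j] > threshold]
    if not neighbors:
        return 0.0  # assumption: no neighbors means no measurable discontinuity
    total = sum(abs(potency[i] - potency[j]) * sim[i][j] for j in neighbors)
    return total / len(neighbors)


def normalized_disc(potency, sim, threshold=0.65):
    """Z-score the raw scores over the data set and map them onto [0, 1]."""
    raw = [raw_disc(i, potency, sim, threshold) for i in range(len(potency))]
    mu, sigma = mean(raw), pstdev(raw)
    if sigma == 0.0:
        return [0.5] * len(raw)  # assumption: identical scores map to 0.5
    return [0.5 * (1.0 + math.erf((r - mu) / (sigma * math.sqrt(2.0))))
            for r in raw]


# Made-up example: compound 2 is far more potent than its close analog 0.
potency = [6.0, 6.2, 9.0]
sim = [[1.0, 0.9, 0.8],
       [0.9, 1.0, 0.5],
       [0.8, 0.5, 1.0]]
for idx, score in enumerate(normalized_disc(potency, sim)):
    print(idx, round(score, 3))
```

In this toy data set, compound 1 has only one neighbor with nearly the same potency and therefore receives the lowest score, whereas compound 2 receives the highest.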
16.2.4 Structure-Activity Landscape Index
The structure-activity landscape index (SALI) score [11] is defined as
$$\mathrm{SALI}(i,j) = \frac{|P_i - P_j|}{1 - \mathrm{sim}(i,j)} \tag{16.6}$$
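As a small illustration of Eq. (16.6), the following Python sketch computes the SALI value for a single compound pair. It assumes the absolute potency difference in the numerator; the handling of identical representations (similarity of 1, where the denominator vanishes) is a choice made here for illustration only, and the function name and numbers are hypothetical.

```python
import math


def sali(p_i: float, p_j: float, sim_ij: float) -> float:
    """SALI for one pair: potency difference divided by (1 - similarity)."""
    diff = abs(p_i - p_j)
    if sim_ij >= 1.0:
        # Assumption: the score is undefined at sim = 1; report infinity for
        # differing potencies and 0 for identical ones.
        return math.inf if diff > 0.0 else 0.0
    return diff / (1.0 - sim_ij)


# Made-up pairs: the same potency gap scores higher for the more similar pair.
print(sali(9.0, 6.0, 0.8))
print(sali(9.0, 6.0, 0.4))
```

The denominator shrinks as similarity approaches 1, so a given potency difference is weighted more heavily the more similar the two compounds are.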