In this formula, P denotes potency and sim denotes similarity. Thus, SALI and the
per-compound discontinuity score are analogous formulations. However, the SALI
score is not normalized and therefore has an unbounded value range. It is designed to
identify activity cliffs in a data set and provides the basis for the generation of SALI
graphs [11], an activity cliff-centric activity landscape model, as discussed below.
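As a minimal sketch of how such scores might be computed, the following Python example assumes the commonly cited pairwise form SALI(i, j) = |P_i - P_j| / (1 - sim(i, j)) and uses Tanimoto similarity on small feature sets as a stand-in for real fingerprints. The function names, the toy compounds, the threshold, and the handling of identical representations (sim = 1) are illustrative assumptions, not part of the original formulation. Edges are directed from the less potent to the more potent compound, which is one way a SALI graph could be assembled.

```python
from itertools import combinations


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto similarity of two feature sets (toy stand-in for fingerprints)."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def sali(p_i: float, p_j: float, sim: float) -> float:
    """Assumed pairwise SALI: |P_i - P_j| / (1 - sim).

    The score is not normalized, so it grows without bound as sim approaches 1.
    Assumption: for sim == 1 we return inf (different potencies) or 0.0 (equal).
    """
    if sim >= 1.0:
        return float("inf") if p_i != p_j else 0.0
    return abs(p_i - p_j) / (1.0 - sim)


def sali_graph_edges(compounds: dict, threshold: float) -> list:
    """Return directed edges (less potent, more potent, score) above a threshold."""
    edges = []
    for (name_a, (feat_a, p_a)), (name_b, (feat_b, p_b)) in combinations(
        compounds.items(), 2
    ):
        score = sali(p_a, p_b, tanimoto(feat_a, feat_b))
        if score >= threshold:
            low, high = (name_a, name_b) if p_a <= p_b else (name_b, name_a)
            edges.append((low, high, score))
    return edges


# Hypothetical data: name -> (feature set, potency on a logarithmic scale)
compounds = {
    "A": (frozenset({1, 2, 3}), 5.0),
    "B": (frozenset({1, 2, 4}), 8.0),  # similar to A, much more potent
    "C": (frozenset({7, 8, 9}), 5.5),  # structurally unrelated to A and B
}

for low, high, score in sali_graph_edges(compounds, threshold=5.0):
    print(f"{low} -> {high}: SALI = {score:.2f}")
```

Only the pair A/B exceeds the threshold here, because it combines a large potency difference with high similarity; the dissimilar pairs receive low scores even when their potencies differ. Raising or lowering the threshold controls how many activity cliffs appear as edges.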
Considering the SAR analysis functions above, an important distinction is that
the formalisms underlying SAS maps and SARI are global in nature;
that is, these functions capture the SAR information content of an entire data set,
whereas the per-compound discontinuity score and SALI are local functions that
focus primarily on the identification of local SAR discontinuity and activity cliffs
(see below). It should also be noted that local functions cannot account for SAR
heterogeneity in a compound set because the detection of SAR heterogeneity requires
consideration of multiple compound subsets and hence the application of a global
scoring scheme.
16.3 PRINCIPLES AND INTRINSIC LIMITATIONS OF ACTIVITY LANDSCAPE DESIGN
In general, activity landscapes are data oriented in their design (tailored especially
toward large compound data sets) and descriptive rather than predictive in nature.
Thus, activity landscape views should reveal SAR patterns in
compound data and provide intuitive access to available SAR information.
Accordingly, landscape views should graphically represent and complement the results of
numerical SAR analysis. In addition, landscape models have been introduced
that do not incorporate numerical SAR analysis schemes. Regardless, an important
aspect of activity landscapes is that they are not designed to facilitate QSAR-like or
binary activity predictions; rather, their focus is on data representation and knowl-
edge extraction. Accordingly, most activity landscape representations do not impose
a predefined SAR model on compound activity data.
If we adhere to the general definition of an activity landscape, as presented above,
we need to be concerned about three major aspects: the derivation of similarity
relationships, the derivation of activity/potency relationships, and the integration
of these relationships within a graphical framework. This leads to a number of
considerations concerning chemical reference spaces, the way molecular similarity is
evaluated, and the assessment and selection of experimental measurements. The latter aspect will also be of
particular importance for the evaluation of activity cliff distributions, as discussed
further below.
16.3.1 Chemical Reference Space
The design of chemical reference spaces is critical for many chemoinformatics
applications, including compound activity predictions [6], and this also applies to
activity landscape modeling. However, as mentioned above, the requirements here
differ from those of space design for virtual screening and activity prediction. This is the