Table 2.1 Summary of existing methods for protein-ligand binding site prediction
Strategy: Geometry (grid, sphere, α-shape); Energy
Name: LIGSITEcs, SURFNET, PASS, Q-SiteFinder, Fpocket, POCASA, GHECOM, ConCavity,
PocketPicker, CAST, SiteHound
The first eight methods are included in metaPocket.
electrostatics energy terms and uses different types of probes to calculate interaction
energy. Table 2.1 briefly summarizes the categories of these existing computational
methods.
In this chapter, we focus on the grid-based method LIGSITEcsc and the consensus
method metaPocket (Huang 2009; Zhang et al. 2011), both of which were developed in our
group. In the following sections, we explain the LIGSITEcsc and metaPocket algorithms
in detail and then compare their performance with that of other methods on the same
test data sets using the same evaluation criteria.
2.2 LIGSITEcsc Approach
In our LIGSITEcsc approach, we introduced two extensions to LIGSITE.
First, instead of capturing protein-solvent-protein events, we capture the more
accurate surface-solvent-surface events, using the protein's Connolly surface
(Connolly 1983) rather than the protein's atoms. We call this extension LIGSITEcs
(cs = Connolly surface). Second, we re-rank the pockets identified by the
surface-solvent-surface events by the degree of conservation of the involved surface
residues. We call this extension LIGSITEcsc (csc = Connolly surface and conservation).
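The re-ranking step can be illustrated with a minimal sketch. The text above only says that pockets are re-ranked by the degree of conservation of the involved surface residues, so everything else here is an assumption made for illustration: the function name `rerank_by_conservation`, the pocket and residue identifiers, the 0 to 1 conservation scale, and the use of the mean score per pocket as the ranking criterion.

```python
from statistics import mean


def rerank_by_conservation(pockets, conservation):
    """Return pockets sorted by mean residue conservation, highest first.

    pockets: list of (pocket_id, [residue_id, ...]) in their original
             geometric rank order (hypothetical data layout).
    conservation: dict mapping residue_id to a conservation score
                  (assumed here to lie between 0 and 1).
    """
    def score(pocket):
        _, residues = pocket
        # Ignore residues without a score; a pocket with none scores 0.0.
        known = [conservation[r] for r in residues if r in conservation]
        return mean(known) if known else 0.0

    # sorted() is stable, so pockets with equal scores keep their
    # original geometric order.
    return sorted(pockets, key=score, reverse=True)


# Hypothetical pockets listed in geometric rank order.
pockets = [
    ("P1", ["A10", "A11", "A45"]),
    ("P2", ["A72", "A73"]),
    ("P3", ["A90", "A91", "A92"]),
]
# Hypothetical per-residue conservation scores.
conservation = {
    "A10": 0.2, "A11": 0.4, "A45": 0.3,
    "A72": 0.9, "A73": 0.8,
    "A90": 0.5, "A91": 0.5, "A92": 0.5,
}

reranked = rerank_by_conservation(pockets, conservation)
print([pocket_id for pocket_id, _ in reranked])
```

In this toy input the geometrically second-ranked pocket moves to the top because its residues carry the highest conservation scores. Other aggregation choices (for example a maximum, or a mean restricted to residues near the pocket centre) would fit the same interface.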
The LIGSITEcsc algorithm proceeds as follows. First, the protein is projected
onto a 3D grid (Fig. 2.1). To minimize the necessary grid size, we apply
principal component analysis so that the first principal axis of the protein aligns
with the x-axis, the second with the y-axis, and the third with the z-axis. This
rotation does not affect the quality of the results; it only minimizes the necessary
grid size. The grid spacing (step size) is 1.0 Å; other grid spacings have been
tested as well. Second, grid points are classified into three categories,
“inside protein”, “near surface”, or “in solvent”, using the following rules: a grid
 