Identification of Pockets on Protein Surface to Predict Protein–Ligand Binding Sites - Identification of Ligand Binding Site and Protein-Protein Interaction Area


2.3 MetaPocket Approach

There are two versions of the MetaPocket approach, MetaPocket 1.0 and MetaPocket 2.0. MetaPocket 1.0 was developed in 2009 and combined only four methods; its web server is at http://metapocket.eml.org (Huang 2009). MetaPocket 2.0 is an extension of MetaPocket 1.0 that adds four more methods developed between 2009 and 2010, and it was recently published in the journal Bioinformatics (Zhang et al. 2011). Here we refer to both simply as MetaPocket, since the only substantial difference between versions 1.0 and 2.0 is the four additional methods included in version 2.0.

In this section we describe the MetaPocket algorithm and workflow in detail. In short, MetaPocket is a consensus method that combines the predicted sites from eight methods (LIGSITEcs, PASS, Q-SiteFinder, SURFNET, Fpocket, GHECOM, ConCavity and POCASA) to improve the success rate of protein-ligand binding site prediction. These eight methods were chosen because their developers make source code, an executable binary, or a web server freely available. MetaPocket proceeds in three steps: calling all single methods, generating meta-pockets, and mapping potential ligand-binding residues. MetaPocket takes a standard PDB file as input and outputs its own predicted pockets, the pockets predicted by every single method that ran successfully, and the potential ligand-binding residues around each meta-pocket. The whole MetaPocket workflow is illustrated in Fig. 2.3, and each step is explained in detail below.

Calling all single methods. In this step, the input protein structure is sent to each single method separately, and the methods run in parallel to reduce the total running time. For LIGSITEcs, PASS, SURFNET, GHECOM, Fpocket and ConCavity, executable binaries are run locally to make the predictions. For Q-SiteFinder and POCASA, Python scripts submit the protein structure to their web servers and retrieve the results from the remote servers automatically. These two methods therefore depend on the internet connection and on the status of their web servers, and they can occasionally fail because of a bad connection or a server shutdown. LIGSITEcs, PASS and SURFNET output clusters of grid points, whereas the other five methods indicate pocket sites by clusters of probes. In both cases, the mass center of each cluster is calculated and used as the representative point of the identified pocket site. Note that each method ranks its identified pocket sites with its own scoring function, based either on the number of grid points or on the size of the cluster, so the rankings of pockets from different methods cannot be compared directly. To make them comparable, a z-score is calculated for each site, separately within each method, according to Formula 2.1. This z-score is used later as the final scoring function in the MetaPocket method.

$$ Z_i = \frac{X_i - \bar{X}}{\sigma} \tag{2.1} $$
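As a minimal sketch of the two calculations described above, the Python below computes the mass center of a cluster and converts one method's raw pocket scores into z-scores, that is, each score minus the method's mean score, divided by the standard deviation of that method's scores. The function names and the sample data are hypothetical. The sketch also assumes that all points carry equal weight and that the population standard deviation is used; the text does not state which estimator MetaPocket applies.

```python
from statistics import mean, pstdev


def mass_center(points):
    """Return the unweighted centroid of a cluster of 3D points."""
    n = len(points)
    return tuple(sum(p[axis] for p in points) / n for axis in range(3))


def z_scores(raw_scores):
    """Standardize one method's raw pocket scores: (x - mean) / std."""
    mu = mean(raw_scores)
    sigma = pstdev(raw_scores)  # assumption: population standard deviation
    if sigma == 0:
        # All pockets scored identically, so none stands out.
        return [0.0 for _ in raw_scores]
    return [(x - mu) / sigma for x in raw_scores]


# Hypothetical output of one method: (cluster points, raw score) per pocket.
pockets = [
    ([(0, 0, 0), (2, 0, 0), (0, 2, 0), (2, 2, 0)], 120),
    ([(10, 10, 10), (12, 10, 10)], 60),
    ([(5, 5, 5)], 30),
]

# One representative point and one standardized score per pocket.
centers = [mass_center(points) for points, _ in pockets]
scores = z_scores([score for _, score in pockets])

for center, z in zip(centers, scores):
    print(center, round(z, 3))
```

Because each method's scores are standardized separately, a z-score from one method can be compared with a z-score from another even when the raw scores count different things, such as grid points versus cluster size.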
