An Interaction Pattern Kernel Approach
for Protein-Protein Interaction Extraction
from Biomedical Literature
Yung-Chun Chang 1,2 , Yu-Chen Su 2 , Nai-Wen Chang 1,3 , and Wen-Lian Hsu 1
1 Institute of Information Science, Academia Sinica
No. 128, Sec. 2, Academia Rd., Taipei City 11529, Taiwan (R.O.C)
2 Department of Information Management, National Taiwan University
No. 1, Sec. 4, Roosevelt Rd., Taipei City 10617, Taiwan (R.O.C)
3 Graduate Institute of Biomedical Electronics and Bioinformatics
No. 1, Sec. 4, Roosevelt Rd., Taipei City 10617, Taiwan (R.O.C)
{changyc,hsu}@iis.sinica.edu.tw,
{b99705029,d00945020}@ntu.edu.tw
Abstract. Discovering the interactions between proteins mentioned in
biomedical literature is one of the core topics of text mining in the life sciences.
In this paper, we propose an interaction pattern generation approach to capture
frequent PPI patterns in text. We also present an interaction pattern tree kernel
method that integrates PPI patterns with a convolution tree kernel to extract
protein-protein interactions. Empirical evaluations on the LLL, IEPA, and HPRD50
corpora demonstrate that our method is effective and outperforms several
well-known PPI extraction methods.
Keywords: Text Mining, Protein-Protein Interaction, Interaction Pattern
Generation, Interaction Pattern Tree Kernel.
1 Introduction
As the number of research papers grows rapidly, researchers have difficulty finding
the papers they are looking for. Relationships between the entities mentioned in
these papers can help biomedical researchers find the specific papers they need.
Among biomedical relation types, protein-protein interaction (PPI) extraction is
becoming critical in the field of molecular biology due to demands for automatic
discovery of molecular pathways and interactions in the literature. The goal of PPI
extraction is to recognize various interactions, such as transcription, translation,
post-translational modification, complex, and dissociation, among proteins, drugs, or
other molecules in biomedical literature.
Most PPI extraction methods can be regarded as supervised learning approaches.
Given a training corpus containing a set of manually tagged examples, a supervised
classification algorithm is employed to train a PPI classifier that determines whether
an interaction exists in a text segment (e.g., a sentence). Feature-based approaches and
 