Fig. 4. Filtering process with '-restricted' as the filtering keyword
other entities, the whole procedure is terminated without extracting any
HLA-disease relation. However, if a filtering keyword appears closer to the
disease entities (D) and relation keywords (A) than to the HLA entities, it
loses its filtering function, and the information is extracted even though the
sentence contains a filtering keyword.
Fig. 4 shows the filtering process for the filtering keyword '-restricted'.
Since the filtering keyword '-restricted' appears closer to the HLA entity
('HLA-A*02') than to the other entity ('hepatocellular carcinoma'), the
sentence is filtered and the search algorithm terminates immediately.
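The decision rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Distances` class, the `is_filtered` function and the numeric distances are hypothetical, and the text does not say how distance is measured or how ties are handled (here a tie does not trigger filtering).

```python
from dataclasses import dataclass


@dataclass
class Distances:
    """Distances from one filtering keyword to each kind of item (hypothetical units)."""
    to_hla: int       # distance to the nearest HLA entity
    to_disease: int   # distance to the nearest disease entity (D)
    to_relation: int  # distance to the nearest relation keyword (A)


def is_filtered(d: Distances) -> bool:
    """Return True when the keyword is strictly closer to the HLA entity than to the rest.

    Assumption: ties do not trigger filtering, since the text leaves that case open.
    """
    return d.to_hla < min(d.to_disease, d.to_relation)


# Keyword next to the HLA entity, as with '-restricted' attached to 'HLA-A*02'.
near_hla = Distances(to_hla=1, to_disease=4, to_relation=3)
# Keyword closer to the disease entity and the relation keyword: no filtering.
near_disease = Distances(to_hla=5, to_disease=1, to_relation=2)

print(is_filtered(near_hla))      # the sentence is dropped, the search stops
print(is_filtered(near_disease))  # the keyword loses its filtering function
```

In the first case the sentence is discarded before any relation is extracted; in the second the extraction proceeds even though a filtering keyword is present.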
Compared with a method that identifies a relation simply by the co-occurrence
of HLA entities, disease entities and relation keywords, the tree search
algorithm has the advantage that it can use the structural information of
sentences provided by the Collins parser during the filtering process. For
example, the following sentence would be filtered out by simple keyword
matching, since the filtering keyword 'specific' appears closer to HLA B27 than
to the disease entities: 'Laboratory anomalies are not specific, HLA B27
antigen is not associated with this syndrome'. However, 'specific' and 'HLA'
belong to different clauses that are joined by the conjunction 'and', and in
that case 'specific' has no dependency on 'HLA B27'. The tree search algorithm
of our system did not filter the sentence.
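The contrast between linear proximity and structural proximity can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the bracketing is written by hand for the sentence as quoted (it is not actual Collins parser output), and the clause-membership test in `same_clause` is a simplified stand-in for the system's tree search, whose exact distance measure is not given here.

```python
from typing import Optional, Tuple, Union

Tree = Union[str, tuple]  # a leaf token, or (label, child, child, ...)

# Hand-written, simplified bracketing (assumption, not real parser output).
SENTENCE: Tree = (
    "S",
    ("S", ("NP", "Laboratory", "anomalies"),
          ("VP", "are", "not", ("ADJP", "specific"))),
    ("S", ("NP", "HLA", "B27", "antigen"),
          ("VP", "is", "not", "associated",
                 ("PP", "with", ("NP", "this", "syndrome")))),
)


def leaves(tree: Tree) -> list:
    """Flatten the tree into its token sequence, left to right."""
    if isinstance(tree, str):
        return [tree]
    return [tok for child in tree[1:] for tok in leaves(child)]


def path_to(tree: Tree, token: str, prefix: Tuple[int, ...] = ()) -> Optional[Tuple[int, ...]]:
    """Child-index path from the root to the first leaf equal to token, or None."""
    if isinstance(tree, str):
        return prefix if tree == token else None
    for i, child in enumerate(tree[1:]):
        found = path_to(child, token, prefix + (i,))
        if found is not None:
            return found
    return None


def innermost_clause(tree: Tree, path: Tuple[int, ...]) -> Optional[Tuple[int, ...]]:
    """Path of the deepest node labelled 'S' that dominates the leaf at path."""
    node = tree
    best = () if tree[0] == "S" else None
    for depth, idx in enumerate(path):
        node = node[1 + idx]
        if not isinstance(node, str) and node[0] == "S":
            best = path[: depth + 1]
    return best


def same_clause(tree: Tree, a: str, b: str) -> bool:
    """True when both tokens share the same innermost clause."""
    return innermost_clause(tree, path_to(tree, a)) == innermost_clause(tree, path_to(tree, b))


tokens = leaves(SENTENCE)
# Linear view: 'specific' sits right next to 'HLA' and far from 'syndrome'.
linear_to_hla = abs(tokens.index("specific") - tokens.index("HLA"))
linear_to_disease = abs(tokens.index("specific") - tokens.index("syndrome"))
print(linear_to_hla, linear_to_disease)

# Structural view: 'specific' is in a different clause from 'HLA'.
print(same_clause(SENTENCE, "specific", "HLA"))
print(same_clause(SENTENCE, "HLA", "syndrome"))
```

Under these assumptions, simple keyword matching would filter the sentence because 'specific' is adjacent to 'HLA' in the token sequence, whereas the structural check sees that the two words lie in separate clauses and leaves the sentence in place.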
We randomly selected 909 abstracts from PubMed using the search keyword 'HLA'.
We carefully divided the sentences in the abstracts into three levels according
to the amount of information they contain. The sentences containing HLA and disease entities at