Biomedical Engineering Reference
In-Depth Information
length range between any pair of Chem2Bio2RDF entities to be identified.
We have recently combined this with the BioLDA algorithm [17]
described below into an association search tool that shows, for any pair
of entities, the network paths between them that have the highest level of
literature support. This has proven particularly useful for suggesting gene
associations that can account for a drug's side effects or interactions with
a disease, which has led us to develop a modification of the algorithm
that restricts the search to only those paths that contain a
particular type of entity (such as a gene).
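
The type-restricted path search described above can be illustrated with a
minimal sketch. It is not the Chem2Bio2RDF implementation: the function name
find_paths, the toy edge list, and the entity names (drugA, geneX, and so on)
are all hypothetical, and the network is treated as a small undirected graph
held in memory rather than an RDF store.

```python
from collections import defaultdict

def find_paths(edges, node_types, source, target, min_len, max_len,
               required_type=None):
    """Enumerate simple paths from source to target whose number of edges
    lies in [min_len, max_len]. If required_type is given, keep only paths
    with at least one intermediate node of that type."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    results = []

    def extend(path):
        node = path[-1]
        hops = len(path) - 1
        if node == target:
            if hops >= min_len:
                inner = path[1:-1]
                if required_type is None or any(
                        node_types[n] == required_type for n in inner):
                    results.append(list(path))
            return  # a simple path ends when it reaches the target
        if hops == max_len:
            return  # length limit reached without hitting the target
        for nxt in sorted(adjacency[node]):
            if nxt not in path:  # never revisit a node
                path.append(nxt)
                extend(path)
                path.pop()

    extend([source])
    return results

# Hypothetical toy network (illustrative names only).
edges = [
    ("drugA", "geneX"), ("geneX", "diseaseD"),
    ("drugA", "sideEffectS"), ("sideEffectS", "diseaseD"),
    ("drugA", "diseaseD"),
]
node_types = {
    "drugA": "drug", "geneX": "gene",
    "sideEffectS": "side_effect", "diseaseD": "disease",
}

all_paths = find_paths(edges, node_types, "drugA", "diseaseD", 1, 3)
gene_paths = find_paths(edges, node_types, "drugA", "diseaseD", 1, 3,
                        required_type="gene")
```

Here all_paths holds every path of one to three edges between the two
entities, while gene_paths keeps only those that pass through a gene.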
We adapted a second algorithm from the social networking community
to bring scholarly publications into our networks. A database of recent
PubMed abstracts (for the last four years) was analyzed to identify
Bioterms, that is, terms that can be associated with entities in our existing
network (for example, the name of a side effect, gene, compound, or drug).
These Bioterms constitute an association between a PubMed article and an
entry in one of our databases, producing an RDF association that can be
mined. These Bioterms were then used as input to a modified Latent Dirichlet
Allocation algorithm (a method of identifying latent topics, or clusters,
in a set of documents based on their word frequency) to identify latent
topics in the PubMed literature. Publications and entities can then be
probabilistically associated with these topics, and the product of multiple
associations over a path is used to create a measure of distance between
entities (via topics) known as KL-divergence. The automatically identified
topics, along with their associated Bioterms, show a surprising correlation
with real areas of study, such as psychiatric disorders. Our resultant
BioLDA algorithm [17] can be used for a variety of purposes, including
identifying previously unknown Bioterm connections between research
areas, constraining other searches to topic areas, and ranking of association
paths by literature support as implemented in our association search tool.
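
To make the topic-based distance concrete, the following minimal sketch
computes the standard KL-divergence between the topic distributions of two
entities. How BioLDA combines associations along a path is not reproduced
here; the function names, the symmetrized variant, and the three-topic
distributions are assumptions chosen purely for illustration.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """Standard KL-divergence D(P || Q) = sum_i p_i * log(p_i / q_i).
    eps guards against division by zero when q_i is 0 (an assumption of
    this sketch, not part of the definition)."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:
            total += pi * math.log(pi / max(qi, eps))
    return total

def symmetric_kl(p, q):
    """Average of both directions, so the result does not depend on
    argument order."""
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))

# Hypothetical topic distributions over three latent topics.
gene_topics = [0.7, 0.2, 0.1]
drug_topics = [0.6, 0.3, 0.1]
side_effect_topics = [0.1, 0.2, 0.7]

near = symmetric_kl(gene_topics, drug_topics)
far = symmetric_kl(gene_topics, side_effect_topics)
```

In this toy example near is much smaller than far, reflecting that the gene
and the drug concentrate on the same topic while the side effect does not.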
We are currently taking the association search and literature-based
methods a step further to provide quantitative measures of the association
between any two items. We have developed a Semantic Link Association
Prediction (SLAP) algorithm to provide such a quantitative measure
based on the semantics and topology of the network. We developed an
associated tool to provide both quantitative assessments of association
strength and graphical descriptions of the association paths. Initial
studies, which will be published shortly, indicate a high rate of success in
missed-link predictions (essentially leave-one-out studies using the
network). Thus, methods such as SLAP appear to be useful in predicting
associations that might not already be known in the scientific literature
or databases.
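
The leave-one-out idea behind missed-link prediction can be sketched as
follows. This is not the SLAP algorithm, whose scoring is not described here:
a simple common-neighbor count stands in as the scorer, and the function
names, threshold, and toy network are all hypothetical.

```python
def build_adjacency(edges):
    """Undirected adjacency sets from a list of (node, node) pairs."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency

def common_neighbor_score(adjacency, a, b):
    """Stand-in scorer (not SLAP): number of neighbors shared by a and b."""
    return len(adjacency.get(a, set()) & adjacency.get(b, set()))

def leave_one_out_hit_rate(edges, scorer=common_neighbor_score, threshold=0):
    """Hide each known edge in turn, score the hidden pair on the remaining
    network, and report the fraction scoring above threshold."""
    hits = 0
    for i, (a, b) in enumerate(edges):
        remaining = edges[:i] + edges[i + 1:]
        adjacency = build_adjacency(remaining)
        if scorer(adjacency, a, b) > threshold:
            hits += 1
    return hits / len(edges)

# Hypothetical toy network (illustrative names only).
toy_edges = [
    ("drugA", "geneX"), ("geneX", "diseaseD"), ("drugA", "diseaseD"),
    ("drugB", "geneX"), ("drugB", "diseaseD"), ("drugC", "geneY"),
]

hit_rate = leave_one_out_hit_rate(toy_edges)
```

With this toy network, five of the six hidden links are recovered; the
isolated drugC-geneY link has no supporting paths once it is removed.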