mechanisms and complexity. The ART1 network architecture self-organises
and self-stabilises its recognition codes and categorises arbitrarily many and
arbitrarily complex binary input patterns [20]. We obtain the input patterns for ART1
by binary coding of gene expression microarray data from different samples of the
same disease. As a result of the ART1 analysis we get a specific pattern of
co-expressed genes, which is taken to be typical of the disease under consideration.
Such a gene pattern is one of the essential components
for generating genetic networks.
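The idea of binary coding followed by ART1 categorisation can be illustrated with a minimal sketch. It is not the implementation used in this work: the threshold-based coding, the function names `binarize` and `art1`, the parameter values and the toy expression values are all assumptions made for illustration, and the network is reduced to the common fast-learning form of ART1 (choice function, vigilance test, prototype update by logical AND).

```python
from typing import List, Sequence, Tuple


def binarize(samples: Sequence[Sequence[float]], threshold: float) -> List[List[int]]:
    """Code each expression value as 1 (expressed) or 0 (not expressed).

    A fixed threshold is an assumption; the source does not say how the
    binary coding is done.
    """
    return [[1 if value >= threshold else 0 for value in row] for row in samples]


def overlap(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of positions where both binary vectors are 1."""
    return sum(x & y for x, y in zip(a, b))


def art1(patterns: Sequence[Sequence[int]],
         vigilance: float = 0.6,
         beta: float = 1.0) -> Tuple[List[List[int]], List[int]]:
    """Simplified fast-learning ART1.

    Returns the learned binary prototypes and, for each input pattern,
    the index of the category it was assigned to.
    """
    prototypes: List[List[int]] = []
    labels: List[int] = []
    for pattern in patterns:
        size = sum(pattern)
        # Rank existing categories by the choice function |I AND w| / (beta + |w|).
        order = sorted(
            range(len(prototypes)),
            key=lambda j: overlap(pattern, prototypes[j]) / (beta + sum(prototypes[j])),
            reverse=True,
        )
        chosen = None
        for j in order:
            # Vigilance test: the prototype must cover enough of the input.
            if size > 0 and overlap(pattern, prototypes[j]) / size >= vigilance:
                # Resonance: shrink the prototype to the shared 1-bits.
                prototypes[j] = [x & y for x, y in zip(pattern, prototypes[j])]
                chosen = j
                break
        if chosen is None:
            # No category passed the test: commit a new one.
            prototypes.append(list(pattern))
            chosen = len(prototypes) - 1
        labels.append(chosen)
    return prototypes, labels


# Toy data: rows are samples, columns are genes (values are made up).
expression = [
    [2.1, 0.2, 1.8, 0.1],
    [1.9, 0.3, 2.2, 1.5],
    [0.1, 2.5, 0.2, 0.3],
]
binary_patterns = binarize(expression, threshold=1.0)
prototypes, labels = art1(binary_patterns, vigilance=0.6)
```

In this sketch each learned prototype plays the role of a gene pattern: its 1-bits mark the genes that are expressed in every sample assigned to that category. A higher vigilance value yields more, and more specific, categories.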
Other bioinformatic applications of neural networks are discussed in the
biomedical literature, for instance classifying the nursing care needed from
incomplete data [21], detecting periodicities in protein sequences to increase
the prediction accuracy of secondary structure [22], and predicting drug
absorption from structural properties [23].
4.3 Text Mining
Mining Causal Relationships between Genes from Unstructured Text. Using the
method described in 5.2, the neural network classifies a subset of genes for a
specific biological context, e.g. a disease under consideration. As we want to
construct a causal genetic network automatically, the next type of information we
need concerns causal relationships between the classified genes.
One of the richest sources of knowledge nowadays is the internet. This is especially
true for the biological domain and, within it, for the field of genomics. A
huge amount of data is now available to the public, much of it stored in
publicly available databases. It is therefore reasonable to integrate this knowledge
into the construction of genetic networks. A straightforward approach is to find
databases that contain the type of information we are looking for. For our work we
found the appropriate information in the GeNet database. We designed and
implemented a tool that consists of three components working in sequence. First, a
database adapter connects to the internet database GeNet, queries the data and
stores all query results locally. Second, a parser analyses the stored data
and extracts the wanted information. Finally, a filter searches for redundant
and inconsistent data and prepares the resulting gene relation
information for import into the software system for generating and presenting genetic
networks.
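The last of the three components, the filter, can be sketched as follows. This is only an illustration under stated assumptions: the record layout (source gene, target gene, effect), the effect labels, the case-insensitive matching of gene names and the function name `filter_relations` are hypothetical and are not taken from GeNet or from the tool described above.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical record layout: (source gene, target gene, effect).
Relation = Tuple[str, str, str]
GenePair = Tuple[str, str]


def filter_relations(
    relations: Iterable[Relation],
) -> Tuple[List[Relation], Dict[GenePair, Set[str]]]:
    """Remove redundant relations and separate out inconsistent ones.

    Redundant: the same (source, target, effect) appears more than once.
    Inconsistent: the same ordered gene pair is reported with different effects.
    Gene names are compared case-insensitively (an assumption).
    """
    effects_by_pair: Dict[GenePair, Set[str]] = defaultdict(set)
    for source, target, effect in relations:
        pair = (source.upper(), target.upper())
        # A set keeps each distinct effect only once per pair.
        effects_by_pair[pair].add(effect.lower())

    clean: List[Relation] = []
    conflicts: Dict[GenePair, Set[str]] = {}
    for pair, effects in effects_by_pair.items():
        if len(effects) == 1:
            clean.append((pair[0], pair[1], next(iter(effects))))
        else:
            conflicts[pair] = effects
    return sorted(clean), conflicts


# Placeholder gene names and effects, made up for the example.
parsed_records = [
    ("g1", "G2", "activates"),
    ("G1", "G2", "activates"),   # redundant copy of the first record
    ("G2", "G3", "inhibits"),
    ("G2", "G3", "activates"),   # contradicts the previous record
]
clean_relations, conflicting = filter_relations(parsed_records)
```

Here conflicting pairs are set aside rather than resolved, so that they can be inspected before the clean relations are imported; how the original tool treats inconsistencies is not stated in the text.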
However, a lot of specific knowledge is not available in such a structured form. It is
distributed across the net and presented as unstructured text. In our case,
relationships between genes are not available in special databases, but they may be found
in the biomedical literature. Most of these articles are available online. One of the key
databases for publications in this field is PubMed. PubMed contains over
11 million abstracts today, and approximately 40,000 new abstracts are added each
month. To use this source of information we have to deploy more sophisticated
methods than those described above. One way to integrate this knowledge
automatically into the analysis of the microarray data is to use techniques of
Information Extraction (IE). In this section we first give a definition of IE. Then
we will focus on the problems that arise when applying IE to the biological