the Machine Learning Neural Networks Group, Department of Systems
and Computer Science, University of Florence, Italy.
Feature extraction with ANN. The ANN has been regarded as
a “black box” because its operation is neither fully explained nor fully
understood. The analysis of a fully trained ANN is in itself considered
a science. Nonetheless, compared to traditional statistical approaches,
the ANN has considerable strengths and advantages, including the
ability to: 1) model complex multidimensional and non-linear rela-
tionships, which makes the ANN technique ideal for processing bio-
logical information; 2) generalize and find relationships within the
data while tolerating erroneous or noisy data; and 3) be retrained with
expanded (or new) data sets, which supports iterative model
refinement. In fact, an ANN with hidden nodes can extract from
input information the higher order features that are ignored by linear
models. During supervised training, the ANN is taught to map a set
of input patterns to a corresponding set of output patterns. Hidden
nodes form internal representations, which are reflected in the weight
values of the connections. Post-training analysis of the network
weights provides insight into the sensitivity and relevance of input
features or attributes for feature selection. New and powerful analytical
techniques for weight analysis, also known as contribution analysis,
have been developed. 135-138 Although interpretation of the weights of
the connections between artificial neurons requires careful considera-
tion, it is generally accepted that the greater the weight value in a con-
nection, the greater the importance of the parameter(s) linked to or
associated with that connection. This relationship is analogous to that
of the biological counterpart of the artificial neuron, where the
strength and number of synapses between biological neurons are
believed to be relevant to establishing a particular associative path. 110
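
The idea that larger connection weights signal more important inputs can be illustrated with a minimal sketch. It assumes a network with one hidden layer and a single output, and it uses a simple weight-partitioning heuristic in the spirit of Garson's method; this is not necessarily one of the contribution-analysis techniques cited above. The function name `relative_importance`, the argument names, and all weight values are hypothetical and chosen only for illustration. Because the heuristic uses absolute values, it ignores the sign of each weight and all bias terms, which is one reason such rankings call for careful interpretation.

```python
def relative_importance(w_in_hidden, w_hidden_out):
    """Rank input features by a weight-partitioning heuristic.

    w_in_hidden[i][j]: weight from input i to hidden node j.
    w_hidden_out[j]:   weight from hidden node j to the single output.
    Returns one score per input; the scores sum to 1.
    """
    n_inputs = len(w_in_hidden)
    n_hidden = len(w_hidden_out)
    scores = [0.0] * n_inputs

    for j in range(n_hidden):
        # Total absolute weight feeding hidden node j.
        col_total = sum(abs(w_in_hidden[i][j]) for i in range(n_inputs))
        if col_total == 0.0:
            continue  # this hidden node receives no signal from any input
        for i in range(n_inputs):
            # Share of hidden node j attributable to input i,
            # scaled by how strongly node j drives the output.
            share = abs(w_in_hidden[i][j]) / col_total
            scores[i] += share * abs(w_hidden_out[j])

    total = sum(scores)
    if total == 0.0:
        raise ValueError("all weights are zero; importance is undefined")
    return [s / total for s in scores]


# Hypothetical trained weights: 3 input features, 2 hidden nodes, 1 output.
w_in_hidden = [
    [0.9, -1.2],   # feature 0
    [0.1, 0.3],    # feature 1
    [-0.5, 0.5],   # feature 2
]
w_hidden_out = [1.5, -0.7]

importance = relative_importance(w_in_hidden, w_hidden_out)
ranking = sorted(range(len(importance)), key=lambda i: importance[i], reverse=True)
print("relative importance:", importance)
print("features ranked from most to least important:", ranking)
```

With these made-up weights, feature 0 carries the largest weights into both hidden nodes and therefore receives the highest score. A ranking of this kind is a starting point for feature selection, not a substitute for checking the result against held-out data.
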
In addition, graphical representations of the weights are very helpful
and are among the most common forms of weight analysis. For example,
Holbrook et al. (1990) applied an ANN to extract information
about the surface accessibility of protein residues from a protein
sequence using a database of high-resolution protein structures. 139
The ANN was trained to predict the water accessibility of a central
residue in the context of its flanking sequence. The protein sequences