Rational Design of Viral Protein Structures with Predetermined Immunological Properties - Structure-Based Study of Viral Replication

the Machine Learning Neural Networks Group, Department of Systems and Computer Science, University of Florence, Italy.

Feature extraction with ANN. The ANN has been regarded as a “black box” because its operation is neither fully explained nor fully understood. The analysis of a fully trained ANN is in itself considered a science. Nonetheless, compared to traditional statistical approaches, the ANN has considerable strengths and advantages, including the ability to: 1) model complex multidimensional and non-linear relationships, which makes the ANN technique ideal for processing biological information; 2) generalize and find relationships within the data while tolerating erroneous or noisy data; and 3) be retrained with expanded (or new) data sets, which supports model refinement. In fact, an ANN with hidden nodes can extract from input information the higher-order features that are ignored by linear models. During supervised training, the ANN is taught to map a set of input patterns to a corresponding set of output patterns. Hidden nodes form internal representations, which are reflected in the weight values of the connections. Post-training analysis of the network weights provides insight into the sensitivity and relevance of input features or attributes for feature selection. New and powerful analytical techniques for weight analysis, also known as contribution analysis, have been developed. 135-138 Although interpretation of the weights of the connections between artificial neurons requires careful consideration, it is generally accepted that the greater the weight value in a connection, the greater the importance of the parameter(s) linked to or associated with that connection. This relationship is analogous to that of the biological counterpart of the artificial neuron, where the strength and number of synapses between biological neurons are believed to be relevant to establishing a particular associative path. 110
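
As a rough illustration of weight-based contribution analysis, the sketch below trains a small one-hidden-layer network on synthetic data and then partitions the absolute weights to score each input. The data, network size, learning rate, and the `weight_importance` function are all assumptions made for illustration. The scoring rule is a simple weight-partitioning heuristic in the style of Garson's algorithm and is not necessarily the method used in the work cited above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (hypothetical): the target depends on inputs 0 and 1 only;
# input 2 is pure noise that carries no information about the target.
X = rng.uniform(-1.0, 1.0, size=(500, 3))
y = np.sin(2.0 * X[:, 0]) - np.sin(2.0 * X[:, 1])

n_hidden = 8
W1 = rng.normal(0.0, 0.05, size=(3, n_hidden))  # input -> hidden weights
b1 = np.zeros(n_hidden)                         # hidden biases
w2 = rng.normal(0.0, 0.05, size=n_hidden)       # hidden -> output weights
b2 = 0.0                                        # output bias

lr = 0.1
for _ in range(3000):
    h = np.tanh(X @ W1 + b1)                    # hidden activations
    err = (h @ w2 + b2) - y                     # prediction error
    # Gradients of the mean squared error (with a factor 1/2).
    grad_w2 = h.T @ err / len(y)
    grad_b2 = err.mean()
    delta = np.outer(err, w2) * (1.0 - h**2)    # error at the hidden layer
    grad_W1 = X.T @ delta / len(y)
    grad_b1 = delta.mean(axis=0)
    W1 -= lr * grad_W1
    b1 -= lr * grad_b1
    w2 -= lr * grad_w2
    b2 -= lr * grad_b2


def weight_importance(W1, w2):
    """Score each input by partitioning absolute connection weights.

    For every hidden node, each input receives a share proportional to the
    absolute value of its incoming weight; shares are scaled by the absolute
    hidden-to-output weight, summed over hidden nodes, and normalized to 1.
    """
    share = np.abs(W1) / np.abs(W1).sum(axis=0, keepdims=True)
    contrib = (share * np.abs(w2)).sum(axis=1)
    return contrib / contrib.sum()


importance = weight_importance(W1, w2)
print(importance)  # one relative score per input; the scores sum to 1
```

In this toy setting the noise input should receive the smallest score, which matches the idea that larger weights indicate more relevant inputs. Scores of this kind ignore the signs of the weights and interactions between hidden nodes, so they are best read as a coarse ranking.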

In addition, graphical representations of the weights are very helpful and are among the most common forms of weight analysis. For example, Holbrook et al. (1990) applied an ANN to extract information about the surface accessibility of protein residues from a protein sequence using a database of high-resolution protein structures. 139 The ANN was trained to predict the water accessibility of a central residue in the context of its flanking sequence. The protein sequences
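
The idea of presenting a central residue together with its flanking sequence can be sketched as a sliding-window encoding. The example below is only an illustration: the window width, the one-hot encoding, the zero padding at the chain ends, and the names `encode_windows` and `flank` are assumptions, not details taken from Holbrook et al.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def encode_windows(sequence, flank=3):
    """One-hot encode a window of 2*flank + 1 residues around each position.

    Window positions that fall beyond either end of the chain are left as
    all-zero padding. Returns an array with one row per central residue.
    """
    width = 2 * flank + 1
    n = len(sequence)
    encoded = np.zeros((n, width, len(AMINO_ACIDS)))
    for centre in range(n):
        for offset in range(-flank, flank + 1):
            pos = centre + offset
            if 0 <= pos < n:
                encoded[centre, offset + flank, AA_INDEX[sequence[pos]]] = 1.0
    return encoded.reshape(n, width * len(AMINO_ACIDS))


# Hypothetical 10-residue sequence, used only to show the shapes involved.
features = encode_windows("MKTAYIAKQR", flank=3)
print(features.shape)  # (10, 140): 10 residues, 7 window slots x 20 amino acids
```

Each row could then serve as the input pattern for one central residue, with that residue's measured accessibility as the training target.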
