• NNpredict [21] is a program that predicts the secondary structure type for each residue
in an amino acid sequence using a two-layer, feed-forward neural network.
Examples of hybrid methods are the programs PHD and PSIPRED. The program PHD
[17-19] combines multiple sequence alignment with several cascading neural networks
(previously trained on proteins of known structure); the program can generate its own
alignment from the submitted sequence. PSIPRED [10] incorporates two feed-forward
neural networks that analyse the output obtained from PSI-BLAST [22]. PHD and PSIPRED
are currently considered to be amongst the best performing methods. Both are hybrid
methods, which suggests that it is more profitable to combine principles than to rely on a
single method [10,14].
1.3. Consensus secondary structure predictions
Different ways of combining prediction principles into a hybrid secondary structure
prediction program are known. In the "standard approach", the prediction problem is broken
down into different tasks and the most appropriate strategy (or principle) is applied to each
task to improve the results. Another approach is "ensemble learning": here the focus is on a
single prediction task, for which multiple predictors or classifiers are built. The different
predictors are combined either by voting or by training a classifier to combine them.
A consensus method uses the latter principle, ensemble learning, to improve the prediction
results: the results of several secondary structure prediction programs are compared and
combined by a classifier. In the case of a secondary structure consensus method the multiple
predictors already exist, and their predictions only need to be combined into a consensus
predicted sequence, as sketched below.
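To make this concrete, the following sketch combines per-residue predictions from several programs position by position with a pluggable classifier. The function name, the three-state alphabet (H, E, C) and the calling convention are assumptions for illustration, not the implementation of any particular server.

    # A minimal consensus sketch, assuming every program's output is a string
    # over the states H (helix), E (strand) and C (coil), aligned to the same
    # query sequence. Names are illustrative only.

    def consensus_prediction(predictions, combine):
        """Apply a combining classifier position by position.

        predictions : list of equal-length state strings, one per program
        combine     : callable taking the list of states predicted at one
                      position and returning the consensus state
        """
        length = len(predictions[0])
        assert all(len(p) == length for p in predictions)
        return "".join(combine([p[i] for p in predictions]) for i in range(length))

Any of the classifiers discussed below (a decision tree, majority voting, or a trained neural network) could be plugged in as the combine argument.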
As mentioned before, a consensus method looks at the results of several different
prediction programs. To decide when to use the results of which program(s), a decision
mechanism or classifier has to be built into the method. Three such classifiers are discussed
below: the decision tree, majority wins (winner takes all), and the neural network.
A decision tree is a representation of a decision procedure for classifying a given
example [6]. Each internal node of the tree poses a question, with a branch for each possible
outcome of that question; each leaf node holds a classification. Decision trees have many
uses, particularly for problems that can be formulated as producing a single answer in the
form of a class name. Decision trees are constructed from examples that are already labelled.
A decision tree could be used to apply rules for assigning a secondary structure state to a
specific residue; in fact the next classifier can be viewed as a very short decision tree with
few questions.
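As a toy illustration of how such rules might look, the sketch below encodes a hand-written tree over the per-residue outputs of three programs. The choice of programs, the branch order and the fall-back to coil are invented for illustration; a real decision tree would be learned from labelled examples rather than written by hand.

    # A hypothetical, hand-written decision tree over three per-residue states.
    # All rules here are invented for illustration.

    def tiny_tree(phd_state, psipred_state, nnpredict_state):
        # Question 1: do the two strongest predictors agree?
        if phd_state == psipred_state:
            return phd_state                  # leaf: trust the agreement
        # Question 2: does NNpredict side with either of them?
        if nnpredict_state in (phd_state, psipred_state):
            return nnpredict_state            # leaf: two out of three agree
        return "C"                            # leaf: no agreement, fall back to coil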
The consensus program JPRED [23,33,34] uses the majority wins principle. Despite
all the efforts and different methods, the Q3 (percentage of correctly predicted residues) of
all the protein secondary structure prediction methods mentioned before lies between 60 and
80 percent. The makers of the consensus secondary structure server JPRED aimed to
improve this percentage by combining six different secondary structure prediction programs,
such as those mentioned before. The server is available through a web interface and no
neural network is used in making the consensus prediction. JPRED builds a consensus
prediction by comparing the results of these programs and taking, at each position, the
predicted state that is most abundant. The majority wins, which is why this principle is also
called the "winner takes all" method. JPRED correctly predicts 72.9 percent of protein
secondary structure.
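The following sketch shows the majority-wins rule together with a simple Q3 computation, assuming aligned per-residue state strings over H, E and C. It is not the JPRED implementation; the function names and example data are made up for illustration.

    from collections import Counter

    def majority_wins(states):
        """Return the most abundant state at one position (ties broken arbitrarily)."""
        return Counter(states).most_common(1)[0][0]

    def q3(predicted, observed):
        """Percentage of residues whose predicted state matches the observed one."""
        matches = sum(p == o for p, o in zip(predicted, observed))
        return 100.0 * matches / len(observed)

    # Example: consensus of three made-up program outputs, scored against a
    # known structure.
    programs = ["HHHECCC", "HHHECCE", "HHCECCC"]
    consensus = "".join(majority_wins(column) for column in zip(*programs))
    print(consensus, q3(consensus, "HHHEECC"))   # HHHECCC, roughly 85.7 percent

In this toy example six of the seven consensus states match the observed structure; the same per-residue comparison over a benchmark set of proteins yields the Q3 figures quoted above.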