Fig. 3. Histogram of average accuracy differences between SVMs and SVMs applied
to datasets with reduced dimensionality
4.2 Tournament Learning
In this series of experiments we focused on the 359 GO terms of the “F” branch.
The key idea of the tournament learning strategy was to determine all the
disjoint pairs of GO terms, that is, pairs of terms whose intersection contains
no protein. Each disjoint pair of GO terms can then be learned by a binary
classifier such as an SVM.
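The pair selection described above can be sketched as follows. This is a minimal illustration, assuming the annotations are available as a mapping from each GO term to the set of proteins annotated with it; the function name and the toy identifiers are hypothetical and not taken from the study.

```python
from itertools import combinations

def disjoint_pairs(term_to_proteins):
    """Return all pairs of GO terms that share no annotated protein."""
    terms = sorted(term_to_proteins)
    return [
        (a, b)
        for a, b in combinations(terms, 2)
        if term_to_proteins[a].isdisjoint(term_to_proteins[b])
    ]

# Toy annotation table (hypothetical identifiers).
toy_annotations = {
    "GO:A": {"p1", "p2"},
    "GO:B": {"p2", "p3"},  # shares p2 with GO:A, so (GO:A, GO:B) is excluded
    "GO:C": {"p4"},
}
toy_pairs = disjoint_pairs(toy_annotations)
```

Each returned pair would then get its own binary classifier, trained on the proteins of the two terms.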
The average number of disjoint-pair classifiers per GO term was 259.4. Thus, for
the tournament learning strategy we defined 46'564 binary classifiers, one for
each disjoint pair. The union of the training proteins associated with the
359 GO terms represented more than 7'131 proteins. We defined the score of a
predicted GO term as the proportion of classifiers predicting this GO term.
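The scoring rule can be illustrated with a small voting sketch. It assumes that the proportion is taken over the classifiers in which a GO term takes part; the text does not state the denominator, so this is only one possible reading. The callback `predict_winner` is a hypothetical stand-in for a trained pairwise classifier applied to one protein.

```python
from collections import Counter

def tournament_scores(pairs, predict_winner):
    """Score each GO term by the fraction of its pairwise classifiers it wins.

    Assumption: the denominator is the number of classifiers involving the term.
    """
    wins, played = Counter(), Counter()
    for a, b in pairs:
        winner = predict_winner(a, b)  # expected to return either a or b
        played[a] += 1
        played[b] += 1
        wins[winner] += 1
    return {term: wins[term] / played[term] for term in played}

def rank_terms(scores):
    """Order GO terms by decreasing score, breaking ties by name."""
    return sorted(scores, key=lambda term: (-scores[term], term))

# Toy tournament: the stand-in classifier always picks the alphabetically first term.
toy_pairs = [("GO:A", "GO:B"), ("GO:A", "GO:C"), ("GO:B", "GO:C")]
toy_scores = tournament_scores(toy_pairs, lambda a, b: min(a, b))
toy_ranking = rank_terms(toy_scores)
```

Sorting the terms by score gives the kind of ranked list of predicted functions that is evaluated on the independent dataset.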
All the SVM predictors were applied to the independent dataset of 44 proteins
to produce a ranked list of predicted functions. Figure 4 shows the histogram
of matches with respect to their rank. Within the first 50 ranked terms we had
108 matches, corresponding to an average recall of 51.7% and an average
precision of 4.9%. This improves on the prediction performance of the previous
experiments.
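As a rough check on these figures, the sketch below computes matches, precision and recall within the first k ranked terms for one protein. It assumes that precision is the number of matches divided by k and that recall is the number of matches divided by the number of true GO terms of the protein; the function name and the toy data are hypothetical. Under that assumption, averaging precision over proteins equals the pooled ratio 108 / (44 × 50) ≈ 0.049, which agrees with the reported 4.9%.

```python
def matches_precision_recall_at_k(ranked_terms, true_terms, k=50):
    """Return (matches, precision, recall) within the first k ranked terms."""
    top_k = ranked_terms[:k]
    matches = sum(1 for term in top_k if term in true_terms)
    precision = matches / k
    recall = matches / len(true_terms) if true_terms else 0.0
    return matches, precision, recall

# Toy example with hypothetical GO identifiers.
toy_ranked = ["GO:1", "GO:2", "GO:3", "GO:4"]
toy_true = {"GO:2", "GO:9"}
toy_result = matches_precision_recall_at_k(toy_ranked, toy_true, k=2)

# Pooled precision from the reported totals: 108 matches, 44 proteins, top 50 terms.
pooled_precision = 108 / (44 * 50)
```

The reported recall cannot be checked in the same way, because it depends on the number of true GO terms of each protein, which the text does not give.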
The average rank is defined as the average of all the matched positions in the
list of predicted functions. The average of the average ranks was 83.7.
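The average rank can be computed as in the following sketch, which assumes 1-based positions and that the final figure is a mean over proteins having at least one match. Both points are assumptions, and the function names are hypothetical.

```python
from statistics import mean

def average_rank(ranked_terms, true_terms):
    """Mean 1-based position of the true GO terms found in the ranked list."""
    positions = [
        pos for pos, term in enumerate(ranked_terms, start=1) if term in true_terms
    ]
    return mean(positions) if positions else None

def mean_average_rank(per_protein):
    """Average the per-protein average ranks, skipping proteins without matches."""
    ranks = [average_rank(ranked, true) for ranked, true in per_protein]
    ranks = [r for r in ranks if r is not None]
    return mean(ranks) if ranks else None

# Toy data: two proteins sharing the same ranked list (hypothetical identifiers).
toy_list = ["a", "b", "c", "d"]
toy_first = average_rank(toy_list, {"b", "d"})  # matched positions 2 and 4
toy_overall = mean_average_rank([(toy_list, {"b", "d"}), (toy_list, {"a"})])
```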
4.3 Multi-label Learning
Again, we focused on the 359 GO terms of the “F” branch. We first carried out
ten repetitions of 10-fold cross-validation on the proteins of the training
set. We performed the computations with the MATLAB software package
proposed in [16]. The feed-forward neural architectures had 33'102 neurons in the
 