Fig. 2. Accuracy, TPR and AUC results. The X axis represents the percentage of
labelled instances. The precision of the model increases along with the size of the
labelled set.
Fig. 3. FPR results. The X axis represents the percentage of labelled instances. The
FPR decreases as the size of the labelled set increases. In particular, the best results
were obtained when 40% of the dataset was labelled.
Fig. 2 and Fig. 3 show the obtained results. In particular, we found that the
larger the set of labelled instances, the better the results. Specifically,
the best overall results were obtained with a training set containing 90% of
labelled instances. However, accuracy and AUC remain above 80% even when
only 10% of the instances are labelled. These results indicate that we
can reduce the effort of labelling software by 90% while maintaining an accuracy
higher than 80%. However, the FPR is not as low as it should be, with a
lowest value of 4% false positives. Although this value may be too high for a
commercial system, it is acceptable given the nature of our method, which is
devoted to detecting new malware.
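
The measures discussed above can be related to one another with a minimal
sketch, assuming the usual confusion-matrix definitions of accuracy, TPR and
FPR. The function name and the counts below are hypothetical and chosen only
for illustration; they are not results from the experiments.

```python
def rates(tp, fp, tn, fn):
    """Return (accuracy, TPR, FPR) from confusion-matrix counts.

    tp: malware correctly flagged       fn: malware missed
    tn: benign correctly passed         fp: benign wrongly flagged
    Assumes each class has at least one instance (no zero denominators).
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total   # share of all instances classified correctly
    tpr = tp / (tp + fn)           # true positive rate over malware instances
    fpr = fp / (fp + tn)           # false positive rate over benign instances
    return accuracy, tpr, fpr


# Hypothetical counts (illustrative only, not taken from the study).
accuracy, tpr, fpr = rates(tp=80, fp=4, tn=96, fn=20)
print(f"accuracy={accuracy:.2f} TPR={tpr:.2f} FPR={fpr:.2f}")
```

With these illustrative counts, 4 of 100 benign files are wrongly flagged, so
the FPR is 4%, the same order as the lowest value reported above.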