Digging into IP Flow Records with a Visual Kernel Method - Computational Intelligence in Security for Information Systems


Fig. 2. Accuracy, TPR and AUC results. The X axis represents the percentage of labelled instances. The precision of the model increases along with the size of the labelled set.

Fig. 3. FPR results. The X axis represents the percentage of labelled instances. The FPR decreases as the size of the labelled set increases. In particular, the best results were obtained when 40% of the dataset was labelled.

Fig. 2 and Fig. 3 show the results obtained. In general, the larger the set of labelled instances, the better the results. Specifically, the best overall results were obtained with a training set containing 90% labelled instances. However, accuracy and AUC remain above 80% even when only 10% of the instances are labelled. These results indicate that we can reduce the effort of labelling software by 90% while maintaining an accuracy higher than 80%. The FPR, however, is not as low as it should be, with a lowest value of 4% false positives. Although this value may be too high for a commercial system, it is acceptable given the nature of our method, which is devoted to detecting new malware.
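As a reminder of how the measures discussed above relate to one another, the following minimal sketch computes accuracy, TPR and FPR from binary predictions. It is only an illustration and not the evaluation code used for the experiments: the function names (confusion_counts, rates) and the toy labels are assumptions made for this example, with 1 standing for malware and 0 for benign software.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = malware)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def rates(y_true, y_pred):
    """Return accuracy, TPR and FPR; a rate with an empty denominator is 0.0."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "tpr": tp / (tp + fn) if (tp + fn) else 0.0,  # detected malware / all malware
        "fpr": fp / (fp + tn) if (fp + tn) else 0.0,  # flagged benign / all benign
    }


# Toy data invented for illustration only (not taken from the experiments).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = rates(y_true, y_pred)
print(metrics)
```

Note that the FPR is normalised by the number of benign samples only, so a classifier can reach a high accuracy while its FPR stays comparatively large. This is why the two measures are reported separately.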
