WEKA software [33] was used to train the ensembles and classifiers and to classify
the data sets. All experiments used 10-fold cross-validation. Tables 1 and 2 show the
training and validation/classification performance of the applied models.
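As an illustration of the 10-fold cross-validation procedure mentioned above, the following is a minimal sketch in plain Python. It is not WEKA's implementation, and WEKA's own procedure may differ in details such as stratification. The names k_fold_indices, cross_validate, train_fn and predict_fn are hypothetical, introduced only for this example.

```python
import random


def k_fold_indices(n_samples, k=10, seed=1):
    """Yield (train_idx, test_idx) index lists for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly
    # Spread samples as evenly as possible: the first n_samples % k folds
    # receive one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


def cross_validate(samples, labels, train_fn, predict_fn, k=10, seed=1):
    """Return the mean accuracy over k folds.

    train_fn(samples, labels) -> model and predict_fn(model, sample) -> label
    are caller-supplied (hypothetical) callables.
    """
    accuracies = []
    for train_idx, test_idx in k_fold_indices(len(samples), k, seed):
        model = train_fn([samples[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        correct = sum(predict_fn(model, samples[i]) == labels[i]
                      for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / k
```

Each sample is used for testing exactly once and for training in the remaining nine folds, so the reported accuracy is always measured on data the model did not see during training.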
From Tables 1 and 2, it can be concluded that the classification models are able to
classify both data sets, with a correct classification rate above 99.9%.
Table 3 shows the characteristics and options of the chosen ensembles, together
with their tuned values.
Table 3. Selected options of the ensembles for experimental study
Ensemble | Options
FilteredClassifier | Name of the filter ("Discretize").
Adaboost | Number of boost iterations (10), seed for resampling (1), use resampling instead of reweighting (false), percentage of weight mass (100).
MultiboostAB | Number of boost iterations (10), number of sub-committees (3), seed for resampling (1), use resampling instead of reweighting (false), percentage of weight mass (100).
RandomSubSpace | Number of iterations (10), size of each subspace (0.5), seed for resampling (1).
RotationForest | Maximum size of a group (3), minimum size of a group (3), number of iterations to be performed (10), number of groups (false), filter used ("Principal Components"), percentage of instances to be removed (50), seed for resampling (1).
AttributeSelectedClassifier | Name of an attribute evaluator (CfsSubsetEval), name of a search method (BestFirst).
Bagging | Size of each bag (100), compute out-of-bag error (false), number of bagging iterations (10), seed for resampling (1).
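To make the Bagging options in Table 3 concrete (bag size of 100%, 10 iterations, seed 1), the following is a minimal sketch in plain Python. It is an illustration under stated assumptions, not WEKA's Bagging implementation: out-of-bag error is not computed, and the base learner (train_centroid, predict_centroid) and the labels "normal" and "scan" are hypothetical, chosen only to keep the example self-contained.

```python
import random
from collections import Counter


def train_bagging(samples, labels, train_fn,
                  n_iterations=10, bag_size_percent=100, seed=1):
    """Train n_iterations base models, each on a bootstrap sample
    (drawn with replacement) of bag_size_percent of the training set."""
    rng = random.Random(seed)
    bag_size = max(1, len(samples) * bag_size_percent // 100)
    models = []
    for _ in range(n_iterations):
        picks = [rng.randrange(len(samples)) for _ in range(bag_size)]
        models.append(train_fn([samples[i] for i in picks],
                               [labels[i] for i in picks]))
    return models


def predict_bagging(models, predict_fn, sample):
    """Combine the base models' predictions by majority vote."""
    votes = Counter(predict_fn(model, sample) for model in models)
    return votes.most_common(1)[0][0]


def train_centroid(samples, labels):
    """Hypothetical base learner: per-class mean of one numeric feature."""
    sums, counts = {}, Counter(labels)
    for x, y in zip(samples, labels):
        sums[y] = sums.get(y, 0.0) + x
    return {y: sums[y] / counts[y] for y in sums}


def predict_centroid(model, x):
    """Predict the class whose centroid is closest to x."""
    return min(model, key=lambda y: abs(model[y] - x))


# Toy, made-up data: one numeric feature, two well-separated classes.
samples = [i / 50 for i in range(50)] + [5 + i / 50 for i in range(50)]
labels = ["normal"] * 50 + ["scan"] * 50
models = train_bagging(samples, labels, train_centroid)
prediction = predict_bagging(models, predict_centroid, 5.4)
```

Because every base model sees a different bootstrap sample, the vote averages out the variance of the individual models, which is the motivation for bagging.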
5 Conclusions and Future Work
This paper has proposed a mutation testing model for classifier ensembles performing
ID on numerical traffic data sets. It is aimed at assessing the generalization
capability of the applied classifiers when confronted with zero-day attacks.
Experimental results show that the applied classifier ensembles properly deal with the
analyzed data, which contain mutated network scans. It can then be concluded that the
applied models are able to properly detect new attacks related to previously unseen scans.
Future work will address the mutation of other attack situations and broaden the set
of classifiers and ensembles considered.
Acknowledgments
This research has been partially supported through the project of the Spanish Ministry
of Science and Innovation TIN2010-21272-C02-01 (funded by the European