Computational Intelligence in Security for Information Systems


Application of the Generic Feature Selection Measure in Detection of Web Attacks

Hai Thanh Nguyen 1, Carmen Torrano-Gimenez 2, Gonzalo Alvarez 2, Slobodan Petrović 1, and Katrin Franke 1

1 Norwegian Information Security Laboratory, Gjøvik University College, Norway
{hai.nguyen,katrin.franke,slobodan.petrovic}@hig.no

2 Instituto de Física Aplicada, Consejo Superior de Investigaciones Científicas, Madrid, Spain
{carmen.torrano,gonzalo}@iec.csic.es

Abstract. Feature selection for filtering HTTP traffic in Web application
firewalls (WAFs) is an important task. We focus on the Generic Feature
Selection (GeFS) measure [4], which was successfully tested on low-level
packet filters using the KDD CUP'99 dataset. However, the performance of the
GeFS measure in analyzing high-level HTTP traffic is still unknown. In this
paper we study the GeFS measure for WAFs. We conduct experiments on the
publicly available ECML/PKDD-2007 dataset. Since this dataset does not target
any real Web application, we additionally generate our new CSIC-2010 dataset.
We analyze the statistical properties of both datasets to provide more
insight into their nature and quality. Subsequently, we determine appropriate
instances of the GeFS measure for feature selection. We use different
classifiers to test the detection accuracies. The experiments show that we can
remove 63% of irrelevant and redundant features from the original dataset,
while reducing the detection accuracy of WAFs by only 0.12%.

Keywords: Web attack detection, Web application firewall, intrusion detection systems, feature selection, machine learning algorithms.

1 Introduction

Web attacks pose many serious threats to the modern Internet. The number of Web
attacks is steadily increasing; consequently, Web application firewalls (WAFs) [8]
need to be more and more effective. One approach to improving the effectiveness
of WAFs is to apply feature selection methods. Reducing the number of relevant
traffic features without a negative effect on detection accuracy is a goal
that greatly increases the available processing time of WAFs and reduces the
required system resources. As there exist many feature selection algorithms
(see, for example, [2,3]), the question that arises is which ones could be
applied in intrusion detection in general and in Web attack detection in
particular. Most of the feature selection work in intrusion detection practice
is still done manually, and the quality of the selected features depends
strongly on expert knowledge. For automatic feature selection, the wrapper
