Security Alert Correlation Using Growing Neural Gas (Network Security)

Abstract

Alert correlation in Distributed Intrusion Detection Systems (DIDS) has become an important means of addressing some of the current problems in this area. However, the efficiency achieved so far is still far from optimal. This paper presents a novel approach that integrates multiple correlation methods using the Growing Neural Gas (GNG) neural network. Moreover, since correlation systems have different detection capabilities, we have modified the learning algorithm to give more weight to the best-performing systems. The results support the validity of the proposal, both the multiple-integration approach using the GNG neural network and the weighting based on efficiency.

Keywords: Alert correlation, Neural networks, Intrusion detection, Growing neural gas.

Introduction

When an intrusion detection system (IDS) detects an attack, or any other malicious activity, an alert is reported to the system administrator. However, an attack rarely occurs in isolation; it usually belongs to a higher-level scenario composed of a series of attacks [1]. Three of the reasons for using alert correlation mechanisms are the logical connection between alerts that belong to the same scenario, the large number of alerts, which makes manual processing impossible, and the tendency to include in the analysis alerts from security systems other than the IDS [2].

Correlation methods are generally classified into three types. The first, specification of scenarios, defines whole scenarios using an attack description language and models the correlation process as a pattern recognition problem [3]. The second defines the prerequisites and consequences of each individual attack and, in the correlation process, relates the consequences of a previous attack to the prerequisites of a subsequent one [4]. Finally, the clustering approach is based on finding similarities or relationships between attributes of the alerts, so that alerts with similar attribute values belong to the same class and, therefore, to the same scenario [2].

Each of the above approaches has different features. For example, the first approach can efficiently detect known scenarios, but is unable to correlate new scenarios. By contrast, clustering methods can detect unknown attacks, but produce a high volume of false positives. For this reason, the literature proposes the integration of pairs of methods in order to achieve better performance [3], [4].

This work proposes an approach that integrates or combines the results of multiple correlation methods, not just a pair. This multiple integration was performed using an artificial neural network (ANN), in particular, the network Growing Neural Gas (GNG) [5]. We use the GNG network due to its clustering capabilities. Moreover, we have taken into account the performance or efficiency of the methods, with the aim that the integration process is conditioned by the best methods.

After reviewing the state of the art in related subjects (Section 2), we present our proposal, which covers the metric used to evaluate the correlation systems and the integration method (Section 3). Next, a test scenario is built using several correlation systems and the GNG network, and the proposal is evaluated (Section 4). Finally, the main conclusions of the research and appropriate lines for future investigation are presented (Section 5).

Related Works

ANNs are among the most widely used techniques in IDS, because neural networks have proven to be powerful classifiers with strong generalization and learning ability. In addition, the use of ANNs in IDS is motivated by their flexibility and adaptation to natural changes that may occur in the environment, and particularly by their ability to detect patterns of unknown attacks [6].

Unsupervised learning techniques such as the self-organizing map (SOM) algorithm have been used to cluster the content of network packets. Others, such as the multilayer perceptron (MLP) with the backpropagation learning algorithm, have been used to recognize host attacks, with an analysis based on both logs and system calls [7].

The research carried out by [8] presents a neural network-based intrusion detection method for internet-based attacks on a computer network. In particular, feedforward neural networks with the backpropagation training algorithm were employed in this study. Experiments on real data showed promising results for the detection and prediction of intrusions.

In [9] an integrated IDS using multiple ANNs is developed. The approach used in this work includes the combination of two component neural networks, growing neural gas and self-organizing map. An important feature of this system is that it can be adapted to both anomaly and misuse detection for intrusive outsiders.

In the work carried out by [6], nine ANN-based IDS were implemented and tested with several experiments and topologies. An important result of this research is that, on average, the neural networks provided very good results; in some cases, detection rates of 99.60% were achieved.

Correlation methods based on the specification of scenarios define whole scenarios using an attack description language and model the correlation process as a pattern recognition problem [10], [11]. These systems give very favorable results in terms of detection capabilities: they have a high probability of recognizing the scenarios stored in the database and rarely produce false positives. However, they have limitations such as the time needed to encode the scenarios and, above all, their inability to detect new scenarios not specified in the database [3].

The approach based on the prerequisites and consequences of each individual attack relates the consequences of a previous attack to the prerequisites of a subsequent attack [1], [4], [12]. Such systems have the advantages of requiring less time to define the prerequisites and consequences, and of having some ability to detect small variations of well-defined scenarios. On the other hand, they have the disadvantage of not detecting new scenarios or large variations, and they also suffer from false positives.
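The linking step of this approach can be illustrated with a minimal sketch. The attack names, the predicates and the `KNOWLEDGE` table below are hypothetical and are not taken from [1], [4] or [12]; the sketch only shows how a consequence of an earlier alert can be matched against a prerequisite of a later alert on the same target.

```python
from dataclasses import dataclass

# Hypothetical knowledge base: attack type -> (prerequisites, consequences).
KNOWLEDGE = {
    "port_scan": (set(), {"service_known"}),
    "buffer_overflow": ({"service_known"}, {"root_access"}),
    "install_backdoor": ({"root_access"}, {"persistent_access"}),
}


@dataclass(frozen=True)
class Alert:
    attack: str
    target: str
    time: float


def correlate(alerts):
    """Return (earlier, later) pairs in which a consequence of the earlier
    alert satisfies a prerequisite of the later alert on the same target."""
    ordered = sorted(alerts, key=lambda a: a.time)
    links = []
    for i, earlier in enumerate(ordered):
        _, consequences = KNOWLEDGE[earlier.attack]
        for later in ordered[i + 1:]:
            prerequisites, _ = KNOWLEDGE[later.attack]
            same_target = earlier.target == later.target
            if same_target and consequences & prerequisites:
                links.append((earlier, later))
    return links


alerts = [
    Alert("buffer_overflow", "10.0.0.5", 20.0),
    Alert("port_scan", "10.0.0.5", 10.0),
    Alert("install_backdoor", "10.0.0.5", 30.0),
    Alert("port_scan", "10.0.0.9", 15.0),
]
chain = [(a.attack, b.attack) for a, b in correlate(alerts)]
print(chain)
```

A scenario that is not encoded in the table produces no links, which reflects the limitation with new scenarios noted above.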

The clustering approach looks for similarities or relationships between attributes of the alerts, so that those having similar or related attribute values belong to the same class and, therefore, to the same scenario [2], [13]. Clustering methods have the advantage of detecting new and unknown scenarios. However, they suffer from a very high rate of false positives.

Using the approach of integrating two methods, [3] develops a system that complements a prerequisites and consequences correlation engine with another system based on statistical analysis (clustering). The main conclusion of this work is that it improves the performance with respect to another paper by the same authors, in which only clustering is used [2].

Finally, [14] also uses an integrative approach that combines prerequisites and consequences with clustering techniques; in particular, the work employs a Bayesian network and a probabilistic causality test. As in the previous case, the results are better than those obtained in previous work [4], which only used the probabilistic method.

Correlating IDS Alerts Using GNG

Our objective in this paper is to propose a new approach that brings together an extended view of two of the ideas reviewed in the previous section. On the one hand, we use the integration approach, but to combine the results of multiple different correlation methods, not just a pair as in previous works. In addition, the novelty of our approach lies in the fact that the integration method is general, not ad hoc as in previous papers. On the other hand, we use a neural network (the GNG neural network) for grouping and correlating alerts, but we take into account the quality of these alerts to balance the learning process, so that the final result is conditioned by the best methods.

Fig. 1. Intrusion detection general model

In this paper, we focus on the method used to perform alert correlation in a DIDS. However, this correlation component is part of the generic intrusion detection model shown in Figure 1. The perception phase monitors the computer network and performs low-level intrusion detection through local IDS, for example, a Snort sensor. The correlation component is divided into two phases: the first is the correlation itself, performed by multiple correlation methods (not just one or a pair), and the second integrates the results obtained by those methods. Finally, the response phase aims to act on the network when an attack occurs, by reconfiguring a firewall, closing ports or applying any other method of active response.

Performance Measurement

Since we will modify the learning algorithm of the GNG neural network based on the quality of the correlation methods, it is first necessary to establish a quality measure. The weighting of each correlation method in the integration process is based on the evaluation of its performance, which is essential in the field of intrusion detection. This evaluation focuses on measuring the effectiveness of the different systems in terms of their ability to classify or correlate properly. Therefore, we must have a metric to evaluate and compare the quality of each correlation method objectively [15]. Of course, the quality measure of each method may vary over time depending on its successes and failures.

Generally, the most frequently used indices are the true positive (TP) rate and the false positive (FP) rate. TP and FP rates are often combined using Receiver Operating Characteristic (ROC) curves. In our case, we use a measure defined in [15] called intrusion detection capability (CID), because it is calculated from these rates and is an objective measure that returns a single real value that can be compared directly.

Correlation Process

The correlation process has been divided into the two phases identified in the general model of Figure 1: the correlation phase and the integration phase. In the correlation phase, the low-level alerts generated by local IDS are related by multiple local correlation methods. We can use any number of different correlation techniques, not just a pair as in traditional approaches. The output of any correlation method will be an alert in IDMEF (Intrusion Detection Message Exchange Format), the standard format for the exchange of information between detection systems [16].

The second stage is the integration phase, which receives as input the IDMEF alerts generated by the correlation methods in the previous phase. The integration combines these alerts to obtain scenarios at the highest level of abstraction. To perform the integration, we have used a clustering method, following the same idea as the clustering correlation approach but operating at a higher level of abstraction.

We have used the GNG neural network as the integration algorithm because of its clustering capabilities and its ability to learn new scenarios without retraining the network on all the previous ones. The input features to the network are the fields of the IDMEF alerts; examples of these fields are the source IP address of the attack, the destination IP address, the source port, the destination port and the time. It is important to note that our integration method is general (the GNG neural network algorithm), not ad hoc as in previous works.
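As an illustration of how such fields could be turned into a numeric input pattern, the following minimal sketch scales each field to the interval [0, 1]. The encoding actually used is not specified in this paper, so everything here is an assumption: the dictionary keys are hypothetical (they are not IDMEF element names), only IPv4 addresses are handled, and treating an address as a single scaled integer is a crude choice made for brevity.

```python
import ipaddress
from datetime import datetime

MAX_IPV4 = 2**32 - 1
MAX_PORT = 65535


def encode_alert(alert, t_start, t_end):
    """Map a dict of alert fields to a list of five floats in [0, 1].

    The keys of `alert` are hypothetical names chosen for this sketch.
    """
    src = int(ipaddress.IPv4Address(alert["source_ip"])) / MAX_IPV4
    dst = int(ipaddress.IPv4Address(alert["target_ip"])) / MAX_IPV4
    sport = alert["source_port"] / MAX_PORT
    dport = alert["target_port"] / MAX_PORT
    # Position of the alert inside the observation window [t_start, t_end].
    span = (t_end - t_start).total_seconds()
    when = (alert["time"] - t_start).total_seconds() / span
    return [src, dst, sport, dport, when]


alert = {
    "source_ip": "192.168.1.10",
    "target_ip": "10.0.0.5",
    "source_port": 44321,
    "target_port": 80,
    "time": datetime(2000, 3, 7, 12, 0, 0),
}
vector = encode_alert(alert, datetime(2000, 3, 7), datetime(2000, 3, 8))
print(vector)
```

Scaling all features to a common range keeps any single field, such as the raw address value, from dominating the Euclidean distance used to select the winning neuron.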

However, we propose a minor modification of the GNG learning algorithm [5] that uses the CID measure to prioritize, in the integration process, the more reliable and efficient methods. These methods will have more weight in the final result. Specifically, we modify the criterion by which the reference vectors of the winning neuron and its neighbors are adapted to the input patterns, using the CID measure of each correlation method. Thus, the higher the performance measure of the correlation method associated with the input pattern, the larger the displacement of the reference vectors of the winning neuron and its neighbors.

As a result, the neurons of the network will lie closer to the higher-quality input patterns. The final map will have learned mainly the patterns produced by correlation methods that have a high detection ability but cannot correlate new scenarios. However, the map will also have learned patterns from methods that can detect unknown attacks, and it benefits from the inherent capacity of neural networks to generalize and recognize new patterns from others previously observed. Therefore, the neural network can detect both known and new scenarios.
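The weighted adaptation described above can be sketched as follows. This is only a sketch under stated assumptions: the exact update rule is not given in the text, so the standard GNG displacement is simply multiplied by the CID value of the method that produced the input pattern. The function name `adapt` and its arguments are hypothetical, the default step sizes reuse the values 0.1 and 0.01 listed among the learning parameters of the tests on the assumption that they are the winner and neighbor learning rates, and the remaining GNG steps (edge aging, error accumulation and neuron insertion) are omitted.

```python
import math


def adapt(weights, neighbors, x, cid, eps_winner=0.1, eps_neighbor=0.01):
    """Perform one adaptation step and return the index of the winning neuron.

    weights   : list of reference vectors (lists of floats), updated in place
    neighbors : dict mapping a neuron index to the indices of its neighbors
    x         : input pattern (an encoded alert)
    cid       : quality score in [0, 1] of the method that produced x
    """
    winner = min(range(len(weights)), key=lambda i: math.dist(weights[i], x))

    def move(i, eps):
        # Assumption: the usual step eps * (x - w) is scaled linearly by cid.
        weights[i] = [w + eps * cid * (xi - w) for w, xi in zip(weights[i], x)]

    move(winner, eps_winner)
    for n in neighbors.get(winner, ()):
        move(n, eps_neighbor)
    return winner


weights = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
neighbors = {0: {1}, 1: {0}}
winner = adapt(weights, neighbors, [0.2, 0.0], cid=0.5)
print(winner, weights)
```

With this rule, a pattern from a method with a CID close to 1 moves the winner almost as far as in the original algorithm, while a pattern from a weak method barely changes the map.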

Tests and Results

We built a test scenario in order to evaluate the outcome of our approach and compare it with other proposals. Figure 2 shows the test scenario developed.

We have used a Snort sensor as the local IDS, possibly the world’s most widely used IDS. In the correlation stage, because the approach allows multiple systems, we have deployed three systems: AlertSTAT, a tool developed by the University of California that belongs to the specification of scenarios type; the PreCons module, which implements the prerequisites and consequences approach defined in [4]; and EMERALD [17], one of the well-known intrusion detection monitors, which performs the correlation process by clustering. The GNG neural network has been used in the integration phase, followed by a response module that basically generates reports.

To keep the tests uniform, we have used as input features to the GNG neural network the same attributes used in [17]: source and destination addresses and ports, attack class and time. Moreover, the learning parameters of the network have been λ = 2000, ε1 = 0.1, ε2 = 0.01, α = 0.5, β = 0.005.

In order to validate the modification of the GNG algorithm, the tests have been conducted with both the original algorithm and the modified algorithm. In addition, in order to obtain results that can be compared with other proposals, we need to use standardized test data. To date, the DARPA intrusion detection evaluation data is the most comprehensive set known to have been generated for the purpose of evaluating the performance of any given IDS [18].

Fig. 2. Test scenario

Fig. 3. Results of the test

As Figure 3 shows, PreCons gives the worst results on average, with a correlation rate of about 60%, while the probabilistic method (EMERALD) behaves better, with a correlation rate between 60% and 80%. The integration of a pair of methods (PreCons + EMERALD) improves on the results of both, but its performance is worse than that of our multiple integration approach. AlertSTAT showed a performance above 80% in all cases, a very acceptable and predictable result. Finally, the integration of the three previous correlation methods achieves the best results, and the performance is better with the modified GNG algorithm. The integration shows rates over 90%, and close to 100% with the modified GNG.

Table 1. CID value of the evaluated methods

Method	CID value
Modified GNG	0.8787
GNG	0.8158
AlertSTAT	0.6311
EMERALD	0.2818
PreCons	0.2100

Moreover, the false positive rate shows that AlertSTAT, GNG and modified GNG produce no errors (their three lines overlap at the bottom of the plot). The PreCons approach has a low false positive rate, whereas EMERALD and the integration of pairs of methods produce an excessive number of false positives.

When ROC curves intersect or overlap, it is difficult to determine the best method. In order to compare the results obtained by the different methods through a single real value, we have used the quality measure CID [15], which is defined as follows:

CID = I(X;Y) / H(X)

that is, the ratio between the mutual information of input and output, I(X;Y), and the input entropy, H(X). Mutual information measures the reduction in uncertainty about the IDS input obtained by knowing the IDS output; this measure is normalized by the entropy (the original uncertainty) of the input. Table 1 shows the CID value obtained by each of the tested approaches. The modified GNG neural network obtains the best value.
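The ratio defined above can be computed directly from the base rate of intrusions and the TP and FP rates. The following minimal sketch assumes a binary input X (intrusion or not) and a binary output Y (alert or not); the function names are hypothetical, and the sketch does not reproduce how the rates behind Table 1 were obtained.

```python
import math


def entropy(probabilities):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def cid(base_rate, tp_rate, fp_rate):
    """Return I(X;Y) / H(X) for binary X (intrusion) and binary Y (alert).

    base_rate : P(X = 1), must lie strictly between 0 and 1
    tp_rate   : P(Y = 1 | X = 1)
    fp_rate   : P(Y = 1 | X = 0)
    """
    if not 0.0 < base_rate < 1.0:
        raise ValueError("base_rate must be strictly between 0 and 1")
    # Joint distribution P(X, Y), indexed as (x, y).
    joint = {
        (1, 1): base_rate * tp_rate,
        (1, 0): base_rate * (1 - tp_rate),
        (0, 1): (1 - base_rate) * fp_rate,
        (0, 0): (1 - base_rate) * (1 - fp_rate),
    }
    h_x = entropy([base_rate, 1 - base_rate])
    h_y = entropy([joint[1, 1] + joint[0, 1], joint[1, 0] + joint[0, 0]])
    h_xy = entropy(joint.values())
    mutual_information = h_x + h_y - h_xy
    return mutual_information / h_x


print(cid(base_rate=0.1, tp_rate=0.9, fp_rate=0.01))
```

A detector whose output is independent of the input (equal TP and FP rates) scores 0, and a detector with a TP rate of 1 and an FP rate of 0 scores 1, which is what makes the value directly comparable across methods.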

Conclusion

This work presents a method for the correlation of intrusion detection alerts based on the use of multiple correlation methods and the integration of their results. To this end, the GNG neural network has been used. The learning algorithm of the GNG network has been modified so that the best correlation methods carry more weight in the final result.

The results show that the integration of multiple methods improves the performance obtained by each of the correlation methods alone. Moreover, the integration using the modified GNG algorithm has improved on the performance of the classic version. Although both versions obtain rates over 90%, the rates of the modified GNG are close to 100%.

We are currently working on improving the proposed method to make it proactive, so that the system detects the early stages of attack scenarios with some probability. Moreover, we are evaluating new versions of self-organizing neural networks that open new ways to improve the performance. Finally, due to the lack of real scenarios in the DARPA data set, we are working to validate the approach on randomly generated real scenarios.