Unfortunately, probabilities and patterns of unwanted behaviour are very hard
to procure, and labeled training data for a new system is sparse [8, 9]. It is
reasonable to assume, however, that most activity in a network is not triggered
by compromised machines and that attacks represent only a tiny fraction of the
overall behaviour. Therefore, methods provided by unsupervised learning yield
outliers, which in turn may represent attacks [9-11]. Unsupervised learning can
roughly be classified into nearest-neighbour, rule-mining, statistical, and
clustering techniques. Each of these has advantages and disadvantages, depending
on how it is used; see Chandola et al. [11]. For our purpose of grouping
anomalous instances, clustering seems best suited. The disadvantages of clustering,
i.e. the complexity of clustering algorithms and possible misclassifications, can
be reduced by leveraging optimized algorithms, assumptions, and false-positive
reductions [9, 12].
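The rarity assumption above can be made concrete with a minimal sketch: cluster behavioural feature vectors and report unusually small clusters as anomalous. The threshold ("leader") clustering, the feature names, and the 5% cutoff are illustrative assumptions, not the optimized algorithms cited in [9, 12].

```python
import math

def leader_cluster(points, radius):
    """Threshold ('leader') clustering: a point joins the first cluster
    whose leader lies within `radius`, otherwise it starts a new cluster."""
    clusters = []
    for p in points:
        for c in clusters:
            if math.dist(p, c["leader"]) <= radius:
                c["members"].append(p)
                break
        else:
            clusters.append({"leader": p, "members": [p]})
    return clusters

# Hypothetical per-host feature vectors: (requests/min, mean response time).
normal = [(10 + i % 3, 0.2 + 0.01 * (i % 5)) for i in range(40)]
attack = [(90.0, 2.5), (95.0, 2.7)]   # a tiny fraction of overall behaviour
clusters = leader_cluster(normal + attack, radius=10.0)

# Under the rarity assumption, clusters holding under 5% of all points are
# reported as anomalous and forwarded to a false-positive reduction step.
cutoff = 0.05 * (len(normal) + len(attack))
anomalous = [p for c in clusters if len(c["members"]) < cutoff
             for p in c["members"]]
print(anomalous)   # → [(90.0, 2.5), (95.0, 2.7)]
```

In this sketch the possible misclassifications mentioned above surface as points of the small cluster that are merely unusual rather than malicious, which is why a false-positive reduction step follows.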
Both methods, signature-based and anomaly-based, have strengths and weak-
nesses. The main drawback of signature-based methods is the inherent limitation
that they always have to consult the signature database to match detected fea-
tures with the information therein [9]. When a new attack appears, the signature
database is unlikely to contain the corresponding attack pattern yet. Anomaly-
based detection techniques, on the other hand, have their true potential in detecting
previously unseen patterns [8]. A common limitation both detection techniques
share is a lack of “context”. This context needs to provide information about
inherent relations among users, services they use, the hosts from which they
operate, and the workflows to which they are assigned. For instance, it is not
sufficient to know that a service has a longer-than-average response time;
correlating response time with measurable changes in user and network-host
behaviour offers more valuable clues.
To benefit from both signature- and anomaly-based monitoring, we propose to
combine them into a context-based anomaly detection framework. This framework
consists of three main tiers:
i The specification of a DSL that allows the cloud-sourced IT landscape to be
modeled in such detail that workflows can be specified, monitoring rules can
be generated, and computing entities can be put into relation.
ii The detection of workflow aberrations, or semantic gaps, caused by attacks
via Complex Event Processing (CEP) based on monitoring rules generated
by the model. CEP is a signature-based method to analyze event streams in
a mid- to upper-size IT infrastructure [13]. The purpose of CEP is to derive
more meaningful events (in this case alerts).
iii The detection of abnormal entities, i.e. users, services, network hosts, and
workflows, by leveraging unsupervised machine learning to detect unforeseen
changes in behaviour.
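Tier (i) amounts to an entity-relation model from which monitoring rules can be derived. The following sketch uses illustrative entity and attribute names, not the actual DSL, to show how users, services, hosts, and workflows are put into relation and how each workflow step implies an expected event.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Host:
    name: str

@dataclass(frozen=True)
class Service:
    name: str
    host: Host          # a service runs on exactly one host in this sketch

@dataclass
class Workflow:
    name: str
    steps: list         # ordered (role, service) pairs a valid run must follow

# Relating the computing entities: services run on hosts, and a workflow
# prescribes the order in which roles may use those services.
token_svc = Service("token-service", Host("auth01"))
patient_db = Service("patient-db", Host("db01"))
retrieval = Workflow("document-retrieval",
                     steps=[("doctor", token_svc), ("doctor", patient_db)])

# Each step generates one monitoring rule: the expected event must occur,
# in order; its absence or reordering is a workflow aberration.
rules = [f"expect {role} -> {svc.name}@{svc.host.name}"
         for role, svc in retrieval.steps]
print(rules)
```

The generated rule strings stand in for the CEP rules of tier (ii); the point of the model is that they are derived from the relations rather than written by hand.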
The application of our framework in a cloud-sourced health-care environment
provides the means necessary to unravel the following incidents:
- Semantic Gaps. In a document retrieval workflow: a doctor accessing the
database without proof of first having received a permission token, replay
attacks, and workflow aberrations introduced through patched code.
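The permission-token gap above can be expressed as a single CEP-style rule over the event stream. The event schema and function below are a simplified sketch under the assumption of one-time tokens; the framework itself would derive such rules from the DSL model.

```python
def detect_missing_token(events):
    """Alert on 'db_access' events not preceded by a 'token_granted'
    event for the same user; a token authorizes exactly one access,
    so reuse (replay) also raises an alert."""
    tokens = set()
    alerts = []
    for user, action in events:
        if action == "token_granted":
            tokens.add(user)
        elif action == "db_access":
            if user in tokens:
                tokens.discard(user)   # token consumed on use
            else:
                alerts.append((user, "db_access_without_token"))
    return alerts

# Hypothetical event stream (user, action):
stream = [
    ("dr_a", "token_granted"),
    ("dr_a", "db_access"),      # legitimate: token was granted first
    ("dr_b", "db_access"),      # semantic gap: no permission token
    ("dr_a", "db_access"),      # replay-like: token already consumed
]
print(detect_missing_token(stream))
# → [('dr_b', 'db_access_without_token'), ('dr_a', 'db_access_without_token')]
```

The derived alerts are exactly the "more meaningful events" that CEP is meant to produce from the raw stream.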