- SupL(k, l): the super large set, used to store the list of all supported
sequences of both types, continuous and discontinuous.
- Pattern: also called a sequence; an ordered list of actions. A pattern X
can be written as (x_1, x_2, ..., x_n), where each x_j is an item or element.
- Record: a single instance of an attack. If an attack involves multiple
instances, we speak of the attack records of all involved instances.
Definition 2 (continuous patterns). Suppose a pattern S_i is extracted from
the sequence X_i = {x_1, x_2, ..., x_m} and contains some actions, that is,
S_i = {s_1, s_2, ..., s_l}, which may reflect ordered commands executed by a
program running on a computer. The pattern S_i is classified as a continuous
pattern if all of its elements appear in consecutive positions of the sequence
X_i, that is, if there is an integer r such that s_1 = x_r, s_2 = x_{r+1}, ...,
s_l = x_{r+l-1}. For example, the continuous pattern (s_3, s_4) occurs in the
sequence X_1 = (s_1, s_2, s_3, s_4, s_5, s_6).
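Definition 2 can be checked mechanically. The following is a minimal sketch, not the authors' implementation: the function name find_continuous and the use of strings as items are assumptions made for illustration. It returns the 1-based start position r required by the definition, or None when no such r exists.

```python
from typing import Optional, Sequence


def find_continuous(pattern: Sequence[str], sequence: Sequence[str]) -> Optional[int]:
    """Return the 1-based position r such that pattern occupies positions
    r, r+1, ..., r+l-1 of sequence, or None if there is no such r."""
    l, m = len(pattern), len(sequence)
    if l == 0 or l > m:
        return None
    for start in range(m - l + 1):
        # Compare the window of length l beginning at this offset.
        if list(sequence[start:start + l]) == list(pattern):
            return start + 1  # convert the 0-based offset to the 1-based r
    return None


X1 = ["s1", "s2", "s3", "s4", "s5", "s6"]
print(find_continuous(["s3", "s4"], X1))  # 3
print(find_continuous(["s1", "s4"], X1))  # None
```

With the example sequence X_1, the pattern (s_3, s_4) is found at r = 3, while (s_1, s_4) has no consecutive occurrence.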
Definition 3 (discontinuous patterns). We say that S_i is a discontinuous
pattern if its elements do not appear in consecutive positions of the sequence
X_i, that is, if there exist integers r_1 < r_2 < ... < r_l such that
s_1 = x_{r_1}, s_2 = x_{r_2}, ..., s_l = x_{r_l}. For example, the pattern
(s_1, *, s_4) is a discontinuous pattern in the sequence
X_1 = (s_1, s_2, s_3, s_4, s_5, s_6).
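The index condition r_1 < r_2 < ... < r_l in Definition 3 is an ordered-subsequence test. The sketch below is an illustration under stated assumptions, not the authors' code: the helper names are hypothetical, the pattern is given by its visible elements only, and a pattern is treated as discontinuous when it occurs in order but never in consecutive positions, which is one reading of the definition.

```python
from typing import List, Optional, Sequence


def find_positions(pattern: Sequence[str], sequence: Sequence[str]) -> Optional[List[int]]:
    """Greedy leftmost search for 1-based positions r_1 < r_2 < ... < r_l
    with s_i = x_{r_i}; returns None if the pattern does not occur in order."""
    positions: List[int] = []
    start = 0
    for item in pattern:
        for idx in range(start, len(sequence)):
            if sequence[idx] == item:
                positions.append(idx + 1)
                start = idx + 1  # the next element must come strictly later
                break
        else:
            return None  # this element was not found after the previous one
    return positions


def occurs_contiguously(pattern: Sequence[str], sequence: Sequence[str]) -> bool:
    """True if the pattern fills some window of consecutive positions."""
    l = len(pattern)
    return any(list(sequence[i:i + l]) == list(pattern)
               for i in range(len(sequence) - l + 1))


def is_discontinuous(pattern: Sequence[str], sequence: Sequence[str]) -> bool:
    """Assumed reading: occurs in order, but never in consecutive positions."""
    return (find_positions(pattern, sequence) is not None
            and not occurs_contiguously(pattern, sequence))


X1 = ["s1", "s2", "s3", "s4", "s5", "s6"]
print(find_positions(["s1", "s4"], X1))    # [1, 4]
print(is_discontinuous(["s1", "s4"], X1))  # True
print(is_discontinuous(["s3", "s4"], X1))  # False
```

For the example above, the visible elements s_1 and s_4 are matched at positions r_1 = 1 and r_2 = 4, so the pattern is discontinuous in X_1.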
Definition 4 (star patterns). A star pattern is a pattern that contains one
or more stars among its elements. In a discontinuous pattern, hidden elements
are represented by a star “*”, which is defined as a variable number of
intermediate elements. A star pattern never starts or ends with “*”. For
example, given a sequence X_i = {x_1, x_2, x_3, x_4}, we may have the
continuous patterns (x_1, x_2), (x_2, x_3, x_4), and (x_1, x_2, x_3), or the
discontinuous pattern (x_1 * x_3, x_4).
Because of the definition of “*”, the pattern (x_1 * x_3, x_4) implicitly
includes two other patterns: (x_1, x_3, x_4) and (x_1, x_2, x_3, x_4).
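The way a star pattern implicitly includes other patterns can be sketched as a small matcher. This is an illustration only: the "*" token, the list representation, and the function name covers are assumptions, and the star is taken to absorb zero or more arbitrary elements, which is inferred from the example in Definition 4 rather than stated in the text.

```python
from typing import Sequence

STAR = "*"  # hypothetical token standing for the star of Definition 4


def covers(star_pattern: Sequence[str], concrete: Sequence[str]) -> bool:
    """Return True if the concrete pattern is one of the patterns implicitly
    included in star_pattern, with each star absorbing zero or more elements."""
    if star_pattern and STAR in (star_pattern[0], star_pattern[-1]):
        raise ValueError("a star pattern never starts or ends with '*'")

    def match(i: int, j: int) -> bool:
        # i indexes star_pattern, j indexes concrete.
        if i == len(star_pattern):
            return j == len(concrete)
        if star_pattern[i] == STAR:
            # Try letting the star absorb 0, 1, 2, ... elements.
            return any(match(i + 1, k) for k in range(j, len(concrete) + 1))
        return (j < len(concrete)
                and star_pattern[i] == concrete[j]
                and match(i + 1, j + 1))

    return match(0, 0)


p = ["x1", STAR, "x3", "x4"]
print(covers(p, ["x1", "x3", "x4"]))        # True
print(covers(p, ["x1", "x2", "x3", "x4"]))  # True
print(covers(p, ["x1", "x2", "x4"]))        # False
```

The plain recursion keeps the sketch short; for long patterns with many stars, memoizing match would avoid repeated work.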
2.2 Data Analysis and Patterns Generation
The DARPA 1998 off-line data sets [17] were developed to evaluate proposed
techniques for intrusion detection. These data were prepared and managed by MIT
Lincoln Labs, sponsored by DARPA, and contain the contents of every packet
transmitted between hosts inside and outside a simulated military base. The
collection includes tcpdump data and Basic Security Module (BSM) audit data of
a victim Solaris machine. Both types are used in this work: we used the BSM
data to model users' normal behavior, and we preprocessed the tcpdump data set
and used it to model attack behavior. tcpdump records consist of a number of
attributes that serve as the items of sequences; these items include the class
attribute and other attributes, which are shown in Figure 1.
The aim of the proposed algorithm is to find all frequent patterns in the
attack records. Compared with CTAR or even with the traditional Apriori
algorithm, the proposed algorithm mines two types of sequences, one is
continuous, and the