Service  Src Port  Dest Port  Src IP address   Dest IP address  Class (Attack/Normal)
http     1106      80         192.168.001.005  192.168.001.001  Normal
telnet   20504     23         172.218.117.069  172.016.113.050  loadmodule
Fig. 1. Dataset records. Each record has a number of attributes. The class attribute
has two categories, normal or attack. The remaining attributes take many values.
other is discontinuous. The algorithm has two steps: the first searches for
large sequences of the first type of pattern, and the second searches for the
second type of pattern. The steps are summarized as follows:
- All attribute values in the records database are considered candidates for
the 1-element-zero-star-sequence-itemset, C (1,0). After C (1,0) is generated,
the records database is scanned vertically. Each time an element of C (1,0) is
contained in an instance, the support of that element is incremented by 1. Any
element whose support is greater than the given minimal support is inserted
into the 1-element-zero-star-sequence-large-itemset, L (1,0), and the results
are stored in a temporary database.
- Every two elements from two different attributes in L (1,0) are combined to
form the 2-element-sequence-itemset-zero-star, C (2,0). The records database is
scanned for all patterns in C (2,0). When the support of a pattern exceeds the
given minimal support, the pattern is inserted into the
2-element-sequence-large-itemset-zero-star, L (2,0). In the same way, we find
each k -element-large-zero-star set L ( k ,0) in turn and store it in a
temporary database. We then list all large-zero-star-sequences, L (1,0),
L (2,0), ..., L ( m ,0), and store them in a common database called the super
large sequences set, SupL .
- After generating all possible L ( k ,0), we extract all discontinuous patterns.
First, from the temporary database of L (3,0) we derive the
2-element-1-star-sequence candidates, C (2,1), by replacing the second item of
each pattern with a star. The records are then scanned vertically for each
pattern in C (2,1), and the patterns whose support exceeds the given minimal
support are inserted into the 2-element-1-star-sequence-large-itemset, L (2,1).
We then find all 2-element- l -star-large-itemsets L (2, l ) and list all
large- l -star-sequences, L (2,1), L (2,2), ..., L (2, l ). We do the same for
each k -element-large-zero-star set L ( k ,0) in turn. The resulting sets are
added to the SupL database. These steps are shown in Figure 2.
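The steps above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. The excerpt does not fully define a pattern, so the sketch assumes that a zero-star pattern fixes values at adjacent attribute positions and that each star stands for one skipped attribute. It also covers only the 2-element- l -star case, deriving those candidates from L ( l +2,0) by replacing every interior item with a star. The names large_zero_star and large_two_element_star and the toy records are hypothetical.

```python
from collections import Counter

STAR = "*"  # wildcard: matches any value at its attribute position (assumption)

def matches(record, start, pattern):
    """True if `pattern` matches `record` beginning at attribute index `start`."""
    return all(p == STAR or record[start + i] == p for i, p in enumerate(pattern))

def support(records, start, pattern):
    """Number of records that contain the pattern at the given position."""
    return sum(matches(r, start, pattern) for r in records)

def large_zero_star(records, min_support):
    """Return {k: {(start, pattern): support}} for the large sets L(k,0)."""
    n_attrs = len(records[0])
    # C(1,0): every attribute value is a candidate; count support in one scan.
    counts = Counter((i, (r[i],)) for r in records for i in range(n_attrs))
    level = {key: s for key, s in counts.items() if s > min_support}
    large, k = {}, 1
    while level:
        large[k] = level
        # C(k+1,0): extend each large k-pattern with a large value of the
        # next attribute (adjacency is an assumption of this sketch).
        candidates = {
            (start, pat + (v,))
            for (start, pat) in level
            for (i, (v,)) in large[1]
            if i == start + k
        }
        level = {}
        for start, pat in candidates:
            s = support(records, start, pat)
            if s > min_support:
                level[(start, pat)] = s
        k += 1
    return large

def large_two_element_star(records, large, min_support):
    """Return {l: {(start, pattern): support}} for the large sets L(2,l)."""
    result = {}
    for k, level in large.items():
        if k < 3:
            continue
        n_stars = k - 2
        # C(2,l): keep the first and last items, star out the interior ones.
        candidates = {
            (start, (pat[0],) + (STAR,) * n_stars + (pat[-1],))
            for (start, pat) in level
        }
        found = {}
        for start, pat in candidates:
            s = support(records, start, pat)
            if s > min_support:
                found[(start, pat)] = s
        if found:
            result[n_stars] = found
    return result

# Toy records (service, destination port, class); made up for illustration.
records = [
    ("telnet", 23, "attack"),
    ("telnet", 23, "attack"),
    ("telnet", 25, "attack"),
    ("http", 80, "normal"),
]
large = large_zero_star(records, min_support=1)
stars = large_two_element_star(records, large, min_support=1)
sup_l = {"zero_star": large, "star": stars}  # plays the role of SupL
```

With these toy records and a minimal support of 1, the only 3-element zero-star pattern is (telnet, 23, attack), and the star pattern derived from it is (telnet, *, attack), which also matches the record with port 25.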
To describe the algorithm clearly, we take the example of an attack that
includes 5 items and generate all possible sequences, which are shown in
Figure 3.
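Figure 3 is not reproduced in this excerpt. As a rough illustration only, the sketch below enumerates the sequences that a single 5-item record could yield, under the assumption that zero-star sequences are runs of adjacent items and that a 2-element star sequence keeps two items with one or more skipped positions between them. The items a to e and both function names are placeholders, and the actual figure may list the sequences differently.

```python
STAR = "*"  # stands for one skipped item (assumption of this sketch)

def zero_star_sequences(items):
    """All runs of adjacent items, as (start index, pattern) pairs."""
    n = len(items)
    return [
        (start, tuple(items[start:start + k]))
        for k in range(1, n + 1)
        for start in range(n - k + 1)
    ]

def two_element_star_sequences(items):
    """All patterns with two fixed items and n_stars >= 1 stars between them."""
    n = len(items)
    return [
        (start, (items[start],) + (STAR,) * n_stars + (items[start + n_stars + 1],))
        for n_stars in range(1, n - 1)
        for start in range(n - n_stars - 1)
    ]

items = ("a", "b", "c", "d", "e")  # placeholder items of one attack record
continuous = zero_star_sequences(items)
discontinuous = two_element_star_sequences(items)

for start, pattern in continuous + discontinuous:
    print(start, " ".join(pattern))
```

Under these assumptions a 5-item record gives 15 zero-star sequences (5 + 4 + 3 + 2 + 1) and 6 two-element star sequences (3 with one star, 2 with two stars, 1 with three stars).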
2.3 Complexity Analysis
The proposed algorithm is very different from the Apriori algorithm [18]. First,
discontinuous sequences are not considered in the Apriori algorithm. Second, item-