// build the training data set of normal patterns
extract all large-sequence L(k,l) patterns from the training data set, and store them in SupL_N;
for each process X in the testing data set do
    extract all large-sequence L(k,l) patterns, and store them in SupL_S;
    get the value of n;  // extracted programmatically
    compare SupL_N and SupL_S and get k_n;
    calculate k_n/n;
    if k_n/n ≥ threshold then
        process X is normal;
    else
        process X is abnormal;
Fig. 6. Pseudocode of the anomaly classifier
base called “normal pattern database” and denoted by SupL_N, and used later
as a normal profile during monitoring and classifying testing processes.
Detection Process. This phase classifies the testing processes as intrusive or
normal. Once the training data set of normal-behavior patterns is available, the
testing audit data is scanned for each new process associated with the same 6
chosen users. The new processes are likewise transformed into their related
large-sequence patterns, L(k,l). All possible patterns are generated for each
testing process and stored in a temporary database called the “suspicious
patterns database”, denoted by SupL_S. The similarity between the patterns of
the new process and the patterns of normal processes is then calculated using a
similarity algorithm.
The similarity algorithm works as follows. For any testing process to be
classified, all corresponding large-sequence patterns L(k,l) are first
extracted. Each generated pattern that also appears in the SupL_N database is
then given a weight w = 1/n, where n is the total number of patterns extracted
from that specific testing process. The value of n can be extracted
programmatically. The value of w falls in the range 0 < w ≤ 1.
Summing the weights of all matches gives the strength of the normality signal:
with k_n matching patterns, the sum is k_n/n. If this sum reaches a certain
threshold (k_n/n ≥ threshold in Figure 6), the testing process is classified as
normal. Otherwise, it is an anomalous process. Figure 6 gives an abstract of the
pseudocode of the similarity algorithm.
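The weighting rule above can be sketched in Python. This is only an illustration under stated assumptions: the patterns are taken to be already extracted and represented as hashable tuples, repeated patterns in a testing process are counted individually toward n, and the function names, the example patterns, and the threshold values are hypothetical rather than taken from the text.

```python
def similarity(test_patterns, normal_patterns):
    """Return k_n / n for one testing process.

    test_patterns   -- patterns extracted from the testing process (SupL_S)
    normal_patterns -- patterns in the normal pattern database (SupL_N)
    """
    patterns = list(test_patterns)
    n = len(patterns)
    if n == 0:
        # Assumption: a process with no patterns gives no normality signal.
        return 0.0
    normal = set(normal_patterns)
    # Each match carries weight w = 1/n, so the summed weight is k_n / n.
    k_n = sum(1 for p in patterns if p in normal)
    return k_n / n


def classify(test_patterns, normal_patterns, threshold):
    """Label a process 'normal' if k_n / n >= threshold, else 'abnormal'."""
    score = similarity(test_patterns, normal_patterns)
    return "normal" if score >= threshold else "abnormal"


# Hypothetical patterns, for illustration only.
sup_l_n = {("open", "read"), ("read", "write"), ("write", "close")}
sup_l_s = [("open", "read"), ("read", "write"),
           ("exec", "chmod"), ("write", "close")]

score = similarity(sup_l_s, sup_l_n)    # 3 of 4 patterns match
label = classify(sup_l_s, sup_l_n, threshold=0.5)
```

Computing k_n as a count and dividing once by n avoids accumulating rounding error from adding 1/n repeatedly, while giving the same quantity as the summed weights.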
Performance Measurements. Based on the value returned by the similarity
function, the classifier decides whether the process under investigation is
intrusive or not. Two types of error may occur. The first is the false positive
error, which occurs when a normal process is classified as an intrusion. The
second is the false negative error, which occurs when a truly intrusive process
is classified as normal; this is the more serious of the two.
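As a hedged illustration of the two error types, the sketch below counts them over a set of labeled processes. The function name, the label strings, and the normalization (false positives over all truly normal processes, false negatives over all truly intrusive ones) are assumptions made for this example; the example labels are invented.

```python
def error_rates(true_labels, predicted_labels):
    """Return (false_positive_rate, false_negative_rate).

    Labels are the strings 'normal' and 'abnormal' (an assumed encoding).
    """
    fp = fn = normals = intrusions = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if truth == "normal":
            normals += 1
            if predicted == "abnormal":
                fp += 1  # normal process flagged as an intrusion
        else:
            intrusions += 1
            if predicted == "normal":
                fn += 1  # intrusive process missed by the classifier
    fp_rate = fp / normals if normals else 0.0
    fn_rate = fn / intrusions if intrusions else 0.0
    return fp_rate, fn_rate


# Hypothetical ground truth and classifier output, for illustration only.
truth = ["normal", "normal", "normal", "normal", "abnormal", "abnormal"]
predicted = ["normal", "abnormal", "normal", "normal", "normal", "abnormal"]

fp_rate, fn_rate = error_rates(truth, predicted)
```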