Table 1. Sample of ordered normal system calls in two processes, 118 and 102,
executed by the user franko on the first day of the first week of the 1998 DARPA
training data set

Process   System calls
-------   ------------------------------------------
118       stat stat stat stat chdir chdir lstat
          stat stat open chdir chdir lstat stat
          stat open pathconf stat stat open chdir
          pathconf stat open chdir pathconf stat
          stat open chdir
102       stat stat stat stat access stat open
          open access stat open open

Learning Process. The DARPA simulated BSM audit data set features 6 users
whose activity can be used to test anomaly detection systems. The users are
named franko, georgeb, janes, fredd, williamf, and donaldh. Their activity
remains consistent from day to day, but on some days they exhibit anomalous
behavior that an anomaly detection system should be able to detect. The
anomalies introduced into the users' sessions include logging in from a
different source, logging in at an unusual time, executing new commands, and
changing identity. In the training data, all anomalies were introduced during
the 6th week.
Of the seven weeks in the training period of the DARPA data set, 6 weeks are
free of anomalous behaviour. We arbitrarily picked 2 weeks (the first and the
second) as the training data set and kept the sixth week for testing. We
recorded only the names of the ordered system calls executed by those 6 users.
User names are usually found in one of two attributes: path or mail. Any
process not related to one of those users is ignored in both data sets,
training and testing. The 2 weeks training data set consists of 17 intrusive
instances and 17 clear or stealthy attacks. There are 7798 sessions within
these 2 weeks. These normal training processes run only on a Solaris machine.
Once we have the training data set for the normal behavior, each process is
transformed into its related continuous and discontinuous patterns.
The proposed algorithm is used to generate all large-sequence L(k,l) patterns
that could be contained within one normal process. All system calls within one
process are considered candidates for the 1-element-sequence-itemset and are
stored in C(1,0). This collection of patterns is used as a normal profile.
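
The candidate step described above can be illustrated with a minimal sketch. It assumes that C(1,0) holds one entry per distinct system call name and that an occurrence count is kept alongside each entry; the function name candidate_1_sequences and the use of counts are illustrative assumptions, not details given in the text.

```python
from collections import Counter

def candidate_1_sequences(calls):
    """Build a hypothetical C(1,0): each distinct system call becomes a
    1-element candidate sequence, mapped to its number of occurrences."""
    # Each candidate is stored as a 1-tuple so longer sequences could
    # later be represented with the same type.
    return Counter((call,) for call in calls)

# Process 102 from Table 1.
process_102 = ("stat stat stat stat access stat open "
               "open access stat open open").split()

c_1_0 = candidate_1_sequences(process_102)
for candidate, count in c_1_0.items():
    print(candidate, count)
```

For process 102 this yields three candidates, one each for stat, access, and open.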
At a given detector window size k, the large-sequence L(1,0) patterns of only
one process were generated in each run. A single process may contain more
elements than the detector window size; in this case, we applied the algorithm
to the first k elements and then moved to the next k elements until we had
covered all the elements in the process.
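
The windowing described above can be sketched as follows. This is an illustration only: the function name split_into_windows is hypothetical, and keeping a final window shorter than k is an assumption, since the text does not say how leftover elements are handled.

```python
from typing import List

def split_into_windows(calls: List[str], k: int) -> List[List[str]]:
    """Split an ordered list of system calls into consecutive,
    non-overlapping windows of at most k elements each."""
    if k <= 0:
        raise ValueError("window size k must be positive")
    # Step through the process k elements at a time; the last slice may
    # be shorter than k (assumed behaviour, see the note above).
    return [calls[i:i + k] for i in range(0, len(calls), k)]

# Process 102 from Table 1 has 12 system calls.
process_102 = ("stat stat stat stat access stat open "
               "open access stat open open").split()

windows = split_into_windows(process_102, 5)
for window in windows:
    print(window)
```

With k = 5, the 12 calls of process 102 fall into two full windows and one final window of two calls.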
We look for all normal processes separately and generate super-large-sequences,
SupL. The resulting normal patterns are stored in a temporary data-