- U2R: unauthorized access to local superuser (root) privileges (e.g., various
"buffer overflow" attacks);
- probing: surveillance and other probing (e.g., "port scanning").
It is important to note that the test data is not from the same probability distribution
as the training data, and it includes specific attack types not in the training data. This
makes the task more realistic. Some intrusion experts believe that most novel attacks
are variants of known attacks and the "signature" of known attacks can be sufficient to
catch novel variants. The datasets contain a total of 24 training attack types, with an
additional 14 types in the test data only.
Two data files from the UCI KDD archive have been used for testing the emulator:
- File 1: kddcup_data_10_percent_gz.htm (7.7 MB);
- File 2: kddcup_newtestdata_10_percent_unlabeled_gz.htm (44 MB).
File 1 is the training data file. It contains 51608 network connection records. Each
record (one line of the file) has the following format, where parameters 2, 3, 4, and 42
are symbolic, while the other 38 parameters are numerical (real values):
1) duration, 2) protocol_type, 3) service, 4) flag, 5) src_bytes,
6) dst_bytes, 7) land, 8) wrong_fragment, 9) urgent, 10) hot,
11) num_failed_logins, 12) logged_in, 13) num_compromised,
14) root_shell, 15) su_attempted, 16) num_root, 17) num_file_creations,
18) num_shells, 19) num_access_files, 20) num_outbound_cmds,
21) is_host_login, 22) is_guest_login, 23) count, 24) srv_count,
25) serror_rate, 26) srv_serror_rate, 27) rerror_rate,
28) srv_rerror_rate, 29) same_srv_rate, 30) diff_srv_rate,
31) srv_diff_host_rate, 32) dst_host_count, 33) dst_host_srv_count,
34) dst_host_same_srv_rate, 35) dst_host_diff_srv_rate,
36) dst_host_same_src_port_rate, 37) dst_host_srv_diff_host_rate,
38) dst_host_serror_rate, 39) dst_host_srv_serror_rate,
40) dst_host_rerror_rate, 41) dst_host_srv_rerror_rate, 42) attack_type.
For example, two records (# 1 and # 745) of File 1 are as follows:
0,tcp,http,SF,181,5450,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,8,8,0.00,0.00,
0.00,0.00,1.00,0.00,0.00,9,9,1.00,0.00,0.11,0.00,0.00,0.00,0.00,0.00,
normal.
184,tcp,telnet,SF,1511,2957,0,0,0,3,0,1,2,1,0,0,1,0,0,0,0,0,1,1,0.00,
0.00,0.00,0.00,1.00,0.00,0.00,1,3,1.00,0.00,1.00,0.67,0.00,0.00,0.00,
0.00, buffer_overflow.
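The record format above can be illustrated with a short parsing sketch. This is not the emulator's own code: the names FIELD_NAMES, SYMBOLIC_FIELDS and parse_record are hypothetical, and the sketch assumes that fields are comma-separated, that a record occupies one line in the file (the examples above are wrapped for display), and that the trailing period after the attack type is punctuation rather than part of the label.

```python
# The 42 parameters, in the order listed in the text.
FIELD_NAMES = [
    "duration", "protocol_type", "service", "flag", "src_bytes",
    "dst_bytes", "land", "wrong_fragment", "urgent", "hot",
    "num_failed_logins", "logged_in", "num_compromised",
    "root_shell", "su_attempted", "num_root", "num_file_creations",
    "num_shells", "num_access_files", "num_outbound_cmds",
    "is_host_login", "is_guest_login", "count", "srv_count",
    "serror_rate", "srv_serror_rate", "rerror_rate",
    "srv_rerror_rate", "same_srv_rate", "diff_srv_rate",
    "srv_diff_host_rate", "dst_host_count", "dst_host_srv_count",
    "dst_host_same_srv_rate", "dst_host_diff_srv_rate",
    "dst_host_same_src_port_rate", "dst_host_srv_diff_host_rate",
    "dst_host_serror_rate", "dst_host_srv_serror_rate",
    "dst_host_rerror_rate", "dst_host_srv_rerror_rate", "attack_type",
]

# Parameters 2, 3, 4 and 42 are symbolic; the other 38 are numerical.
SYMBOLIC_FIELDS = {"protocol_type", "service", "flag", "attack_type"}


def parse_record(line):
    """Split one record into a dict keyed by parameter name."""
    parts = [p.strip() for p in line.strip().split(",")]
    if len(parts) != len(FIELD_NAMES):
        raise ValueError(f"expected {len(FIELD_NAMES)} fields, got {len(parts)}")
    record = {}
    for name, raw in zip(FIELD_NAMES, parts):
        if name == "attack_type":
            # Assumption: the trailing '.' is not part of the label.
            record[name] = raw.rstrip(".")
        elif name in SYMBOLIC_FIELDS:
            record[name] = raw
        else:
            record[name] = float(raw)
    return record


# Record # 1 of File 1, joined onto one line.
RECORD_1 = (
    "0,tcp,http,SF,181,5450,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,8,8,0.00,0.00,"
    "0.00,0.00,1.00,0.00,0.00,9,9,1.00,0.00,0.11,0.00,0.00,0.00,0.00,0.00,"
    "normal."
)

record = parse_record(RECORD_1)
print(record["protocol_type"], record["src_bytes"], record["attack_type"])
```

Storing every numerical parameter as a float is a simplification; some of them (for example land or logged_in) only take the values 0 and 1 in the examples above.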
File 1.1 has also been prepared with the same 51608 records in the same format,
but without the last parameter, 42) attack_type.
File 2 contains 311079 records of the same format as in File 1.1.
File 1.1 and File 2 are the test data files.
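How File 1.1 could be derived from File 1 can be sketched as follows. The text does not describe the actual procedure, so this is only one plausible way to do it: the helper name strip_label is hypothetical, and the sketch assumes comma-separated records with attack_type as the 42nd and last field.

```python
def strip_label(line):
    """Return the record without its last field (42, attack_type)."""
    fields = [f.strip() for f in line.strip().split(",")]
    if len(fields) != 42:
        raise ValueError(f"expected 42 fields, got {len(fields)}")
    # Keep the 41 parameters that the test data files contain.
    return ",".join(fields[:-1])


# Record # 745 of File 1, joined onto one line.
RECORD_745 = (
    "184,tcp,telnet,SF,1511,2957,0,0,0,3,0,1,2,1,0,0,1,0,0,0,0,0,1,1,0.00,"
    "0.00,0.00,0.00,1.00,0.00,0.00,1,3,1.00,0.00,1.00,0.67,0.00,0.00,0.00,"
    "0.00, buffer_overflow."
)

unlabeled = strip_label(RECORD_745)
print(len(unlabeled.split(",")))
```

Applied line by line to File 1, a helper of this kind yields records with 41 parameters, the same layout the text gives for File 2.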
Note that the KDD archive does not indicate the correct attack type for any of the
records of File 2. The only available information on possible attacks is gathered in
Tab. 1 (column 'Code' is the emulator's code of attack). Nevertheless, we have used
File 2 to test whether the emulator is able to detect unknown intrusions that were not
present in the training data of File 1.