2. Search for execution chains: This phase analyzes the data after the NOP
zone by using a recursive function capable of following different execution
chains in disassembled code. Whenever a control-flow instruction is detected,
the function extracts the destination address and continues disassembling at
this address. Depending on the instruction, the function also follows the code
directly after the instruction. For a similar approach we refer to [19].
3. Neural network classification: Whenever a termination criterion is met
(see [13] for details), the recursive function stops following the code and
starts neural network classification.
The input for the neural network is the spectrum of the instructions
encountered along an execution path. (Here and in the course of this paper, by
spectrum we mean a representation of the relative frequencies.) If the output
of the neural network is larger than zero, a possible shellcode is reported.
The features of the neural network were chosen by investigating the
instructions used by the available polymorphic shellcode engines. These
instructions were then used to create groups of similar instructions. Further
instructions from the x86 set were then added to the groups. The groups
are numbered and represent the features/inputs for the neural network. A
complete list can be found in [13].
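The two phases above can be illustrated with a minimal sketch. It is not the HDE implementation: the toy program table, the instruction groups, the termination criteria and all function names are assumptions made for illustration (the real criteria and feature list are given in [13]), and a real system would disassemble raw bytes instead of reading a prepared table.

```python
from collections import Counter

# Hypothetical toy program: address -> (mnemonic, kind, target).
# kind is "seq" (ordinary), "jmp" (unconditional jump),
# "jcc" (conditional jump) or "ret" (ends the chain).
PROGRAM = {
    0: ("xor", "seq", None),
    1: ("jcc", "jcc", 4),
    2: ("add", "seq", None),
    3: ("ret", "ret", None),
    4: ("xor", "seq", None),
    5: ("jmp", "jmp", 2),
}

# Hypothetical instruction groups standing in for the feature list in [13].
GROUPS = {"xor": 0, "add": 1, "jcc": 2, "jmp": 2, "ret": 3}
NUM_GROUPS = 4


def follow_chains(program, addr, path=(), visited=frozenset(), max_len=32):
    """Yield every execution path (tuple of mnemonics) reachable from addr."""
    # Assumed termination criteria: leaving the program, revisiting an
    # address on the current path, or reaching the length limit.
    if addr not in program or addr in visited or len(path) >= max_len:
        yield path
        return
    mnemonic, kind, target = program[addr]
    path = path + (mnemonic,)
    visited = visited | {addr}
    if kind == "ret":
        yield path
        return
    if kind in ("jmp", "jcc"):
        # Control-flow instruction: continue at the destination address.
        yield from follow_chains(program, target, path, visited, max_len)
    if kind in ("seq", "jcc"):
        # Also follow the code directly after the instruction.
        yield from follow_chains(program, addr + 1, path, visited, max_len)


def spectrum(path):
    """Relative frequency of each instruction group along one path."""
    counts = Counter(GROUPS[m] for m in path)
    total = len(path)
    return [(counts[g] / total) if total else 0.0 for g in range(NUM_GROUPS)]


def report_candidates(program, start, score):
    """Return the paths whose classifier output is larger than zero."""
    return [p for p in follow_chains(program, start) if score(spectrum(p)) > 0]
```

Here `score` is a placeholder for the trained neural network; any callable that maps a spectrum to a real number can be passed in, for example `report_candidates(PROGRAM, 0, lambda s: s[0] - 0.3)`.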
Results:
HDE was evaluated with six shellcode engines. Three publicly available
engines can be used to generate polymorphic shellcodes: ADMMutate [7],
CLET [4] and JempiScodes [17]. With the knowledge gained from
investigating these engines, we also considered alternative methods to
generate polymorphism. As a result, we developed three independent shellcode
engines which are based on different concepts.
In what follows, we will call these engines EE1, EE2 and EE3 (Experimental
Engine). The purpose of these engines was to improve our detection mechanism
by experimenting with concepts that could possibly evade HDE. EE1 is based
on inserting junk instructions and XOR encryption. Such a mechanism was also
proposed by the authors of [4]. EE2 uses the Tiny Encryption Algorithm (TEA)
to encrypt the payload. EE3 applies random chains of simple instructions
to the payload in order to transform it. The inverted instruction
chain serves simultaneously as the decryption engine and the key.
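The idea behind EE3, that the inverted chain is both decoder and key, can be made concrete with a small sketch on plain byte strings. The choice of operations (XOR, addition, rotation), the helper names and the byte-wise application are assumptions for illustration; the source does not specify which instructions EE3 actually uses.

```python
import random

MASK = 0xFF  # all operations act on single bytes


def rol(b, n):
    """Rotate a byte left by n bits (0 <= n < 8)."""
    return ((b << n) | (b >> (8 - n))) & MASK


def ror(b, n):
    """Rotate a byte right by n bits (0 <= n < 8)."""
    return ((b >> n) | (b << (8 - n))) & MASK


# Hypothetical set of simple invertible operations: name -> (forward, inverse).
OPS = {
    "xor": (lambda b, k: b ^ k, lambda b, k: b ^ k),
    "add": (lambda b, k: (b + k) & MASK, lambda b, k: (b - k) & MASK),
    "rol": (lambda b, k: rol(b, k % 8), lambda b, k: ror(b, k % 8)),
}


def random_chain(length, rng):
    """Draw a random chain of (operation name, operand) pairs."""
    return [(rng.choice(sorted(OPS)), rng.randrange(1, 256)) for _ in range(length)]


def encode(data, chain):
    """Apply every operation of the chain, in order, to each byte."""
    out = bytes(data)
    for name, k in chain:
        forward = OPS[name][0]
        out = bytes(forward(b, k) for b in out)
    return out


def decode(data, chain):
    """Apply the inverse operations in reverse order to undo encode()."""
    out = bytes(data)
    for name, k in reversed(chain):
        inverse = OPS[name][1]
        out = bytes(inverse(b, k) for b in out)
    return out
```

Because every operation is invertible, `decode(encode(data, chain), chain)` returns the original bytes for any chain. No separate key is stored: the chain alone determines the inverse transformation, which is what makes each generated instance look different to a detector.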
HDE was evaluated by training six neural networks (one for each
polymorphic shellcode engine) and applying them to test data provided by the
six engines and to real data known to be free of shellcodes. The results can be
seen in Table 1. To increase the detection accuracy for unknown engines, a new
network was trained with positive training data used for the two best neural
networks (ADMMutate and EE3) 2. In general, evaluation shows that HDE is
able to detect engines not available during the training process.
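The cross-evaluation described above, in which each trained network is applied to the test data of every engine, can be sketched as a small helper that builds a detection-rate matrix. The function names, the dictionary layout and the toy values in the usage note are assumptions for illustration only; they are not the paper's data or results.

```python
def detection_matrix(detectors, test_sets):
    """Fraction of samples flagged by each detector on each test set.

    detectors: name -> callable(sample) returning a real-valued output
    test_sets: name -> list of samples
    A sample counts as detected when the output is larger than zero.
    For a test set that is free of shellcodes, the fraction is the
    false-positive rate.
    """
    matrix = {}
    for det_name, detect in detectors.items():
        row = {}
        for set_name, samples in test_sets.items():
            if not samples:
                row[set_name] = None  # no data, no rate
                continue
            hits = sum(1 for s in samples if detect(s) > 0)
            row[set_name] = hits / len(samples)
        matrix[det_name] = row
    return matrix


def combine_positive(training_sets, names):
    """Merge the positive training data of several engines into one list."""
    merged = []
    for name in names:
        merged.extend(training_sets[name])
    return merged
```

With a made-up detector `lambda s: s - 0.5` and the made-up samples `[1.0, 0.0, 1.0, 1.0]`, the helper reports a detection rate of 0.75. A combined training set such as the one used for the additional network would be obtained with `combine_positive(training_sets, ["ADMMutate", "EE3"])`, assuming `training_sets` is keyed by engine name.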
2.2 Self-organizing Maps
Since we already applied the theory of Self-Organizing Maps in the context of
traffic classification (cf. [14]), we also wanted to evaluate their performance
in anomaly detection. For the theory of SOMs, we refer to [8].