Information Technology Reference
In-Depth Information
3.3
Generation of Finite Automata
A finite automaton modeling normal behavior of a program is constructed from the
result of the multiple sequence alignment. First, any symbol s -either a system call or
a macro- that repeats many times in consecutive positions is replaced by s+ and con-
verted into a state with the self-loop. Then the finite automaton is constructed by
following the rule that the same symbols at the same position in different sequences
lead the automaton to enter the same state. The gap symbol is treated as the same way
as the other regular symbols. Figure 7 shows the finite automaton constructed from
the result of multiple sequence alignment in Figure 6.
Fig. 7. Finite automaton constructed from Fig. 6
Once a finite automaton is constructed, the system call sequences generated by
execution of programs are monitored. Basically, a sequence is considered normal if it
is accepted by the automaton, and abnormal if rejected. But such a clear-cut criterion
is too strict to tell normal behavior from abnormal. Hence we adopted the following
scoring scheme. Every time the current symbol in the sequence causes a legitimate
move from the current state, the sequence gets a certain positive point. Every time the
automaton makes a gap-move, a certain amount of points is subtracted. If the automa-
ton cannot make a move with the current symbol, it skips the current symbol and
some points are subtracted. If the accumulated points at the time when the automaton
reaches its final state is over a predefined threshold, the sequence is considered to be
normal. Otherwise it is considered as abnormal.
For example, assume that a legitimate move is worth 3 points, a gap-move is worth
-1 point, skipping a symbol is worth -3 points, and the threshold is 0. Then the se-
quence ATCCTCATT gets +3+3+3-1+3-3+3+3+3+3-1-1 = 18 points by the automa-
ton in Figure 4, and it is considered as normal.
From the way the score of a sequence is computed, it can easily be observed that
the finite automaton does not have to be actually built. The result of the multiple
alignment serves as the finite automaton. Hence there are not needs for the extra space
for storing the finite automaton. The result of multiple sequence alignment will do.
4
Experiments
We implemented our system on a Pentium 3 500 MHz processor with Windows 2000
server operating system in Microsoft C# language. We experimented with the system
Search WWH ::




Custom Search