Detecting Malicious Codes by the Presence of Their “Gene of Self-replication” - Computer Network Security

This approach presents an attempt to go beyond “sample matching.” The intention

is to concentrate efforts on the detection of the one generic feature of all computer

viruses, the “gene of self-replication,” which is typically present in computer viruses

and virtually unknown in legitimate software. This task will be performed not on

binary sequences, but on sequences of instructions that can be understood as letters in

an alphabet (the number of letters in the alphabet is very close to the

total number of instructions), with the understanding that although self-replication can

be achieved in a number of ways, this number is finite and is believed not to exceed

50. Therefore, the search for the “gene of self-replication” can be understood as the

search for particular words on the array of letters, almost like a crossword puzzle with

the following peculiarities.

First, instructions form multiple strings with a well-defined order of execution.

This feature simplifies the task by eliminating concern over the position of particular

words (strings of interest): all words are positioned along the string and should be

read from left to right, in the order of execution.

Second, the string of instructions, for example, forming the word “replication” that

represents a particular self-replication procedure does not have to be continuous. In

the process of execution, the self-replication task can be temporarily interrupted to

perform malicious or auxiliary subtasks, for example a display of offensive messages.

This makes the search more difficult. It requires the search to expand from finding the

word “replication” as a continuous string of letters to searching for a letter “r” that is

eventually followed by the letter “e”, which is eventually followed by the letter “p”, etc.

Fortunately, there are some decryption and deciphering techniques that could be utilized for

this problem.
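
The relaxed search described above, where each letter of a word only has to be eventually followed by the next one, amounts to a subsequence test over the executed instruction stream. The following is a minimal sketch under stated assumptions, not the detection procedure itself: the trace is assumed to be a list of instruction names in execution order, and the function name `contains_in_order` and the instruction names in `REPLICATION` are hypothetical and invented for illustration.

```python
from typing import Iterable, Sequence


def contains_in_order(trace: Iterable[str], pattern: Sequence[str]) -> bool:
    """Return True if every item of `pattern` occurs in `trace` in order.

    Unrelated items may appear between matches, so this is a
    subsequence test rather than a contiguous substring search.
    """
    it = iter(trace)
    # `step in it` consumes the iterator up to and including the first
    # match, so each later step is only searched for after the earlier one.
    return all(step in it for step in pattern)


# Hypothetical "word" describing one self-replication procedure.
REPLICATION = ["open_self", "read_self", "open_target", "write_target"]

# Hypothetical executed trace, interrupted by auxiliary subtasks.
trace = [
    "open_self", "show_message", "read_self",
    "open_target", "sleep", "write_target",
]

found = contains_in_order(trace, REPLICATION)
print(found)
```

Because the iterator is consumed as the match proceeds, the check runs in a single left-to-right pass, which mirrors reading the words in the order of execution.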

Third, malicious code can arrive partially encoded and decode itself prior to execution,

which presents a serious challenge for any virus detection method. This difficulty,

however, could be addressed through periodic interruption of the execution of

the code in question and analysis of the composition of the executable image. Another

approach implies the monitoring and analysis of the sequence of macro commands

presented for execution.

Although there are questions about the feasibility of detecting a computer virus by

subjecting its code to a cryptographic analysis, this approach concentrates on a very

narrow task: the detection of a particular feature of a malicious code, its “gene of

self-replication.” In addition, the proposed detection procedure will analyze not a “static”

file containing the code in question, but the sequence of executable instructions that

evolves during the execution of the code. Finally, a probabilistic approach resulting in

the computation of the conditional probability of maliciousness subject to particular

features discovered in the executable code can be utilized. Indeed, while according to

[2] sufficient conditions for the detection of computer viruses may not exist in the

mathematical sense, this approach is aimed at establishing the necessary conditions and

then utilizing these conditions for the development of instrumental, general-purpose,

anti-virus software capable of detecting new, previously unknown, computer viruses.
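
As a rough illustration of the probabilistic step mentioned above, the conditional probability of maliciousness given one observed feature can be computed with Bayes' rule. The sketch below is an assumption about how such a computation might look, not a procedure taken from the source; the function name `posterior_malicious` and every numeric value are hypothetical.

```python
def posterior_malicious(prior: float,
                        p_feature_given_malicious: float,
                        p_feature_given_benign: float) -> float:
    """Return P(malicious | feature) for a single observed feature."""
    # Total probability of observing the feature in any code.
    evidence = (p_feature_given_malicious * prior
                + p_feature_given_benign * (1.0 - prior))
    if evidence == 0.0:
        raise ValueError("feature has zero probability under both hypotheses")
    return p_feature_given_malicious * prior / evidence


# Hypothetical numbers chosen only for illustration: the feature is
# common in malicious code and very rare in legitimate software.
p = posterior_malicious(prior=0.01,
                        p_feature_given_malicious=0.9,
                        p_feature_given_benign=0.001)
print(f"{p:.3f}")
```

The result depends strongly on how rare the feature is in legitimate software, which is why a feature that is virtually unknown there is attractive as evidence.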

First, several typical sequences of instructions that implement the task of

self-replication will need to be established. These sequences will constitute the set of

“words” or “patterns” that would provide evidence that the code may be a computer

virus. Constructing a number of alternative self-replication procedures and subjecting

them to special analyses/parsing in order to detect their generic semantic features will

accomplish this task. At the same time, known computer viruses will be subjected to

analyses aimed at the detection of their “gene of self-replication” and will attempt to
