Static processing and dynamic tracing are crucial components for supporting
cross-platform analysis at both the instruction set architecture (ISA) level and the
operating system (OS) level. ISAs often differ significantly in the encoding and semantics
of their instructions. Operating systems often differ in how they use registers to represent
high-level data structures. For example, Windows and Linux use the fs and gs segment
registers for very different purposes. In our system, however, these differences are mostly
removed by the use of a common IR. In the front-end, only a thin layer needs to deal with
the remaining subtle differences. In the back-end, all core analysis algorithms are based on
the common IR.
We shall use the program called basicov plus.exe in Fig. 2 as the running example.
It reads its input data from a file and adds each input byte, except for the last two,
to its right neighboring byte. If the first byte is 'b', the transformed bytes are fed to
a vulnerable function called StackOverflow. The function is vulnerable in that, if the
input is larger than a local buffer inside the function, a buffer overflow occurs and
overwrites the return address. Although the program is small, it contains
all the important elements of a typical security vulnerability: the potentially tainted
data source (input), the transformation (addition), the trigger (path condition), and the
anomaly manifestation (buffer overflow). In practice, of course, each of these elements
can be significantly more complex. For example, the transformation itself may involve
not just one instruction but millions of instructions.
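To make the four elements concrete, the following is a minimal Python model of the behavior described above. It is a sketch under stated assumptions, not the program of Fig. 2: the buffer size of 16, the names transform and would_overflow, the wrap-around to 8 bits on addition, and the choice to test the first byte before the transformation are all assumptions that the text does not confirm. The overflow is only modeled as a length check, so no memory is corrupted.

```python
# Hypothetical size of the local buffer; the source text does not state it.
BUFFER_SIZE = 16


def transform(data: bytes) -> bytes:
    """Transformation: add each byte, except the last two, to its right neighbor.

    Sums are assumed to wrap around to 8 bits, as a byte-wide add would.
    """
    out = bytearray(data)
    for i in range(max(len(data) - 2, 0)):
        out[i] = (data[i] + data[i + 1]) & 0xFF
    return bytes(out)


def would_overflow(data: bytes, buffer_size: int = BUFFER_SIZE) -> bool:
    """Model the trigger (path condition) and the anomaly (overflow)."""
    # Trigger: the vulnerable function is reached only if the first byte is 'b'.
    # Assumption: the check is made on the untransformed input.
    if not data or data[0] != ord("b"):
        return False
    # Anomaly: copying more bytes than the local buffer holds would overflow it.
    return len(transform(data)) > buffer_size
```

Under these assumptions, an input of 17 bytes starting with 'b' reaches the modeled overflow, while the same input starting with any other byte does not.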
Fig. 2. Example: A Conditional Buffer Overflow Program
Static Processing. There are two main components for static processing. One component
is responsible for pre-processing the binary code statically and building a map
from each native instruction to a set of IR instructions. The other component consists of
a set of simple static analyses on the resulting IR, e.g. to identify interesting locations
that are potential targets of the subsequent dynamic analysis.
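The map built by the first component can be pictured with a small Python sketch. It assumes that each IR address is the native address followed by a two-hex-digit index, which is consistent with the address range this section quotes for the instruction at 0x00401073. The function names and the placeholder IR strings are hypothetical; the actual translations are the ones listed in Table 1.

```python
def ir_address(native_address: int, index: int) -> int:
    """Form an imaginary IR address: native address followed by a one-byte index."""
    if not 0 <= index <= 0xFF:
        raise ValueError("index must fit in two hex digits")
    return (native_address << 8) | index


def build_ir_map(translations: dict) -> dict:
    """Map each native address to a list of (IR address, IR instruction) pairs.

    `translations` maps a native address to the ordered IR instructions that a
    translator produced for it; producing them is outside this sketch.
    """
    return {
        native: [(ir_address(native, i), text) for i, text in enumerate(ir_list)]
        for native, ir_list in translations.items()
    }


# Placeholder IR text only: seven entries, matching the quoted address range.
example = build_ir_map({0x00401073: [f"<ir instruction {i}>" for i in range(7)]})
first_addr = example[0x00401073][0][0]
last_addr = example[0x00401073][-1][0]
print(hex(first_addr), hex(last_addr))
```

Keeping the native address in the upper digits of every IR address means an analysis result on the IR can be traced back to the native instruction with a single shift.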
Table 1 shows the mapping from a few instructions used by the program in Fig. 2
to the IR instructions. In this table, the native x86 instructions are shown in the first
column. The corresponding IR translations are shown in the second column. For example,
the native x86 instruction at the address 0x00401073 is mapped to the sequence of
REIL instructions from the imaginary address 0x0040107300 to the imaginary address
0x0040107306. We postpone our detailed presentation of the IR format, called REIL
 