Information Technology Reference
In-Depth Information
a one-to-one mapping of the native instruction address to the imaginary IR address.
For example, in Table 1, the x86 instruction movsx edx, byte ss:[ebp-10] at address
0x401073 will always be mapped to a list of REIL instructions from 0040107300 to
0040107305 . Therefore, it is easy to map the analysis results back to the native forms
before reporting them to the user.
REIL has a simple register-based architecture, which can keep native registers and
create temporary registers when needed. Preserving native registers is particularly use-
ful for implementing the offline concrete and symbolic (or concolic) execution. Recall
that in concolic execution, the program state has to be saved during trace generation
and later reconstructed during the offline analysis. At runtime, our trace generator will
only save the native program state (related native registers and global memory). During
the offline analysis, we can compute the IR program state directly from these native
registers and the memory.
In all of the seventeen REIL instructions, the destination operand can be represented
by a mathematical or logical formula of the source operands. Consider the second native
instruction 00401077 cmp edx, 0x62 in Table 1. Notice that the REIL instructions use
a few basic mathematical and logical operations to precisely compute all the eflags ;
in other words, all the eflags can be represented as an expression in terms of edx
and 0x62 . For example, ZF = (edx 98) and 0xffffffff .Insomesense,REIL
instructions are compatible with the input language of the satisfiability modulo theory
(SMT) solver Z3 [18], which supports the theories of integers, bit-vectors, and arrays.
3.2
Symbolic Execution
The symbolic execution procedure consists of three steps:
1. Mark taint source and symbolize its value. Here, taint sources refer to the untrusted
data in the target program. When a program variable is marked as a taint source, our
tool symbolizes the variable, by replacing its concrete value with a symbolic one (a
free variable). Traditionally, the taint sources are program inputs. However, during
interactive security analysis, the user may be interested in tracking other program
variables as well. For example, some sensitive data items such as the password
and the registry key may become the focus of the analysis. At any time during the
program execution, CBASS can mark any byte in any register or at any memory
location as the taint source.
2. Symbolic execution of REIL instructions. CBASS implements the symbolic execu-
tion engine based on the REIL IR. As we have already mentioned, the semantics
of REIL instructions can be close to that of the input language of the SMT solvers.
Therefore, the symbolic encoding procedure, which takes an IR trace as input and
returns an SMT formula, is straightforward. In our implementation of the proposed
system, we have used the Z3 SMT solver, which is capable of solving formulas
expressed in the theories of bit-vectors and arrays.
3. Check taint sink to construct constraint. Depending on the application, security
analysts may mark different memory location or register at some interesting point
as the taint sinks. For example, to generate potential exploits, the taint sinks are
usually registers such as EIP . We may create a constraint to steer the execution into
a desired code section and make EIP equals to the address of that code section.
 
Search WWH ::




Custom Search