Dynamic Analysis and Debugging of Binary Code for Security Applications - Runtime Verification


non-x86 ISAs such as ARM, PowerPC, and MIPS, but runs only on Linux. None of them provides kernel-mode instrumentation. Whole-system emulators can provide kernel instrumentation, but often through an additional instrumentation layer that is not portable to new versions. For example, tools built on the QEMU simulator, such as TEMU [6], DroidScope [15], and S2E [14], have different instrumentation layers. In each case, the implementation is tied to the specific microcode used by QEMU, making it difficult to port. Therefore, although it is well known that the Android emulator builds upon a customized version of QEMU, porting the aforementioned tools to Android is challenging.

In contrast, we propose to use the debug breakpoint mechanism [7] for dynamic tracing. This mechanism, already used by interactive debuggers such as gdb, is supported by almost all processors and operating systems. Therefore, it provides a unified approach for collecting execution traces from different platforms. It can collect traces in kernel mode. It can also collect traces on real devices such as Cisco routers and Android phones, since almost all of these devices have development tools that provide the breakpoint capability. This debug breakpoint approach has significant advantages over DBI tools. Running inside the target process, DBI tools often disturb the behavior of the target program, e.g., by affecting the target's stack and heap layout. This is a serious problem because interesting scenarios in security applications tend to manifest only in certain program states.

Our experience shows that breakpoint-based tracing is effective for short and interactive analysis. To support long traces, our system leverages existing DBI tools and whole-system emulators for trace generation, e.g., a PIN plug-in for Windows/Linux x86. We have implemented a heuristic algorithm to automatically switch between these techniques, in order to use the best instruction tracer available in each individual application scenario.

Trace Format. The execution trace starts with a snapshot of the program state, which consists of the module, thread, stack, and heap information. The program state is a valuation of the set R of registers for all threads, including privileged registers for kernel mode, and a global memory map M. Therefore, we have the program state represented as PS = {R, M}.

A tracer on a particular platform records the finite sequence of events starting from the initial state. An event is an execution instance of an instruction that transforms the program state PS into a new program state PS′. Each event in the trace has a unique sequence number. The vast majority of events in a trace are of the form I = {instInfo, threadID, relevantRegisters, memoryAccess}, where instInfo contains the address of the instruction, the encoding bytes, and the size; threadID is the index of the thread that executes this instruction; and relevantRegisters and memoryAccess contain the values of the related registers and memory elements before this instruction is executed.
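As an illustration only, the program state and event record described above could be modeled as follows. This is a minimal sketch, not the authors' implementation: the class names (ProgramState, InstInfo, Event, Trace), the field layout, and the sample register values are all assumptions introduced here, and the encoding bytes assume the displacement in the sample instruction is decimal 10.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ProgramState:
    """PS = {R, M}: register values for every thread plus a global memory map."""
    registers: Dict[int, Dict[str, int]] = field(default_factory=dict)  # threadID -> {register: value}
    memory: Dict[int, int] = field(default_factory=dict)                # address -> byte value


@dataclass
class InstInfo:
    """Address and encoding bytes of one instruction; the size is derived."""
    address: int
    encoding: bytes

    @property
    def size(self) -> int:
        return len(self.encoding)


@dataclass
class Event:
    """One executed instruction, with the values observed before it ran."""
    seq: int                            # unique sequence number within the trace
    inst_info: InstInfo
    thread_id: int
    relevant_registers: Dict[str, int]  # register name -> value before execution
    memory_access: Dict[int, int]       # address -> byte value before execution


@dataclass
class Trace:
    """Initial snapshot followed by a finite sequence of events."""
    initial_state: ProgramState
    events: List[Event] = field(default_factory=list)

    def record(self, inst_info: InstInfo, thread_id: int,
               relevant_registers: Dict[str, int],
               memory_access: Dict[int, int]) -> Event:
        # The sequence number is the event's position in the trace.
        event = Event(len(self.events), inst_info, thread_id,
                      dict(relevant_registers), dict(memory_access))
        self.events.append(event)
        return event


# Hypothetical usage: one event for "movsx edx, byte ss:[ebp-10]".
# The addresses and register values below are made up for the example.
trace = Trace(ProgramState())
event = trace.record(
    InstInfo(address=0x401000, encoding=bytes.fromhex("0fbe55f6")),
    thread_id=1,
    relevant_registers={"edx": 0x0, "ebp": 0x12FF40},
    memory_access={0x12FF40 - 10: 0x80},
)
```

Deriving the size from the encoding bytes keeps the two from disagreeing; a real tracer might store the size explicitly if the bytes are not always captured.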

The trace can be optimized to reduce its size while preserving the information required by the subsequent analysis. In our implementation, we record only the information that is relevant to that analysis. For example, for the instruction movsx edx, byte ss:[ebp-10], our trace includes the values of registers edx and ebp. For user-mode analysis, we capture the precondition and postcondition of each system call or call to a standard library function as a function summary, to avoid recording the large number of instructions inside the function. For example, after a call to ReadFile, we record the address of the input buffer, the input size, and the content of the buffer.
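The function-summary idea can be sketched in the same spirit. The FunctionSummary class and the summarize_read_file helper below are hypothetical names introduced for illustration; the sketch assumes the tracer already knows the buffer address and the number of bytes read, and that memory is available as a simple address-to-byte map. It is not the authors' format.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class FunctionSummary:
    """Replaces the instructions inside a call with its observed effect."""
    name: str
    buffer_address: int  # address of the input buffer
    size: int            # number of bytes of input
    content: bytes       # buffer content after the call returns


def summarize_read_file(buffer_address: int, size: int,
                        memory: Dict[int, int]) -> FunctionSummary:
    # Copy the buffer out of the memory map; unmapped bytes default to 0
    # (a simplification made for this sketch).
    content = bytes(memory.get(buffer_address + i, 0) for i in range(size))
    return FunctionSummary("ReadFile", buffer_address, size, content)
```

A single summary record of this kind stands in for every instruction executed inside the library call, which is where the size reduction comes from.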
