DSP Instruction Set Simulation - Signal Processing Systems

Digital Signal Processing Reference


Compiled Simulation

Mills et al. were the first to use compiled simulation [45]. They presented a simple and efficient method to translate medium-sized applications into a fast compiled simulator. Their technique inspired all later work.

Reshadi et al. compile only the instruction decoding part. This retains the flexibility of an interpretive simulator while reaching a simulation speed close to that of compiled simulation [58].
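
The idea of paying the decoding cost only once can be illustrated with a minimal sketch. It is not the implementation of Reshadi et al.: the 16-bit toy instruction format, the opcode numbers, and the names `State`, `decode`, and `run` are assumptions made for illustration, and Python closures stand in for the compiled decoding results. Each instruction word is decoded a single time into a handler, and the main loop then dispatches these handlers interpretively.

```python
from typing import Callable, List

# Hypothetical toy ISA: 16-bit words, opcode in bits 15..12, rd in bits 11..8,
# and an 8-bit operand (immediate or source register) in bits 7..0.
OP_LOADI, OP_ADD, OP_HALT = 0, 1, 2


class State:
    def __init__(self) -> None:
        self.regs = [0] * 16
        self.pc = 0
        self.halted = False


def decode(word: int) -> Callable[[State], None]:
    """Decode one instruction word into a ready-to-run handler (done once)."""
    opcode = (word >> 12) & 0xF
    rd = (word >> 8) & 0xF
    operand = word & 0xFF

    if opcode == OP_LOADI:
        def handler(s: State) -> None:
            s.regs[rd] = operand
            s.pc += 1
    elif opcode == OP_ADD:
        rs = operand & 0xF
        def handler(s: State) -> None:
            s.regs[rd] = (s.regs[rd] + s.regs[rs]) & 0xFFFF
            s.pc += 1
    elif opcode == OP_HALT:
        def handler(s: State) -> None:
            s.halted = True
    else:
        raise ValueError(f"unknown opcode {opcode}")
    return handler


def run(program: List[int]) -> State:
    decoded = [decode(w) for w in program]  # decoding cost is paid once, up front
    s = State()
    while not s.halted:
        decoded[s.pc](s)                    # execution itself stays interpretive
    return s


program = [
    0x0105,  # LOADI r1, 5
    0x0207,  # LOADI r2, 7
    0x1102,  # ADD   r1, r2
    0x2000,  # HALT
]
final = run(program)
```

Because the main loop is still an ordinary dispatch loop, the program image can be changed and re-decoded without rebuilding the simulator, which is the flexibility argument made above.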

SyntSim is a generator for functional simulators [11]. Guided by profile information, it compiles selected parts of the executable. The resulting simulation runs between 2 and 16 times slower than native code.

Errico et al. generate interpretive simulators as well as static and dynamic compiled simulators from a simulator specification [25]. The dynamic simulator generates C code at run time, which is compiled by GCC and dynamically loaded. Aligned code pages are translated identically by the static and the dynamic simulator.
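
A rough sketch of this arrangement, under loudly stated assumptions: instead of emitting C and invoking GCC, the example emits Python source text and loads it with the built-in `compile` and `exec`, which keeps it self-contained. The toy instruction set, the page size of four instructions, and the names `translate_page` and `run` are hypothetical and not taken from the cited work. The point it illustrates is that one page translator can serve both modes: the static mode translates every page before execution, the dynamic mode translates a page on first entry.

```python
from typing import Callable, Dict, List, Tuple

PAGE_SIZE = 4                  # instructions per aligned page (tiny, for illustration)
Instr = Tuple[str, int, int]   # hypothetical toy ISA: (mnemonic, a, b)


def translate_page(program: List[Instr], page: int) -> Callable[[List[int], int], int]:
    """Emit source text for one aligned page, compile it, return its entry point.

    The generated function takes (regs, pc) and returns the next pc outside
    the page, or -1 once the program halts.
    """
    start = page * PAGE_SIZE
    lines = ["def page_fn(regs, pc):", "    while True:"]
    kw = "if"
    for addr in range(start, min(start + PAGE_SIZE, len(program))):
        op, a, b = program[addr]
        lines.append(f"        {kw} pc == {addr}:")
        if op == "loadi":      # regs[a] = b
            lines.append(f"            regs[{a}] = {b}; pc = {addr + 1}")
        elif op == "add":      # regs[a] += regs[b]
            lines.append(f"            regs[{a}] += regs[{b}]; pc = {addr + 1}")
        elif op == "djnz":     # decrement regs[a], jump to b if non-zero
            lines.append(f"            regs[{a}] -= 1")
            lines.append(f"            pc = {b} if regs[{a}] != 0 else {addr + 1}")
        elif op == "halt":
            lines.append("            return -1")
        else:
            raise ValueError(op)
        kw = "elif"
    lines.append("        else:")
    lines.append("            return pc   # control left this page")
    namespace: dict = {}
    exec(compile("\n".join(lines), f"<page {page}>", "exec"), namespace)
    return namespace["page_fn"]


def run(program: List[Instr], dynamic: bool) -> Tuple[List[int], int]:
    """Run to completion; return (registers, number of pages translated)."""
    n_pages = (len(program) + PAGE_SIZE - 1) // PAGE_SIZE
    pages: Dict[int, Callable[[List[int], int], int]] = {}
    if not dynamic:            # static: translate everything before running
        for p in range(n_pages):
            pages[p] = translate_page(program, p)
    regs, pc = [0] * 4, 0
    while pc != -1:
        p = pc // PAGE_SIZE
        if p not in pages:     # dynamic: translate a page on first entry
            pages[p] = translate_page(program, p)
        pc = pages[p](regs, pc)
    return regs, len(pages)


program = [
    ("loadi", 0, 0), ("loadi", 1, 3), ("loadi", 2, 5), ("add", 0, 2),  # page 0
    ("djnz", 1, 3), ("halt", 0, 0), ("halt", 0, 0), ("halt", 0, 0),    # page 1
    ("loadi", 0, 99),                                                   # page 2, never reached
]
static_regs, static_pages = run(program, dynamic=False)
dynamic_regs, dynamic_pages = run(program, dynamic=True)
```

Both modes compute the same register contents. In this toy program the dynamic mode translates one page fewer, because the last page is never entered.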

Dynamically Compiled Simulation

SimOS [61] is a full-system simulator that offers various simulators, including Embra [73], a fast dynamic translator. Embra focuses on the efficient simulation of the memory hierarchy, in particular the memory address translation and memory protection mechanisms, as well as caches. Embra follows a compile-only approach, i.e., simulated instructions are always translated to native code and then executed. The translated code fragments for basic blocks are stored in a translation cache to speed up the look-up of native code blocks during the simulation. The basic simulator can be extended and adapted using customized translations. For example, different cache configurations and coherency protocols are realized using these customized translations.
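
The interplay of a translation cache and a customized translation can be sketched as follows. This is an illustration under assumptions, not Embra's design: the toy instruction set, the class names `Simulator` and `DirectMappedCache`, and the use of Python closures in place of native code are all hypothetical. Blocks are always translated before they run (compile-only), translated blocks are looked up by their start address, and the translation of `load` is swapped for a variant that also drives a cache model.

```python
from typing import Callable, Dict, List, Optional, Tuple

Instr = Tuple[str, int, int]   # hypothetical toy ISA: (mnemonic, a, b)


class DirectMappedCache:
    """Tiny cache model, plugged in through a customized 'load' translation."""
    def __init__(self, n_lines: int) -> None:
        self.tags: List[Optional[int]] = [None] * n_lines
        self.hits = self.misses = 0

    def access(self, addr: int) -> None:
        line = addr % len(self.tags)
        if self.tags[line] == addr:
            self.hits += 1
        else:
            self.misses += 1
            self.tags[line] = addr


class Simulator:
    def __init__(self, program: List[Instr], memory: List[int],
                 cache: Optional[DirectMappedCache] = None) -> None:
        self.program, self.memory, self.cache = program, memory, cache
        self.regs = [0] * 4
        self.tcache: Dict[int, Callable[[], int]] = {}  # block start pc -> translated block
        self.translations = 0

    def _emit(self, op: str, a: int, b: int) -> Callable[[], None]:
        regs, memory, cache = self.regs, self.memory, self.cache
        if op == "addi":                      # regs[a] += b
            def step() -> None:
                regs[a] += b
        elif op == "add":                     # regs[a] += regs[b]
            def step() -> None:
                regs[a] += regs[b]
        elif op == "load" and cache is None:  # plain translation
            def step() -> None:
                regs[a] = memory[regs[b]]
        elif op == "load":                    # customized translation with cache model
            def step() -> None:
                cache.access(regs[b])
                regs[a] = memory[regs[b]]
        else:
            raise ValueError(op)
        return step

    def _translate(self, start: int) -> Callable[[], int]:
        """Translate the basic block beginning at 'start' into one callable."""
        steps: List[Callable[[], None]] = []
        pc = start
        while self.program[pc][0] not in ("bnez", "halt"):
            steps.append(self._emit(*self.program[pc]))
            pc += 1
        op, a, target = self.program[pc]      # block terminator
        regs, fallthrough = self.regs, pc + 1

        def block() -> int:
            for step in steps:
                step()
            if op == "halt":
                return -1
            return target if regs[a] != 0 else fallthrough
        self.translations += 1
        return block

    def run(self) -> None:
        pc = 0
        while pc != -1:
            block = self.tcache.get(pc)
            if block is None:                 # miss: translate, never interpret
                block = self.tcache[pc] = self._translate(pc)
            pc = block()


program = [
    ("addi", 1, 4),    # 0: r1 = 4 (loop counter and address)
    ("addi", 1, -1),   # 1: r1 -= 1            <- loop head
    ("load", 2, 1),    # 2: r2 = memory[r1]
    ("add", 0, 2),     # 3: r0 += r2
    ("bnez", 1, 1),    # 4: loop while r1 != 0
    ("halt", 0, 0),    # 5
]
cache = DirectMappedCache(n_lines=2)
sim = Simulator(program, memory=[10, 20, 30, 40], cache=cache)
sim.run()
```

The loop body runs three times from address 1, but is translated only once; later iterations are served from `tcache`. Passing a different cache object, or none at all, changes what the translated `load` does without touching the main loop.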

Shade [18] is another dynamically compiling simulator that aims primarily at fast execution tracing. It offers a rich interface to trace and process events during simulation. Similar to the customized translations in Embra, Shade allows user-defined as well as pre-defined code to collect trace data. Trace collection is controlled by analyzers that specify, on a per-opcode or per-instruction basis, which information should be collected during tracing. The tracing level, i.e., the amount of data collected during simulation, can be varied at run time. In this way, only critical portions of the program execution need to be executed with full tracing.
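
Opcode-selective tracing with a level that changes during the run might look like the sketch below. It does not reproduce Shade's interface: the `Analyzer` class, its `wants` and `record` methods, the toy instruction set, and the `critical` address range are hypothetical, and a plain interpreter loop is used instead of dynamic compilation to keep the example short.

```python
from typing import List, Set, Tuple

Instr = Tuple[str, int, int]         # hypothetical toy ISA: (mnemonic, a, b)
TraceRecord = Tuple[int, str, int]   # (pc, mnemonic, value written)


class Analyzer:
    """Hypothetical analyzer: selects opcodes to trace and collects records."""
    def __init__(self, opcodes: Set[str]) -> None:
        self.opcodes = set(opcodes)
        self.enabled = True          # tracing level, adjustable while running
        self.records: List[TraceRecord] = []

    def wants(self, op: str) -> bool:
        return self.enabled and op in self.opcodes

    def record(self, pc: int, op: str, value: int) -> None:
        self.records.append((pc, op, value))


def simulate(program: List[Instr], analyzer: Analyzer, critical: range) -> List[int]:
    regs = [0] * 4
    pc = 0
    while program[pc][0] != "halt":
        analyzer.enabled = pc in critical   # raise or lower the tracing level at run time
        op, a, b = program[pc]
        if op == "loadi":                   # regs[a] = b
            regs[a] = b
        elif op == "add":                   # regs[a] += regs[b]
            regs[a] += regs[b]
        elif op == "mul":                   # regs[a] *= regs[b]
            regs[a] *= regs[b]
        else:
            raise ValueError(op)
        if analyzer.wants(op):              # per-opcode selection
            analyzer.record(pc, op, regs[a])
        pc += 1
    return regs


program = [
    ("loadi", 0, 2),   # 0
    ("loadi", 1, 3),   # 1
    ("add", 0, 1),     # 2: outside the critical region, not traced
    ("mul", 0, 1),     # 3: inside the region, but opcode not selected
    ("add", 0, 1),     # 4: inside the region and selected, traced
    ("halt", 0, 0),    # 5
]
analyzer = Analyzer(opcodes={"add"})
regs = simulate(program, analyzer, critical=range(3, 5))
```

Only the `add` at address 4 produces a trace record; the one at address 2 executes with tracing switched off.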

Ebcioğlu et al. present sophisticated code generation techniques to efficiently generate code for parallel processors at run time [21, 23]. Several code generator optimizations are presented and combined with run-time statistics collection. For example, instructions of the target processor are initially interpreted, and compilation of tree regions is only triggered for hot paths of the simulated program. Instruction-level parallelism is further improved by aggressive instruction
