increasing the expense in terms of the human effort involved in using the tech-
nique. The idea of using data constraint inference to augment the EDSM process
forms part of our future work.
However, such an approach cannot be applied as-is to large-scale models
derived from realistic traces. Its effectiveness depends on the effectiveness
of the constraint-inference technique. Daikon (used by Lorenzoli et al.) has
some important limitations: it was only designed to identify simple, linear
constraints, and all constraint types have to be pre-supplied to the tool.
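The limitation described above can be illustrated with a minimal sketch of template-based constraint inference. This is not Daikon's implementation or API; the templates, the function name infer_constraints, and the variable names sent and acked are all assumptions made for illustration. The sketch keeps only those pre-supplied templates that hold for every observation, so any relationship that is not covered by a template (for example a nonlinear one) can never be reported.

```python
from itertools import permutations

# Hypothetical, pre-supplied constraint templates over an ordered pair of
# variables. Anything not listed here can never be inferred.
TEMPLATES = {
    "{0} == {1}": lambda x, y: x == y,
    "{0} <= {1}": lambda x, y: x <= y,
    "{0} == {1} + 1": lambda x, y: x == y + 1,
}

def infer_constraints(observations):
    """Return the template instances that hold for every observation.

    observations: non-empty list of dicts mapping variable names to numbers.
    """
    variables = sorted(observations[0])
    surviving = []
    for a, b in permutations(variables, 2):
        for pattern, holds in TEMPLATES.items():
            if all(holds(obs[a], obs[b]) for obs in observations):
                surviving.append(pattern.format(a, b))
    return surviving

# Hypothetical variable values recorded at one program point.
observations = [
    {"sent": 1, "acked": 0},
    {"sent": 2, "acked": 1},
    {"sent": 5, "acked": 4},
]
constraints = infer_constraints(observations)
print(constraints)  # ['acked <= sent', 'sent == acked + 1']
```

If the recorded values instead followed a relationship such as sent == acked ** 2, this sketch would report at most the weaker ordering constraint, which is the kind of gap that more powerful function-identification techniques are meant to close.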
With respect to our technique, we intend to extend it in the manner of
Lorenzoli et al. Applying it to realistic traces will require the investigation
of more powerful data constraint / function identification techniques:
techniques that can identify more relevant, complex data transformations,
perhaps incorporating nonlinear variable relationships (cf. work by Bongard and
Lipson on identifying non-linear functions from data [28]).
5.2 Identifying the Primary Functions in a Trace
All current reverse-engineering approaches, including those discussed in the
previous subsection, assume that the functions used to label the edges are
trivially known (e.g. “edit” and “save”). It is presumed that these functions
clearly map to a given trace (e.g. “edit” corresponds to a method in the trace
called “edit”, etc.). This is fine if the trace contains only a small number of
different types of event (such as the traces for the TCP example used above).
However, when traces scale up to larger and more complex systems, this becomes
impossible. A trace of even a trivial Java system will often encompass hundreds
of thousands of calls to thousands of different methods. A simple operation to
load a text file, for example, might involve hundreds of different input/output
library methods to read the file, and hundreds of font-rendering library methods
to render the characters onto the screen. Given such a trace, how do we reduce
it to a sequence of symbols that can be used to infer a state machine?
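To make the assumption concrete, the following minimal sketch shows the reduction in the easy case, where the primary functions are already known. The method names, the PRIMARY_FUNCTIONS mapping, and the abstract_trace helper are hypothetical and exist only for illustration; the point is that the reduction is trivial only once such a mapping exists, and for a realistic trace nobody hands it to us.

```python
# Hypothetical mapping from fully qualified method names to edge labels.
# These names are invented for illustration only.
PRIMARY_FUNCTIONS = {
    "editor.Document.edit": "edit",
    "editor.Document.save": "save",
}

def abstract_trace(trace, primary_functions=PRIMARY_FUNCTIONS):
    """Reduce a trace (a list of method names) to a sequence of edge labels.

    Calls that are not known primary functions (library I/O, rendering,
    and so on) are simply dropped.
    """
    return [primary_functions[call] for call in trace if call in primary_functions]

# A toy trace in which the two primary functions are buried among
# lower-level library calls.
trace = [
    "editor.Document.edit",
    "io.Reader.read",
    "font.Renderer.drawGlyph",
    "editor.Document.save",
    "io.Writer.write",
    "editor.Document.edit",
]
symbols = abstract_trace(trace)
print(symbols)  # ['edit', 'save', 'edit']
```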
To illustrate the problem, we use a simple example of an openly-available
Java drawing application called JHotDraw. We may want an abstracted state
machine that describes its core functionality. To obtain one, we execute the
application and record the trace: we create three new drawings, and insert five
figures into each drawing. Figure 6 shows the result in JHotDraw.
The problem facing us is this: the ensuing trace contains 161,087 method calls
to 1489 different methods (see footnote 6). To reverse-engineer a machine from
such a system, we need to map this extremely large trace to a sequence of
symbols that will result in a machine that is readable and can be readily
understood by the developer. So far, model inference work has implied that this
is a relatively straightforward process [14], but it can be a very challenging
task, especially if the developer is not familiar with the functionality, let
alone the architecture, of the underlying system.
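As a rough illustration of how the scale of such a trace can be measured, the sketch below counts the total number of calls and the number of distinct methods. It assumes the trace is available as a list of method names, one entry per call; this format, the summarise_trace helper, and the toy method names are assumptions, not the format of the recorded JHotDraw trace.

```python
from collections import Counter

def summarise_trace(calls):
    """Summarise a trace given as a list of method names, one per call."""
    counts = Counter(calls)
    return {
        "total_calls": sum(counts.values()),
        "distinct_methods": len(counts),
        # The most frequently called methods, with their call counts.
        "most_common": counts.most_common(3),
    }

# Toy trace with hypothetical method names.
calls = ["A.f", "B.g", "A.f", "C.h", "A.f"]
summary = summarise_trace(calls)
print(summary["total_calls"], summary["distinct_methods"])  # 5 3
```

A summary like this only quantifies the problem; it does not by itself say which methods are the primary functions.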
6 The trace was recorded by Cornelissen et al. [29] and can be downloaded from their
website.
 