instrumented automatically through a specialized
Interface Description Language (IDL) compiler,
which directly modifies the generated stubs and
skeletons with code that records call entry and
return events, as shown in Figure 8.
Along with calls and returns, the MCBS-
modified stubs and skeletons can also profile
higher-level transactions (as aggregated calls),
as well as parameters and return values. Event
data is recorded to a log, and a unique identifier
is assigned so that the scenario/call chain can be
identified later. This identifier is generated at the
start probe and is propagated through the calling
sequence via thread-specific storage (Schmidt et al.,
2000), which is global data that is available only
to the owning thread. When each new interface
is invoked, the stub retrieves the identifier from
thread-specific storage, creates a record with
it, and stores a number identifying its position in
the call chain. After control returns to the caller
stub, the last record is generated and the call chain
record is completed.
Whenever a new thread is created by the application,
the parent thread identifier is stored
along with the new thread identifier to help identify
the logical call chain in cases where threads are
spawned by user-application code. Event data
is stored in a memory buffer during application
execution and is dumped to a file each time
the buffer becomes full. An off-line data collector
picks up the files written by the different
processes and loads them into a database. The
analyzer component processes the data and constructs
entire call graphs. The end-to-end
latency of each call scenario is calculated from the
deltas between the recorded timestamps.
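
As a rough illustration of the analysis step, the sketch below pairs entry and return events by chain identifier and sequence number, then derives per-call and end-to-end latencies from timestamp deltas. The record layout, the function names `call_latencies` and `end_to_end`, and the sample timestamps are all invented for the example; the actual MCBS log format and analyzer are not described in this section.

```python
from collections import defaultdict

# Hypothetical log records: (chain_id, seq, interface, kind, timestamp_ms).
# The timestamps are made-up values, not measurements.
events = [
    ("c1", 1, "Render::render_object", "entry", 0.0),
    ("c1", 2, "IO::push_to_queue", "entry", 1.5),
    ("c1", 2, "IO::push_to_queue", "return", 4.0),
    ("c1", 1, "Render::render_object", "return", 6.0),
]


def call_latencies(events):
    """Pair entry/return events per (chain, seq); return latencies in msec."""
    starts = {}
    latencies = defaultdict(dict)
    for chain_id, seq, interface, kind, ts in sorted(events, key=lambda e: e[4]):
        key = (chain_id, seq)
        if kind == "entry":
            starts[key] = ts
        else:
            latencies[chain_id][(seq, interface)] = ts - starts.pop(key)
    return latencies


def end_to_end(events):
    """End-to-end latency per chain: last timestamp minus first timestamp."""
    spans = {}
    for chain_id, _seq, _interface, _kind, ts in events:
        lo, hi = spans.get(chain_id, (ts, ts))
        spans[chain_id] = (min(lo, ts), max(hi, ts))
    return {chain_id: hi - lo for chain_id, (lo, hi) in spans.items()}


per_call = call_latencies(events)
totals = end_to_end(events)
```

Keying on the chain identifier is what lets records from different files and processes be stitched back into one scenario once they are loaded into a common store.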
MCBS also allows measurement overhead to be
compared against normal (uninstrumented)
operation. This comparison contrasts
the timing of the automatically instrumented
execution with timings collected from the original
application after it has been manually instrumented.
The manual instrumentation is restricted to a single
function at a time to minimize overhead. Table 6 shows
performance data for a sample application. The
sample scenarios are known to have deterministic
functionality; that is, they perform the same set
of actions every time.
MCBS can reduce measurement overhead
by profiling only specific components of the
application. Component selection can be achieved
in two ways:
Table 6. Overhead of instrumentation due to probes inserted in stubs and skeletons

Function                          Average   Std. Dev.   Average   Std. Dev.   Interference
                                  (msec)    (msec)      (msec)    (msec)
EngineController::print            1.535     0.158       1.484     1.288        3.4%
DeviceChannel::is_supplier_set     1.865     0.054       1.236     0.025       50.9%
IO::retrieve_from_queue           10.155     0.094       9.636     0.094        5.4%
GDI::draw_circle                  95.066    10.206      85.866    11.342       10.7%
RIP::notify_downstream            13.831     2.377      11.557     0.381       19.7%
RIP::Insert_Obj_To_DL              2.502     0.141       1.879     0.127       33.2%
IO::push_to_queue                 13.626     0.298      13.580     2.887        0.3%
UserApplication::notified          0.418     0.04        0.282     0.099       48.3%
Render::deposit_to_queue           0.529     0.097       0.358     0.010       47.8%
Render::render_object              7.138     2.104       6.280     0.074       13.6%
Render::retrieve_from_queue        0.501     0.040       0.318     0.010       57.6%
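
The Interference column in Table 6 appears to be the relative difference between the two Average columns. The text does not state the formula, so the sketch below is an assumption: it computes (first average - second average) / second average and checks the result against the reported percentages. With the rounded values printed in the table, the computed figures agree with the reported ones to within about 0.1 percentage point.

```python
# (function, first average in msec, second average in msec, reported interference in %)
# Values are copied from Table 6.
rows = [
    ("EngineController::print", 1.535, 1.484, 3.4),
    ("DeviceChannel::is_supplier_set", 1.865, 1.236, 50.9),
    ("IO::retrieve_from_queue", 10.155, 9.636, 5.4),
    ("GDI::draw_circle", 95.066, 85.866, 10.7),
    ("RIP::notify_downstream", 13.831, 11.557, 19.7),
    ("RIP::Insert_Obj_To_DL", 2.502, 1.879, 33.2),
    ("IO::push_to_queue", 13.626, 13.580, 0.3),
    ("UserApplication::notified", 0.418, 0.282, 48.3),
    ("Render::deposit_to_queue", 0.529, 0.358, 47.8),
    ("Render::render_object", 7.138, 6.280, 13.6),
    ("Render::retrieve_from_queue", 0.501, 0.318, 57.6),
]


def interference_pct(first_avg, second_avg):
    """Relative overhead of the first measurement over the second, in percent.

    Assumed formula; the source only reports the resulting percentages.
    """
    return (first_avg - second_avg) / second_avg * 100.0


computed = {name: interference_pct(a, b) for name, a, b, _ in rows}

for name, _a, _b, reported in rows:
    print(f"{name:32s} computed {computed[name]:5.1f}%  reported {reported:5.1f}%")
```

Under this reading, the relative overhead is largest for the shortest calls (those averaging well under 2 msec), where a fixed per-probe cost makes up a larger share of the total.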
 