and impact of different profiling techniques.
Our experimentation is based on measurements
taken from Microsoft Windows XP, running on
a dual-processor, hyper-threaded (Intel Xeon 2.8
GHz) system, executing a stress-test Web client/
server application. The measurements were taken
using both Windows performance counters and
the on-chip Intel performance counters. Table 3
shows the results.
The data listed in Table 3 consists primarily
of finer-grained metrics that occur at very high
frequencies in the lower levels of the system. Of
course, less frequent “application-level” events are
also of interest in understanding the behavior of
a system. For example, rare error conditions are
often of importance. The data in Table 3 shows
that the frequency (and therefore quantity) of
measurable events can vary significantly, by up
to nine orders of magnitude. Because the impact
of measurement scales with event frequency, analysis
methodologies that work well for low-frequency
events may not do so for higher-frequency
events.
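
To make this scaling concrete, the sketch below multiplies a fixed per-event probe cost by a few of the upper-bound rates from Table 3. The 1 microsecond probe cost and the name overhead_fraction are assumptions chosen purely for illustration; they are not measurements from the experiment described above.

```python
# Rough estimate of measurement overhead as a function of event frequency.
# ASSUMPTION: every recorded event costs a fixed 1 microsecond of CPU time.
# This figure is hypothetical and is not taken from the source text.
PROBE_COST_SECONDS = 1e-6

# Upper-bound event rates (events per second) as listed in Table 3.
EVENT_RATES_PER_SECOND = {
    "hardware interrupts": 1_000,
    "context switches": 170_000,
    "system calls": 240_000,
    "L2 cache reads": 65_000_000,
}


def overhead_fraction(events_per_second, cost_per_event_seconds=PROBE_COST_SECONDS):
    """Return the fraction of one processor-second spent on measurement."""
    return events_per_second * cost_per_event_seconds


if __name__ == "__main__":
    for name, rate in EVENT_RATES_PER_SECOND.items():
        fraction = overhead_fraction(rate)
        print(f"{name:20s} {rate:>12,d}/s -> {fraction:8.1%} of one processor")
```

Under that assumed cost, tracing every hardware interrupt would consume roughly 0.1% of a processor, while tracing every L2 cache read would require more time than the processor has available. A per-event technique that is harmless at the low end of the table is therefore unusable at the high end.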
Challenges of Multithreaded System Profiling
The remainder of this chapter focuses on the re-
alization and application of runtime profiling on
multithreaded systems. Profiling multithreaded
systems involves addressing the following key
challenges:
• Measurement of events at high frequencies:
  Events of interest typically occur at
  high frequency. The overhead and effect of
  measurement on the system being measured
  must be controlled carefully. Without careful
  control of overhead, results become skewed
  as the process of measurement directly alters
  the system's behavior.
• Mapping across multilevel concepts:
  Threads can be used at multiple levels of a
  system. For example, threads can exist in the
  operating system, virtual machine, middleware,
  and in the application (lightweight
  threads and fibers). Virtual machine and
  application-layer threads can map to underlying
  operating system threads. Extracting
Table 3. Example metric ranges

Category            Metric                    Range
Processor           Clock Rate                2,793,000,000 Hz *
                    Micro-ops Queued          630,000,000 uops/second *
                    Instructions Per Second   344,000,000 instructions/second *
                    L2 Cache Reads            65,000,000 reads/second *
Thread Scheduling   Number of Threads         500 total count
                    Context Switch Rate       800-170,000 switches/sec
                    Thread Queue Length       0-15 total count
                    Scheduling Quanta         20-120 ms
System Resources    System Calls              400-240,000 calls/sec
                    Hardware Interrupts       300-1000 interrupts/sec
                    Synchronization Objects   400-2200 total count

* per logical processor
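
The multilevel mapping challenge listed above can be illustrated with a small sketch that records which operating system thread backs each application-level thread. It uses Python's standard threading module (Python 3.8 or later), where each threading.Thread is backed by one OS thread. Runtimes with fibers or other lightweight threads may multiplex many application threads onto one OS thread, so the one-to-one result here is a property of this runtime, not of multithreaded systems in general. The helper names record_mapping and collect_mapping are hypothetical.

```python
import threading


def record_mapping(mapping, lock, barrier):
    """Store this thread's application-level name and its OS-level thread ID."""
    name = threading.current_thread().name
    native_id = threading.get_native_id()  # ID assigned by the operating system
    with lock:
        mapping[name] = native_id
    # Wait until every worker has registered, so all workers are alive at the
    # same time and the OS cannot reuse one thread ID for two of them.
    barrier.wait()


def collect_mapping(worker_count=3):
    """Start worker threads and return {application thread name: OS thread ID}."""
    mapping = {}
    lock = threading.Lock()
    barrier = threading.Barrier(worker_count)
    workers = [
        threading.Thread(
            target=record_mapping,
            args=(mapping, lock, barrier),
            name=f"worker-{i}",
        )
        for i in range(worker_count)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return mapping


if __name__ == "__main__":
    for name, native_id in sorted(collect_mapping().items()):
        print(f"{name} -> OS thread {native_id}")
```

A profiler that observes only OS-level events, such as context switches, would need a table of this kind to attribute those events back to the application-level threads that caused them.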