Box 3.
08048373 <foo>:
 8048373:  55                     push   %ebp
 8048374:  89 e5                  mov    %esp,%ebp
 8048376:  83 3d 78 95 04 08 09   cmpl   $0x9,0x8049578
 804837d:  7f 08                  jg     8048387 <foo+0x14>
 804837f:  ff 05 78 95 04 08      incl   0x8049578
 8048385:  eb 0a                  jmp    8048391 <foo+0x1e>
 8048387:  c7 05 78 95 04 08 00   movl   $0x0,0x8049578
 804838e:  00 00 00
 8048391:  c9                     leave
 8048392:  c3                     ret
 8048393:  90                     nop
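
Read back into C, the listing in Box 3 appears to correspond to a function like the one below. This is a reconstruction from the disassembly, not the original source: the variable name counter is hypothetical and simply stands in for the global at address 0x8049578.

```c
/* Hypothetical reconstruction of foo() from the Box 3 disassembly. */

int counter = 0;          /* stands in for the global at 0x8049578 */

void foo(void)
{
    if (counter <= 9)     /* cmpl $0x9,counter ; jg skips to the else branch */
        counter++;        /* incl counter */
    else
        counter = 0;      /* movl $0x0,counter */
}
```

The jg instruction jumps when the counter is greater than 9, so the fall-through path is the increment and the jump target is the reset.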
Box 4.
080488da <foo>:
 80488da:  55                     push   %ebp
 80488db:  89 e5                  mov    %esp,%ebp
 80488dd:  e8 62 fd ff ff         call   8048644 <mcount@plt>
 80488e2:  83 3d 00 a2 04 08 09   cmpl   $0x9,0x804a200
 80488e9:  7f 16                  jg     8048901 <foo+0x27>
 80488eb:  ff 05 00 a2 04 08      incl   0x804a200
 80488f1:  83 05 38 a2 04 08 01   addl   $0x1,0x804a238
 80488f8:  83 15 3c a2 04 08 00   adcl   $0x0,0x804a23c
 80488ff:  eb 18                  jmp    8048919 <foo+0x3f>
 8048901:  c7 05 00 a2 04 08 00   movl   $0x0,0x804a200
 8048908:  00 00 00
 804890b:  83 05 40 a2 04 08 01   addl   $0x1,0x804a240
 8048912:  83 15 44 a2 04 08 00   adcl   $0x0,0x804a244
 8048919:  c9                     leave
 804891a:  c3                     ret
The generated assembly code (x86) with instrumentation is shown in Box 4.
Profiling data collected through the profiling counters is written to a
data file (gmon.out). This data can be inspected later using the GNU gprof
tool. Summarized data includes basic control flow graph information and
timing information between measurement points in the code. The overhead
incurred through this type of profiling can be significant (over 60%),
primarily because the instrumentation works on an “all or nothing” basis.
Table 4 shows experimental results measuring the performance impact of the
GNU GCC profiling features. Tests were performed by running the BYTEmark
benchmark program (Grehan, 1995) on a 3.00 GHz Intel Pentium D running
Red Hat Enterprise Linux v4.0. It is possible, however, to

The first highlighted block (80488dd) represents a call to the profiling
library's mcount() function. The mcount() function is called on entry to
every instrumented function and records, in an in-memory call graph table,
a mapping between the current function (given by the current program
counter) and the function's parent (given by the return address). This
mapping is typically derived by inspecting the stack. The second
highlighted block (80488f1) contains instructions that increment counters
for each of the basic blocks (triggered by the -fprofile-arcs option).
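
The two kinds of instrumentation described above can be imitated in plain C. The following is a minimal sketch under stated assumptions, not the profiling library's implementation: record_call() is a hypothetical stand-in for mcount() that takes function names as strings instead of reading addresses from the stack, and the table size and all variable names are made up for illustration. The 64-bit counters mirror the addl/adcl pairs in Box 4, where addl adds 1 to the low 32-bit word and adcl adds the resulting carry to the high word.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define MAX_ARCS 64                 /* arbitrary table size for this sketch */

/* One call-graph arc: parent (caller) -> self (callee), with a hit count. */
struct call_arc {
    const char *parent;
    const char *self;
    unsigned long count;
};

struct call_arc arcs[MAX_ARCS];
size_t n_arcs = 0;

/* Hypothetical stand-in for mcount(): record one parent -> self call.
 * The real routine identifies both ends by address, not by name. */
void record_call(const char *parent, const char *self)
{
    for (size_t i = 0; i < n_arcs; i++) {
        if (strcmp(arcs[i].parent, parent) == 0 &&
            strcmp(arcs[i].self, self) == 0) {
            arcs[i].count++;
            return;
        }
    }
    if (n_arcs < MAX_ARCS) {        /* silently drop arcs once the table is full */
        arcs[n_arcs].parent = parent;
        arcs[n_arcs].self = self;
        arcs[n_arcs].count = 1;
        n_arcs++;
    }
}

int counter = 0;                    /* the variable foo() updates */
uint64_t taken_increment = 0;       /* times the increment path ran */
uint64_t taken_reset = 0;           /* times the reset path ran */

/* foo() with both kinds of instrumentation written out by hand. */
void foo_instrumented(const char *caller)
{
    record_call(caller, "foo");     /* corresponds to the call at 80488dd */
    if (counter <= 9) {
        counter++;
        taken_increment++;          /* corresponds to addl/adcl at 80488f1 */
    } else {
        counter = 0;
        taken_reset++;              /* corresponds to addl/adcl at 804890b */
    }
}
```

Calling foo_instrumented("main") eleven times leaves a single arc from main to foo with a count of 11, ten hits on the increment path, and one hit on the reset path.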