Like a row of houses on a local street, each with its own address, memory
can be thought of as a row of bytes, each with its own memory address. Each
byte of memory can be accessed by its address, and in this case the CPU
accesses this part of memory to retrieve the machine language instructions
that make up the compiled program. Older Intel x86 processors use a 32-bit
addressing scheme, while newer ones use a 64-bit one. The 32-bit processors
have 2^32 (or 4,294,967,296) possible addresses, while the 64-bit ones have 2^64
(1.84467441 × 10^19) possible addresses. The 64-bit processors can run in
32-bit compatibility mode, which allows them to run 32-bit code quickly.
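As a quick check of the arithmetic above, the following Python sketch computes how many byte addresses each address width allows. The function name address_count is made up for this illustration and is not part of any library.

```python
def address_count(bits: int) -> int:
    """Return the number of distinct addresses a `bits`-wide address can name."""
    return 2 ** bits


for bits in (32, 64):
    count = address_count(bits)
    highest = count - 1  # addresses start at 0, so the last one is count - 1
    print(f"{bits}-bit: {count:,} addresses, highest address 0x{highest:x}")
```

For the 32-bit case this prints 4,294,967,296 addresses, with 0xffffffff as the highest one, which matches the figure given in the text.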
The hexadecimal bytes in the middle of the listing above are the machine
language instructions for the x86 processor. Of course, these hexadecimal values
are only representations of the bytes of binary 1s and 0s the CPU can understand.
But since 0101010110001001111001011000001111101100111100001 ...
isn't very useful to anything other than the processor, the machine code is
displayed as hexadecimal bytes and each instruction is put on its own line,
like splitting a paragraph into sentences.
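To make the link between the hexadecimal display and the underlying bits concrete, here is a small Python sketch that converts a few machine code bytes into their binary form. The byte values 55 89 e5 are the first two instructions of main in the objdump listing; the helper name to_binary is an assumption made for this example.

```python
def to_binary(data: bytes) -> str:
    """Render each byte as eight binary digits, separated by spaces."""
    return " ".join(f"{byte:08b}" for byte in data)


# push ebp (55) followed by mov ebp,esp (89 e5), taken from the listing
machine_code = bytes.fromhex("55 89 e5")

print(machine_code.hex(" "))    # the hexadecimal view
print(to_binary(machine_code))  # the same bytes as 1s and 0s
```

Each pair of hexadecimal digits stands for exactly eight bits, which is why the hexadecimal form is so much more compact than the run of 1s and 0s.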
Come to think of it, the hexadecimal bytes really aren't very useful themselves,
either; that's where assembly language comes in. The instructions on
the far right are in assembly language. Assembly language is really just a
collection of mnemonics for the corresponding machine language instructions.
The instruction ret is far easier to remember and make sense of than 0xc3 or
11000011. Unlike C and other compiled languages, assembly language instructions
have a direct one-to-one relationship with their corresponding machine
language instructions. This means that since every processor architecture has
different machine language instructions, each also has a different form of
assembly language. Assembly is just a way for programmers to represent the
machine language instructions that are given to the processor. Exactly how
these machine language instructions are represented is simply a matter of
convention and preference. While you can theoretically create your own x86
assembly language syntax, most people stick with one of the two main types:
AT&T syntax and Intel syntax. The assembly shown in the output on page 21
is AT&T syntax, as just about all of Linux's disassembly tools use this syntax by
default. It's easy to recognize AT&T syntax by the cacophony of % and $ symbols
prefixing everything (take a look again at the example on page 21). The same
code can be shown in Intel syntax by providing an additional command-line
option, -M intel, to objdump, as shown in the output below.
reader@hacking:~/booksrc $ objdump -M intel -D a.out | grep -A20 main.:
08048374 <main>:
 8048374:       55                      push   ebp
 8048375:       89 e5                   mov    ebp,esp
 8048377:       83 ec 08                sub    esp,0x8
 804837a:       83 e4 f0                and    esp,0xfffffff0
 804837d:       b8 00 00 00 00          mov    eax,0x0
 8048382:       29 c4                   sub    esp,eax
 8048384:       c7 45 fc 00 00 00 00    mov    DWORD PTR [ebp-4],0x0
 804838b:       83 7d fc 09             cmp    DWORD PTR [ebp-4],0x9
 804838f:       7e 02                   jle    8048393 <main+0x1f>
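The difference between the two syntaxes is mostly a matter of operand order and prefixes, which the following Python sketch tries to illustrate for a few of the simple instructions above. It is only a toy formatter under stated assumptions: the function names intel and att are invented here, only register and immediate operands are handled, and memory operands such as DWORD PTR [ebp-4] are deliberately left out.

```python
# Only the registers that appear in this sketch; a real tool knows many more.
REGISTERS = {"eax", "ebp", "esp"}


def intel(mnemonic: str, *operands: str) -> str:
    """Intel syntax: destination operand first, no prefixes."""
    return f"{mnemonic} {','.join(operands)}".strip()


def att(mnemonic: str, *operands: str) -> str:
    """AT&T syntax: source operand first, % on registers, $ on immediates."""
    prefixed = [("%" if op in REGISTERS else "$") + op
                for op in reversed(operands)]
    return f"{mnemonic} {','.join(prefixed)}".strip()


# Operands are given in Intel order (destination, source).
for instruction in [("push", "ebp"), ("mov", "ebp", "esp"),
                    ("sub", "esp", "0x8"), ("ret",)]:
    print(f"{intel(*instruction):<16} {att(*instruction)}")
```

Both columns describe the same machine language bytes. For example, mov ebp,esp in Intel syntax and mov %esp,%ebp in AT&T syntax both stand for 89 e5.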