A.11 [5] <A.3> Consider a C struct that includes the following members:
struct foo {
    char a;
    bool b;
    int c;
    double d;
    short e;
    float f;
    double g;
    char *cptr;
    float *fptr;
    int x;
};
For a 32-bit machine, what is the size of the foo struct? What is the minimum size required
for this struct, assuming you may arrange the order of the struct members as you wish?
What about for a 64-bit machine?
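One way to check a hand-computed answer is to ask the compiler directly. The sketch below is not part of the exercise: `struct foo_sorted`, `foo_member_bytes`, and `print_layout` are hypothetical names introduced here, and the reordering shown (widest members first) is only one candidate ordering. The numbers it reports depend on the ABI of the machine and compiler you build with, so treat them as a measurement of that one platform rather than a general answer.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdbool.h>  /* bool */
#include <stddef.h>   /* offsetof, size_t */
#include <stdio.h>

/* The struct as given in the exercise. */
struct foo {
    char a; bool b; int c; double d; short e;
    float f; double g; char *cptr; float *fptr; int x;
};

/* Hypothetical reordering (an assumption, not from the text): members are
   listed from widest to narrowest so little or no padding is needed
   between them. */
struct foo_sorted {
    double d; double g;
    char *cptr; float *fptr;
    int c; float f; int x;
    short e;
    char a; bool b;
};

/* Expected to hold on mainstream ABIs; the build fails if it does not. */
static_assert(sizeof(struct foo_sorted) <= sizeof(struct foo),
              "reordered struct should not be larger than the original");

/* Sum of the member sizes alone: no struct can be smaller than this. */
size_t foo_member_bytes(void)
{
    return sizeof(char) + sizeof(bool) + sizeof(int) + sizeof(double)
         + sizeof(short) + sizeof(float) + sizeof(double)
         + sizeof(char *) + sizeof(float *) + sizeof(int);
}

/* Print one member's offset and size within the given struct type. */
#define SHOW(out, type, member) \
    fprintf((out), "  %-4s offset %2zu  size %zu\n", #member, \
            offsetof(type, member), sizeof(((type *)0)->member))

/* Report the layout of the original struct and the size of both versions. */
void print_layout(FILE *out)
{
    fprintf(out, "struct foo on this platform:\n");
    SHOW(out, struct foo, a);    SHOW(out, struct foo, b);
    SHOW(out, struct foo, c);    SHOW(out, struct foo, d);
    SHOW(out, struct foo, e);    SHOW(out, struct foo, f);
    SHOW(out, struct foo, g);    SHOW(out, struct foo, cptr);
    SHOW(out, struct foo, fptr); SHOW(out, struct foo, x);
    fprintf(out, "sizeof(struct foo)        = %zu\n", sizeof(struct foo));
    fprintf(out, "sizeof(struct foo_sorted) = %zu\n", sizeof(struct foo_sorted));
    fprintf(out, "sum of member sizes       = %zu\n", foo_member_bytes());
}
```

Calling `print_layout(stdout)` from a small `main` shows where the compiler inserted padding. If your gcc installation has 32-bit libraries available, building once with `-m32` and once with `-m64` lets you compare the two cases the exercise asks about.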
A.12 [30] <A.7> Many computer manufacturers now include tools or simulators that allow
you to measure the instruction set usage of a user program. Among the methods in use are
machine simulation, hardware-supported trapping, and a compiler technique that instruments
the object code module by inserting counters. Find a processor available to you that
includes such a tool. Use it to measure the instruction set mix for one of the SPEC CPU2006
benchmarks. Compare the results to those shown in this chapter.
A.13 [30] <A.8> Newer processors such as Intel's i7 Sandy Bridge include support for AVX
vector/multimedia instructions. Write a dense matrix multiply function using single-precision
values and compile it with different compilers and optimization flags. Linear algebra
codes using Basic Linear Algebra Subprograms (BLAS) routines such as SGEMM include
optimized versions of dense matrix multiply. Compare the code size and performance of your
code to that of BLAS SGEMM. Explore what happens when using double-precision values
and DGEMM.
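As a possible starting point, here is a minimal single-precision sketch. It is deliberately simpler than BLAS SGEMM, which computes C = alpha*op(A)*op(B) + beta*C: this version assumes row-major storage, no transposes, alpha = 1, and beta = 0. The name `sgemm_naive` and its argument order are choices made for this illustration, not a BLAS interface.

```c
#include <assert.h>
#include <stddef.h>

/* C = A * B with all matrices stored row-major.
   A is m x k, B is k x n, C is m x n. C is overwritten. */
void sgemm_naive(size_t m, size_t n, size_t k,
                 const float *a, const float *b, float *c)
{
    assert(a != NULL && b != NULL && c != NULL);

    for (size_t i = 0; i < m; i++) {
        /* Clear row i of C before accumulating into it. */
        for (size_t j = 0; j < n; j++)
            c[i * n + j] = 0.0f;

        /* i-p-j loop order: the inner loop walks B and C with unit
           stride, which tends to help compiler auto-vectorization. */
        for (size_t p = 0; p < k; p++) {
            const float aip = a[i * k + p];
            for (size_t j = 0; j < n; j++)
                c[i * n + j] += aip * b[p * n + j];
        }
    }
}
```

Replacing `float` with `double` throughout gives the corresponding baseline for the DGEMM comparison. When comparing compilers, it may help to inspect the generated assembly to see whether the inner loop was vectorized at each optimization level.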
A.14 [30] <A.8> For the SGEMM code developed above for the i7 processor, include the use
of AVX intrinsics to improve the performance. In particular, try to vectorize your code to
better utilize the AVX hardware. Compare the code size and performance to the original
code.
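The following is one hedged sketch of what a first AVX version might look like, not a tuned kernel. It assumes row-major storage and the same simplified operation C = A * B; the name `sgemm_avx` is hypothetical. It processes eight columns of C at a time in a 256-bit register and uses a separate multiply and add, since Sandy Bridge supports AVX but not fused multiply-add. The intrinsics are guarded by `__AVX__`, so the code falls back to the scalar loop when it is built without AVX enabled (for example, without `-mavx` on gcc).

```c
#include <assert.h>
#include <stddef.h>
#if defined(__AVX__)
#include <immintrin.h>
#endif

/* C = A * B, row-major. A is m x k, B is k x n, C is m x n. */
void sgemm_avx(size_t m, size_t n, size_t k,
               const float *a, const float *b, float *c)
{
    assert(a != NULL && b != NULL && c != NULL);

    for (size_t i = 0; i < m; i++) {
        size_t j = 0;
#if defined(__AVX__)
        /* Vector path: compute C[i][j..j+7] in one 8-float register. */
        for (; j + 8 <= n; j += 8) {
            __m256 acc = _mm256_setzero_ps();
            for (size_t p = 0; p < k; p++) {
                __m256 av = _mm256_set1_ps(a[i * k + p]);      /* broadcast A[i][p] */
                __m256 bv = _mm256_loadu_ps(&b[p * n + j]);    /* B[p][j..j+7] */
                acc = _mm256_add_ps(acc, _mm256_mul_ps(av, bv));
            }
            _mm256_storeu_ps(&c[i * n + j], acc);
        }
#endif
        /* Scalar path: leftover columns, or every column without AVX. */
        for (; j < n; j++) {
            float sum = 0.0f;
            for (size_t p = 0; p < k; p++)
                sum += a[i * k + p] * b[p * n + j];
            c[i * n + j] = sum;
        }
    }
}
```

Unaligned loads and stores are used so the sketch works for any `n` and any buffer alignment. Aligned allocation, register blocking over several rows of C, and cache tiling are natural next steps to explore when comparing against the original code.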
A.15 [30] <A.7, A.9> SPIM is a popular simulator for MIPS processors. Use SPIM
to measure the instruction set mix for some SPEC CPU2006 benchmark programs.
A.16 [35/35/35/35] <A.2-A.8> gcc targets most modern instruction set architectures (see
www.gnu.org/software/gcc/). Create a version of gcc for several architectures that you
have access to, such as x86, MIPS, PowerPC, and ARM.
a. [35] <A.2-A.8> Compile a subset of SPEC CPU2006 integer benchmarks and create a
table of code sizes. Which architecture is best for each program?
b. [35] <A.2-A.8> Compile a subset of SPEC CPU2006 floating-point benchmarks and
create a table of code sizes. Which architecture is best for each program?
c. [35] <A.2-A.8> Compile a subset of EEMBC AutoBench benchmarks (see
www.eembc.org/home.php) and create a table of code sizes. Which architecture is best
for each program?
d. [35] <A.2-A.8> Compile a subset of EEMBC FPBench floating-point benchmarks and
create a table of code sizes. Which architecture is best for each program?