Memory Hierarchy Design - Computer Architecture: A Quantitative Approach

Hardware Reference

In-Depth Information

for (i = 0; i < N; i = i+1)

for (j = 0; j < N; j = j+1)

{r = 0;

for (k = 0; k < N; k = k + 1)

r = r + y[i][k]*z[k][j];

x[i][j] = r;

};

The two inner loops read all N -by- N elements of z , read the same N elements in a row of y re-

peatedly, and write one row of N elements of x . Figure 2.8 gives a snapshot of the accesses to

the three arrays. A dark shade indicates a recent access, a light shade indicates an older access,

and white means not yet accessed.

FIGURE 2.8 A snapshot of the three arrays x , y , and z when N = 6 and i = 1 . The age of

accesses to the array elements is indicated by shade: white means not yet touched, light

means older accesses, and dark means newer accesses. Compared to Figure 2.9 , elements

of y and z are read repeatedly to calculate new elements of x . The variables i , j , and k are

shown along the rows or columns used to access the arrays.

The number of capacity misses clearly depends on N and the size of the cache. If it can hold

all three N -by- N matrices, then all is well, provided there are no cache conflicts. If the cache can

hold one N-by-N matrix and one row of N , then at least the i ith row of y and the array z may stay

in the cache. Less than that and misses may occur for both x and z . In the worst case, there

would be 2 N 3 + N 2 memory words accessed for N 3 operations.

To ensure that the elements being accessed can it in the cache, the original code is changed

to compute on a submatrix of size B by B . Two inner loops now compute in steps of size B rather

than the full length of x and z . B is called the blocking factor . (Assume x is initialized to zero.)

/* After */

for (jj = 0; jj < N; jj = jj+B)

for (kk = 0; kk < N; kk = kk+B)

for (i = 0; i < N; i = i+1)

for (j = jj; j < min(jj+B,N); j = j+1)

{r = 0;

for (k = kk; k < min(kk+B,N); k = k + 1)

r = r + y[i][k]*z[k][j];

x[i][j] = x[i][j] + r;

Search WWH ::

Custom Search

Home