Because this code could be called as in any one of the following three calls
float f[3];
ComplexMult(&f[0], &f[0], &f[0]); // a[0] aliases b[0], c[0]; a[1] aliases b[1], c[1]
ComplexMult(&f[1], &f[0], &f[0]); // a[0] aliases b[1], c[1]
ComplexMult(&f[0], &f[1], &f[1]); // a[1] aliases b[0], c[0]
it is clear that a[0] and a[1] might alias any of b[0], b[1], c[0], and c[1]. The
assignment to a[0] therefore means that all four input values must be fetched again
in the second code line. To better illustrate the issue, the following compiler-generated
code (here compiled with a version of gcc producing MIPS assembly code) shows that
this is indeed the case.
ComplexMult(float *, float *, float *)
    lwc1   f3,0x0000(a2)    ; f3 = c[0]
    lwc1   f2,0x0004(a2)    ; f2 = c[1]
    lwc1   f1,0x0000(a1)    ; f1 = b[0]
    lwc1   f0,0x0004(a1)    ; f0 = b[1]
    mul.s  f1,f1,f3         ; f1 = f1 * f3
    mul.s  f0,f0,f2         ; f0 = f0 * f2
    sub.s  f1,f1,f0         ; f1 = f1 - f0
    swc1   f1,0x0000(a0)    ; a[0] = f1
    lwc1   f2,0x0004(a1)    ; f2 = b[1] (reloaded)
    lwc1   f3,0x0000(a2)    ; f3 = c[0] (reloaded)
    lwc1   f0,0x0000(a1)    ; f0 = b[0] (reloaded)
    lwc1   f1,0x0004(a2)    ; f1 = c[1] (reloaded)
    mul.s  f2,f2,f3         ; f2 = f2 * f3
    mul.s  f0,f0,f1         ; f0 = f0 * f1
    add.s  f0,f0,f2         ; f0 = f0 + f2
    jr     ra
    swc1   f0,0x0004(a0)    ; a[1] = f0
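The reloads in the listing above are not wasted effort: when the pointers really do overlap, they change the result. The following is a minimal sketch of that effect, not code from the original text. The names ComplexMultAliased, ComplexMultCached, and AliasingChangesResult are hypothetical, introduced only for illustration. The first function is the unqualified two-line version; the second reads all four inputs into locals before writing, which is the behavior a compiler could adopt only if it knew that a cannot alias b or c.

```c
/* Unqualified version: a may alias b and c, so the write to a[0]
   can change what b[0], b[1], c[0], and c[1] read on the next line. */
void ComplexMultAliased(float *a, float *b, float *c)
{
    a[0] = b[0]*c[0] - b[1]*c[1]; // real part
    a[1] = b[0]*c[1] + b[1]*c[0]; // imaginary part (inputs read again)
}

/* Hypothetical variant: all inputs are read once, before any write.
   This is what "no reload" code would compute. */
void ComplexMultCached(float *a, float *b, float *c)
{
    float b0 = b[0], b1 = b[1], c0 = c[0], c1 = c[1];
    a[0] = b0*c0 - b1*c1; // real part
    a[1] = b0*c1 + b1*c0; // imaginary part (uses the cached inputs)
}

/* Returns 1 if the two versions disagree when a, b, and c all point
   at the same storage (the first of the three calls in the text). */
int AliasingChangesResult(void)
{
    float f[3] = { 1.0f, 2.0f, 0.0f };
    float g[3] = { 1.0f, 2.0f, 0.0f };
    ComplexMultAliased(&f[0], &f[0], &f[0]); // f becomes (-3, -12)
    ComplexMultCached(&g[0], &g[0], &g[0]);  // g becomes (-3, 4)
    return f[0] != g[0] || f[1] != g[1];
}
```

With the input (1, 2), the unqualified version overwrites f[0] with -3 before computing the imaginary part, so the two functions produce different values for the second element. Because the language defines the first behavior for overlapping arguments, the compiler has to emit the reloads to preserve it.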
However, restrict-qualifying a (and, to help the compiler with its alias analysis,
also b and c; see the next section for the explanation) effectively informs the
compiler that neither b nor c can be aliased by a.
void ComplexMult(float * restrict a, float * restrict b, float * restrict c)
{
    a[0] = b[0]*c[0] - b[1]*c[1]; // real part
    a[1] = b[0]*c[1] + b[1]*c[0]; // imaginary part
}
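The restrict qualifiers are a promise made by the caller, and the compiler does not check it. In C99, calling this function so that the storage written through a is also read through b or c results in undefined behavior. One way a caller might keep the promise while still supporting in-place use is sketched below. ComplexSquare is a hypothetical helper, not part of the original text; it copies its input into temporaries so that the three pointers passed to ComplexMult never overlap.

```c
/* restrict-qualified version from the text (requires C99 or later). */
void ComplexMult(float * restrict a, float * restrict b, float * restrict c)
{
    a[0] = b[0]*c[0] - b[1]*c[1]; // real part
    a[1] = b[0]*c[1] + b[1]*c[0]; // imaginary part
}

/* Hypothetical helper: computes out = z * z. The input is copied into
   local temporaries first, so out, b, and c refer to distinct storage
   even when the caller passes the same array for out and z. */
void ComplexSquare(float out[2], const float z[2])
{
    float b[2] = { z[0], z[1] };
    float c[2] = { z[0], z[1] };
    ComplexMult(out, b, c);
}
```

For example, squaring z = (1, 2) in place with ComplexSquare(z, z) leaves z holding (-3, 4), which matches (1 + 2i)^2 = -3 + 4i. The copies cost a few extra loads at the call site, and in exchange the callee is free to keep its inputs in registers.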
 