Digital Signal Processing Reference
;Dotp4afunc.asm Multiply two arrays. Called from dotp4a_init.asm
;A4=x address,B4=y address,A6=count(size of array),B3=return address
           .def   dotp4afunc   ;dot product function
           .text               ;text section
dotp4afunc MV     A6,A1        ;move loop count -->A1
           ZERO   A7           ;init A7 for accumulation
loop       LDH    *A4++,A2     ;A2=content of x address
           LDH    *B4++,B2     ;B2=content of y address
           NOP    4            ;4 delay slots for LDH
           MPY    A2,B2,A3     ;A3 = x * y
           NOP                 ;1 delay slot for MPY
           ADD    A3,A7,A7     ;sum of products in A7
           SUB    A1,1,A1      ;decrement loop counter
   [A1]    B      loop         ;branch back to loop till A1=0
           NOP    5            ;5 delay slots for branch
           MV     A7,A4        ;A4=result
           B      B3           ;return from func to addr in B3
           NOP    5            ;5 delay slots for branch
FIGURE 3.18. ASM function called from an ASM program to find the sum of products
(dotp4afunc.asm).
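As a reading aid, the loop in Figure 3.18 can be mirrored in C. This is only a sketch of the data flow, not part of the book's project: the function name dotp4afunc_model is hypothetical, the variables are named after the registers they stand for, and the model assumes a count of at least 1 (the assembly loop body always runs once before the counter is tested) and sums that fit in 32 bits. Delay slots have no counterpart in C and are omitted.

```c
#include <stdint.h>

/* C model of the loop in dotp4afunc.asm (hypothetical helper).
   a4 = x address, b4 = y address, a6 = count; count must be >= 1. */
int32_t dotp4afunc_model(const int16_t *a4, const int16_t *b4, int32_t a6)
{
    int32_t a1 = a6;   /* MV A6,A1 : copy count to a register usable as a condition */
    int32_t a7 = 0;    /* ZERO A7  : clear the accumulator */

    do {
        int16_t a2 = *a4++;             /* LDH *A4++,A2 : load x value, then advance pointer */
        int16_t b2 = *b4++;             /* LDH *B4++,B2 : load y value, then advance pointer */
        int32_t a3 = (int32_t)a2 * b2;  /* MPY A2,B2,A3 : 16 x 16 -> 32-bit product */
        a7 += a3;                       /* ADD A3,A7,A7 : accumulate */
        a1 -= 1;                        /* SUB A1,1,A1  : decrement loop counter */
    } while (a1 != 0);                  /* [A1] B loop  : branch while A1 is nonzero */

    return a7;                          /* MV A7,A4     : result handed back in A4 */
}
```

With the illustrative arrays {1, 2, 3, 4} and {0, 2, 4, 6} and a count of 4, the model returns 0 + 4 + 12 + 24 = 40.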
listing of the assembly program dotp4a_init.asm, which initializes the two
arrays of numbers and calls the assembly function dotp4afunc.asm, shown in
Figure 3.18, which computes the sum of products of the two arrays. It also sets the
return address in register B3 and the result address in A0. The addresses of the two
arrays and the size of the array are passed to the function dotp4afunc.asm
through registers A4, B4, and A6, respectively. The result from the called function
is “sent back” through A4. The instruction STW then stores the sum of products
held in A4 in the memory location pointed to by A0, which serves as a pointer
holding the address result_addr.
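The register-based calling convention described above can be sketched in C from the caller's side. This is an illustrative model only: the contents of dotp4a_init.asm are not reproduced in this section, so the array values, the helper dotp, and the function init_model are assumptions made for the example. The point is the flow of values: two addresses and a count go in (A4, B4, A6), the sum comes back (A4), and the caller stores it through a pointer (A0).

```c
#include <stdint.h>

/* Hypothetical stand-in for the called function: sum of products of
   count 16-bit values, returned as a 32-bit result (the A4 return value). */
int32_t dotp(const int16_t *x_addr, const int16_t *y_addr, int32_t count)
{
    int32_t sum = 0;
    for (int32_t i = 0; i < count; i++) {
        sum += (int32_t)x_addr[i] * y_addr[i];
    }
    return sum;
}

/* Hypothetical model of the calling program. result_addr plays the role
   of the address held in A0. */
void init_model(int32_t *result_addr)
{
    /* Illustrative data, not taken from the source. */
    static const int16_t x[4] = {1, 2, 3, 4};
    static const int16_t y[4] = {0, 2, 4, 6};

    int32_t *a0 = result_addr;    /* A0 = result address */
    int32_t a4 = dotp(x, y, 4);   /* A4, B4, A6 passed in; result back in A4 */
    *a0 = a4;                     /* STW A4,*A0 : store result in memory */
}
```

Calling init_model(&r) with an int32_t r leaves 40 in r for the illustrative data above.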
The instruction MVK moves the 16 LSBs of a constant into a register (equivalent
to MVKL). If a 32-bit address (or result) is required, the pair of instructions MVKL
and MVKH can be used to move both the lower and upper 16 bits of the address (or
result). The starting address of the calling ASM program is defined as init. The
vector file is modified and included in the folder dotp4a so that the reference to
the entry address is changed from _c_int00 to the entry address init. An alternative
vector file, vectors_dotp4a.asm, shown in Figure 3.19, specifies a branch to
that entry address. The called ASM function dotp4afunc.asm calculates the sum
of products. The loop count value is moved to A1 because A6 cannot be used as a
conditional register (only A1, A2, B0, B1, and B2 can be used). The two LDH
instructions load the contents (half-words of 16 bits) of the two arrays starting at
the addresses x_addr and y_addr into registers A2 and B2, respectively. For example,
the instruction
LDH *B4++,B2