Digital Signal Processing Reference
3.19.2 Trip Directive for Loop Count
The linear assembly directive .trip specifies the number of times a loop
iterates. If the exact count is known and supplied, the linear assembler optimizer
can produce pipelined code (discussed in Chapter 8) without generating redundant
loops, which can reduce both code size and execution time. A .trip count that is
not the exact value may still improve performance, for example when the actual
number of iterations is a multiple of the specified value. In a C program, the
intrinsic function _nassert() can be used in lieu of .trip. Example 3.1
illustrates the use of _nassert() in the dot product example.
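As a minimal sketch of the idea (not the book's Example 3.1), a dot product routine might pass loop-count information to the compiler as shown below. The function name dotp, the fallback macro for non-TI compilers, and the particular assumption that count is at least 4 and a multiple of 4 are all illustrative choices. How much the optimizer gains from the assertion depends on the compiler version and options.

```c
#include <assert.h>

/* _nassert() is an intrinsic of the TI C6x compiler. On any other compiler
   this sketch falls back to a plain run-time assert so that it still builds. */
#ifndef __TI_COMPILER_VERSION__
#define _nassert(cond) assert(cond)
#endif

/* Dot product of two 16-bit arrays (hypothetical helper for illustration). */
int dotp(const short *a, const short *b, int count)
{
    int i;
    int sum = 0;

    /* Assumption for this sketch: the caller always passes a count that is
       at least 4 and a multiple of 4. */
    _nassert(count >= 4 && count % 4 == 0);

    for (i = 0; i < count; i++)
        sum += a[i] * b[i];

    return sum;
}
```

Calling dotp() on two four-element arrays exercises the asserted case. A count that violates the stated assumption breaks the promise made to the compiler and should be avoided.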
3.19.3 Cross-Paths
Data and address cross-path instructions are used to increase code efficiency. The
instruction
MPY .M1x A2,B2,A4
illustrates a data cross-path: it multiplies the two sources A2 and B2, which come
from the two different sides A and B, and places the result in A4. If the result
is to go to the B register file instead, the 2x cross-path is used with the
instruction
MPY .M2x A2,B2,B4
with the result in B4. The instruction
LDW .D1T2 *A2,B2
illustrates an address cross-path. It uses register A2 (register file A) as the
address and loads the word stored in memory at that address into register B2
(register file B). Only two cross-paths are available on the C6x, so no more than
two instructions using cross-paths are allowed within a cycle.
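The effect of the three instructions above can be sketched with a toy C model. This is only an illustration: the Cpu structure, the function names, and the small word-indexed memory are hypothetical and are not part of any TI tool or library. The model treats the value in A2 as a word index rather than a byte address, and it follows the MPY convention of multiplying the signed 16 least significant bits of each source.

```c
#include <stdint.h>

/* Toy model (hypothetical): two register files and a tiny memory. */
typedef struct {
    int32_t A[16];   /* register file A */
    int32_t B[16];   /* register file B */
    int32_t mem[8];  /* word-indexed memory; the index stands in for an address */
} Cpu;

/* MPY .M1x A2,B2,A4 : the multiplier on side A reads B2 over the 1x data
   cross-path and writes the product to A4. */
static void mpy_m1x(Cpu *c)
{
    c->A[4] = (int32_t)(int16_t)c->A[2] * (int32_t)(int16_t)c->B[2];
}

/* MPY .M2x A2,B2,B4 : the multiplier on side B reads A2 over the 2x data
   cross-path and writes the product to B4. */
static void mpy_m2x(Cpu *c)
{
    c->B[4] = (int32_t)(int16_t)c->A[2] * (int32_t)(int16_t)c->B[2];
}

/* LDW .D1T2 *A2,B2 : the address comes from A2 on side A, and the loaded
   word goes to B2 on side B. Assumes A2 holds a valid index into mem. */
static void ldw_d1t2(Cpu *c)
{
    c->B[2] = c->mem[c->A[2]];
}
```

In this model the load leaves A2 unchanged and overwrites B2 with the memory word, which is the distinction between an address cross-path and a register-to-register move.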
3.19.4 Software Pipelining
Software pipelining schedules a loop so that the available resources are kept
busy and the code is pipelined efficiently. The aim is to use all eight
functional units within one cycle. However, substantial coding effort can be
required when the technique is applied to more complex programs. Pipelined
code has three stages:
1. Prolog
2. Loop kernel (or loop cycle)
3. Epilog
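The three stages can be sketched in C with a hand-pipelined dot product. This is a conceptual illustration only, not the schedule a C6x tool would generate: the function name, the three-step split (load, multiply, add), and the requirement that n be at least 2 are assumptions made for the example. On the real processor the statements inside the kernel would execute in parallel on different functional units; here they simply run in an order that preserves the same data flow.

```c
/* Hand-pipelined dot product (illustrative sketch; assumes n >= 2).
   Each iteration i is split into load(i), mul(i), and add(i). */
int dotp_pipelined(const short *a, const short *b, int n)
{
    int i;
    int sum = 0;
    int prod;     /* product currently in flight */
    short x, y;   /* operands currently in flight */

    /* Prolog: fill the pipeline. */
    x = a[0]; y = b[0];        /* load(0) */
    prod = x * y;              /* mul(0)  */
    x = a[1]; y = b[1];        /* load(1) */

    /* Loop kernel: every pass performs add(i-2), mul(i-1), and load(i). */
    for (i = 2; i < n; i++) {
        sum += prod;           /* add(i-2) */
        prod = x * y;          /* mul(i-1) */
        x = a[i]; y = b[i];    /* load(i)  */
    }

    /* Epilog: drain the pipeline. */
    sum += prod;               /* add(n-2) */
    prod = x * y;              /* mul(n-1) */
    sum += prod;               /* add(n-1) */

    return sum;
}
```

The prolog and epilog together perform the work that the kernel cannot cover at the start and end of the loop, which is why the kernel runs n - 2 times rather than n.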