Subroutines and Modules Part 3 (PIC Microcontroller)

As an example, consider a stack-oriented version of the multiplication subroutine of Program 6.5. A view of the software stack from the perspective of this new coding is shown in Fig. 6.8. Based on this diagram, in order to call up this subroutine the following procedure has to be implemented:

1. Push the Multiplicand and then Multiplier into the stack frame and call the subroutine.

2. Push zero into the next byte in the frame, which is being used for local storage.

3. Push zero twice more to create an initialized hole for the two bytes used to return the product.

4. Reset the Pseudo Stack Pointer to point to the next free byte below the frame.

The following code fragment shows how item 1 above is coded.

(a) Transfer the contents of the Pseudo Stack Pointer (PSP) to the FSR. This means that the FSR now points to the top of the new stack frame. If the subroutine is a first-level call (that is, not nested within another subroutine) then this will be 2Fh in our example.

Fig. 6.8 The stack frame viewed from the perspective of subroutine MUL_S.

(b) Copy the Multiplicand from memory (we assume it is at File 46h as in our last example) into W and then indirectly into the frame using INDF as the target. Decrementing the FSR completes the push action.

(c) In a similar manner, the Multiplier is pushed into the stack.

(d) Call the subroutine.
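The push sequence in steps (a) to (c) can be modelled in a few lines of Python. This is only a sketch and not the original PIC code fragment: the `mem` array stands in for the file registers, and the Multiplier's source address (File 47h) is a hypothetical location chosen for illustration. The addresses 2Fh and 46h come from the text.

```python
# Model of steps (a)-(c): pushing two bytes into a software stack frame.
MULTIPLICAND_ADDR = 0x46   # from the text
MULTIPLIER_ADDR = 0x47     # assumption: the text does not give this address

def push_arguments(mem, psp):
    """Push Multiplicand then Multiplier; return the FSR value afterwards."""
    fsr = psp                      # (a) copy the Pseudo Stack Pointer to the FSR
    for source in (MULTIPLICAND_ADDR, MULTIPLIER_ADDR):
        w = mem[source]            # copy the operand into W
        mem[fsr] = w               # store indirectly through the FSR (INDF)
        fsr -= 1                   # decrementing the FSR completes the push
    return fsr

mem = bytearray(0x50)              # model of data memory, File 00h to 4Fh
mem[MULTIPLICAND_ADDR] = 12
mem[MULTIPLIER_ADDR] = 5
fsr = push_arguments(mem, psp=0x2F)
print(hex(fsr), mem[0x2F], mem[0x2E])
```

After the two pushes the Multiplicand sits at 2Fh, the Multiplier at 2Eh, and the FSR points to 2Dh, the byte that the subroutine will use for local storage.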

Coding of the subroutine MUL_S is given in Program 6.6. This implements items 2-4 of Fig. 6.8. Firstly, MULTIPLICANDS, the temporary storage of the multiplicand overflow, is zeroed and then zero is pushed into the next two locations to create the initial value for the product. In item 4 the Pseudo Stack Pointer is reset to point to the next free byte below the frame. In this way, should the subroutine need to call another, there will be a new frame available for that next-level storage, with the new top of frame (TOF) beginning just below the old frame. These two instructions may be omitted in this case as there are no further nested calls, although Task 3C will then have to be altered. This is done in Example 6.6.

The core of the subroutine, that is Task 3, is similar to Program 6.5, but the FSR has to be moved up and down the frame to access the appropriate level. The only non-obvious use of the FSR is at Task 3C. As there are two ways into this routine, depending on whether the shifted multiplicand is added to the product or not, the state of the FSR is unknown. It can, however, be reset from the PSP, which is pointing to just below the frame at this point. By adding five to this PSP value, the FSR will always point to MULTIPLICAND.

Finally the subroutine ‘cleans up’ the stack by updating the Pseudo Stack Pointer to its previous value. In this case this is done by adding five, but in general by adding the frame depth n.
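The frame handling described above can be sketched as a Python model. This is an illustration under stated assumptions, not a transcription of Program 6.6: the loop is a conventional shift-and-add that may differ in detail from Program 6.5, the ordering of the two product bytes within the frame is assumed, and list indices take the place of FSR movements. The offset of five from the PSP to MULTIPLICAND follows the text.

```python
FRAME_DEPTH = 5   # multiplicand, multiplier, overflow byte, two product bytes

def mul_s(mem, psp):
    """Multiply the two bytes at the top of the frame; return the restored PSP."""
    mem[psp - 2] = 0               # item 2: zero the multiplicand overflow byte
    mem[psp - 3] = 0               # item 3: zero the product, high byte (assumed order)
    mem[psp - 4] = 0               #         and low byte
    psp -= FRAME_DEPTH             # item 4: PSP now points just below the frame

    multiplicand, multiplier = psp + 5, psp + 4
    overflow, prod_hi, prod_lo = psp + 3, psp + 2, psp + 1

    while mem[multiplier]:
        carry = mem[multiplier] & 1
        mem[multiplier] >>= 1      # shift the multiplier right
        if carry:                  # add the 16-bit shifted multiplicand to the product
            total = ((mem[prod_hi] << 8) | mem[prod_lo]) \
                  + ((mem[overflow] << 8) | mem[multiplicand])
            mem[prod_hi], mem[prod_lo] = (total >> 8) & 0xFF, total & 0xFF
        shifted = ((mem[overflow] << 8) | mem[multiplicand]) << 1
        mem[overflow], mem[multiplicand] = (shifted >> 8) & 0xFF, shifted & 0xFF

    return psp + FRAME_DEPTH       # clean up: restore the PSP to its previous value

mem = bytearray(0x50)
mem[0x2F], mem[0x2E] = 12, 5       # operands already pushed by the caller
psp = mul_s(mem, 0x2F)
product = (mem[0x2C] << 8) | mem[0x2B]
print(product, hex(psp))
```

Because every access is made relative to the PSP, the same routine works unchanged wherever the frame happens to sit, which is the reusability benefit the stack model is buying.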

Program 6.6 requires 45 instructions as compared to 20 in Program 6.5. Its worst-case execution time of 274 cycles also compares unfavorably with 142 cycles. Thus in all respects except reusability and robustness this stack-based model is clearly inferior. It may be more economical in its use of scarce data memory in large software systems. However, programs running on low- and mid-range PICs are often not very complex. Furthermore, the small Program memory (1024 instructions for the PIC16F84) may further restrict the use of this relatively extravagant technique. Where real-time execution is critical, the additional burden of stack handling is unlikely to be worthwhile.

Program 6.6 Implementing a byte multiply using a stack model.

Examples

Example 6.1

The binary series approximation to the fraction 1/3 is:

$$\frac{1}{3} \approx \frac{1}{2} - \frac{1}{4} + \frac{1}{8} - \frac{1}{16} + \frac{1}{32} - \frac{1}{64} + \frac{1}{128} - \cdots$$

Using this series, write a subroutine that will divide a byte in the Working register by three, with the quotient being returned in the same W register. The actual value of the summation up to 1/128 is 0.3359375, which is within 0.78% of the exact value. With an 8-bit datum there is no point in including any further elements in the series, but where 16-bit operands are being processed, further elements up to the desired accuracy can be summed in the same manner.

Solution

The fractions 1/2, 1/4, 1/8 and so on can easily be generated by shifting right.

The coding listed in Program 6.7 simply shifts the number right repeatedly, working on a copy held in a temporary location at File 33h. Notice that it is necessary to clear the Carry flag before each shift, as both the previous Rotate and Subtract instructions can alter its state. As the subwf instruction subtracts the contents of W (the sub-quotient) from the datum in the specified file register, the outcome being built up in W will oscillate in sign as desired. In situations where the signs of the series elements are not so regular, a further file register must be used to build up the quotient. The coding in Program 6.7 can be considered as the shift-and-subtract analog of the shift-and-add process outlined in Program 6.5 above. Execution takes 27 cycles including call and return. It can easily be extended to generate the fraction 2/3 by omitting the first shift right, thereby effectively multiplying the series by two. The rest of the program remains the same.

Program 6.7 Dividing by three
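As a rough check of the arithmetic, the shift-and-subtract sequence described above can be modelled in Python. This is a sketch of the technique rather than the PIC listing itself: it assumes seven shifts (terms 1/2 to 1/128) and models subwf as W = file − W in 8-bit wrap-around arithmetic. The names `div3` and `temp` are illustrative.

```python
def div3(n):
    """Approximate n/3 for an 8-bit n using the series 1/2 - 1/4 + ... + 1/128."""
    temp = n & 0xFF              # copy the datum to a temporary location
    temp >>= 1                   # Carry cleared, then shift right: n/2
    w = temp                     # first term goes straight into W
    for _ in range(6):           # remaining terms n/4 down to n/128
        temp >>= 1               # Carry cleared, then shift right again
        w = (temp - w) & 0xFF    # subwf: W = temp - W, so the sign alternates
    return w

# Sum of the seven series terms used for an 8-bit datum
series_sum = sum((-1) ** k / 2 ** (k + 1) for k in range(7))

print(div3(96), div3(255), series_sum)
```

The intermediate value in W alternates between a positive and a negative (two's complement) partial sum. Because the number of terms is odd, the final value is positive.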

Example 6.2

Write a subroutine to give a fixed 208 µs delay. Assume a 4 MHz processor clock rate.

Solution

For a short time period like this, the simple delay-loop code fragment introduced earlier is adequate. The solution shown in Program 6.8 is identical to this coding terminated by a return instruction.

Program 6.8 Coding a 208 µs delay.

In order to calculate the parameter the time equation is:

2 (call) + 1 + 1 + (N x 3 – 1) + 2 = 208 µs

N x 3 = 203

N = 67 (rounding down)

This gives a total delay of 206 µs. Adding two nop instructions just before the return instruction will add the two extra µs.
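The timing equation is easy to check numerically. The sketch below assumes that the two single-cycle terms are the instructions that set up the loop count and that the final 2 is the return, which the text does not state explicitly. At a 4 MHz clock one instruction cycle takes 1 µs, so cycles and microseconds are interchangeable here.

```python
def delay_cycles(n, nops=0):
    """Cycle count for the single-loop delay subroutine with loop count n."""
    call = 2                 # call instruction
    setup = 1 + 1            # assumed: two instructions that load the count
    loop = n * 3 - 1         # three cycles per pass, one fewer on the last pass
    ret = 2                  # assumed: the return instruction
    return call + setup + loop + ret + nops

n = (208 - 5) // 3           # largest whole N that does not overshoot 208
print(n, delay_cycles(n), delay_cycles(n, nops=2))
```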

Example 6.3

At the other end of the spectrum, write a subroutine to give a delay of one second.

Solution

For a delay as long as this we need to extend Program 6.1 to use a larger count. In the coding of Program 6.9 three file registers are used to give a triple loop.

File register COUNT2 is initialized to the value H whilst the other two file registers are cleared, giving each an effective count of 256. The outer loop exits when COUNT2 reaches zero; this single pass therefore contributes (H x 3) – 1 cycles to the total delay. Each pass through the COUNT1 loop takes (256 x 3) – 1 cycles and there are H passes, giving a contribution of H x [(256 x 3) – 1] cycles. In the same manner, the inner loop based on decrementing COUNT0 runs H x 256 times, each pass contributing (256 x 3) – 1 cycles to the total. Thus we have a total delay of approximately:

$$(H \times 3 - 1) + H \times [(256 \times 3) - 1] + H \times 256 \times [(256 \times 3) - 1] \approx 197{,}122 \times H \text{ cycles}$$

Equating this to $10^6$ (µs per second) gives $H \approx 5$. The actual delay with an H of 5 is 0.985615 s, which is accurate to 1.4%. If desired, the shortfall of 14,385 cycles can be made up by adding nop instructions to the middle count loop. Each such instruction gives an extra 1280 cycles. The maximum delay possible with this program is 50.46 s for H = 0 (effectively 256).

Program 6.9 A 1-second delay program.
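The three loop contributions can be summed in a short Python sketch to check the figures quoted above. The function covers only the loops. The quoted delay of 985,615 cycles is six cycles more than the loop total for H = 5, which is assumed here to be call, return and initialization overhead.

```python
def loop_cycles(h):
    """Cycles spent in the three nested loops for an initial COUNT2 value h."""
    h = h or 256                           # a count of zero behaves as 256
    pass_cycles = 256 * 3 - 1              # one full pass of a 256-count loop
    outer = h * 3 - 1                      # COUNT2 loop
    middle = h * pass_cycles               # COUNT1 loop, h passes
    inner = h * 256 * pass_cycles          # COUNT0 loop, h x 256 passes
    return outer + middle + inner

h = round(1_000_000 / 197_122)             # nearest whole H for one second
shortfall = 1_000_000 - 985_615            # cycles missing from a full second
nop_gain = 256 * h                         # cycles gained per nop in the middle loop
print(h, loop_cycles(h), shortfall, nop_gain, loop_cycles(0) / 1e6)
```

Eleven nops in the middle loop would recover 14,080 of the missing cycles, leaving a residue of a few hundred cycles to be trimmed elsewhere if that level of accuracy matters.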

Example 6.4

Design a subroutine to convert a binary byte passed in W to a 3-digit BCD equivalent in HUNDREDS (File 30h), TENS (File 31h) and UNITS (File 32h).

Solution

We have already coded a routine to implement this mapping in Example 5.3. However this was restricted to a range 0-99, that is two digits. Nevertheless we can extend the technique used there by first subtracting and counting hundreds from the original binary byte. After this has been computed then the residue will be less than 100 and the rest of the coding will be the same, as shown in Program 6.10. Thus a suitable task list would be:

1. Divide by 100; the quotient is the hundreds digit.

2. Divide the remainder by ten; the quotient is the tens digit.

3. The remainder is the units digit.

Program 6.10 Binary to 3-digit BCD conversion.
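The task list can be sketched in Python using division by repeated subtraction and counting, as described in the solution. This is a model of the method only: the function name `bin_to_bcd` is illustrative, and the three returned values correspond to HUNDREDS, TENS and UNITS.

```python
def bin_to_bcd(w):
    """Convert a byte (0-255) to (hundreds, tens, units) by repeated subtraction."""
    hundreds = tens = 0
    while w >= 100:          # subtract and count hundreds
        w -= 100
        hundreds += 1
    while w >= 10:           # the residue is below 100: subtract and count tens
        w -= 10
        tens += 1
    return hundreds, tens, w # what remains is the units digit

print(bin_to_bcd(255), bin_to_bcd(99), bin_to_bcd(7))
```

For a byte input the first loop runs at most twice and the second at most nine times, so the conversion time is bounded and small.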
