Subroutines and Modules Part 1 (PIC Microcontroller)

Good software should be configured as a set of interacting modules rather than one large program working straight through from beginning to end. There are many advantages to modular programming, which is almost mandatory when code lengths exceed a few hundred lines or when a project is being developed by a team.

What form should such modules take? To answer this question we will look at the program structures designed to facilitate this modular approach and the instructions associated with them. After completing this topic you will:

• Appreciate the need for modular programming.

• Have an understanding of the structure of the stack and its use in the call-return subroutine and interrupt mechanisms.

• Understand the term nested subroutine.

• See how parameters can be passed to a subroutine, by copy or reference, and altered or returned to the caller.

• Be able to write a subroutine having a minimal impact on its environment.

• Be able to synthesize a software stack to open and close a frame in the Data store to pass parameters and provide a temporary workspace.

Take a look at the inside of your personal computer. It will probably look something like the photograph in Fig. 6.1, with a motherboard hosting the MPU, assorted memory and other support circuitry, and a variable number of expansion sockets. Into this will be plugged a disk controller card and a video card. There may be others, such as a soundboard, modem or network card. Each of these plug-in cards has a distinct and separate logical task and they interact via the services supplied by the main board – the motherboard.


There are many advantages to this modular construction.

• Flexibility; that is, it is relatively easy to upgrade or reconfigure by adding or replacing plug-in cards.

• Cards can be reused from previous systems.

• Standard boards can be bought in, or specialist boards designed in-house.

• Easy to maintain.


Fig. 6.1 Modular hardware implementing a PC.

Of course there are a few disadvantages. A fully integrated motherboard is smaller and potentially cheaper than an equivalent mother/daughterboard configuration. It is also likely to be more reliable, as input and output signals do not have to traverse sockets/plugs. However, when they do occur, faults are often more difficult to track down and rectify.

Modular programming uses the same principle to construct “software circuits”, i.e. programs. A formal definition of modular programming is:

An approach to programming in which separate logical tasks are programmed separately and joined later.

Thus to write a program in a modular fashion we need to decompose the specification into a number of stand-alone routines, each implementing a well-defined task. Such a module should be relatively short, be well documented and easy for a human, not necessarily the original programmer, to understand.

The advantages of a modular program are similar to those for modular hardware, but even more compelling:

• Each module can be tested, debugged and maintained on a stand-alone basis. This makes for overall reliability.

• Modules can be reused from previous projects or bought in from outside.

• A program is easier to update by changing modules.

Deciding how to segment a program into individual stand-alone tasks is where the real expertise lies. The actual coding of such tasks as subprograms is no different from the examples we have given in previous topics, such as that shown in Program 5.11. There are a few additional instructions associated with such subprograms, and these are listed in Table 6.1. We will look at these and some useful techniques in constructing software in the remainder of the topic.

Table 6.1: Subroutine and interrupt handling instructions.

Operation            Mnemonic    Description
Call                             Transfer to subroutine
  Call subroutine    call aaa    Push PC on to stack, PC <- <aaa>
Return                           Transfer back to caller
  from subroutine    return      Pull original PC back from stack
                     retlw       Put literal in W and return as above

Program modules may be entered by calling from other software or by a hardware event external to the processor. This may be a voltage at one of the processor pins or an internal peripheral interface wanting service, such as the Timer overflowing. In the former case modules at assembly level are universally known as subroutines, as they are in some high-level languages such as FORTRAN and BASIC. In the latter they are classified as interrupt service routines or interrupt handlers. The techniques for writing these interrupt modules, and for entering and leaving them, are sufficiently different to warrant a separate treatment in topic 7. Here we will look at subroutines.

Subroutines are the analog of hardware plug-in cards. Consider the situation where a 0.1 second (100 ms) delay task is to be implemented. This may be needed to alert an aircraft pilot to look at the control panel warning lights for various scenarios (such as low fuel or overheating) by sounding a buzzer for a short time. In a modular program, this delay would be implemented by coding a 100 ms subroutine which would be called by the main program as necessary. This is represented diagrammatically in Fig. 6.2.

In essence, calling up a subroutine involves nothing more than placing the address of the first subroutine instruction in the Program Counter (PC); that is doing a goto. Thus, if our delay subroutine were located at 400h, then goto 400h would seem to do the trick. Assuming the programmer has labelled the subroutine entry point, usually the first instruction, DELAY_100MS, as in Program 6.1, then we have goto DELAY_100MS.


Fig. 6.2 Subroutine calling.

The problem really is how to get back again! Somehow the MPU has to remember from where in the caller program the subroutine was entered so that it can return to the next instruction in the caller’s sequence. This can be seen in Fig. 6.2, where the jumping-off point can be from anywhere in the main program, or indeed from another subroutine – the latter process is called nesting – see Fig. 6.4.

One possibility is to place this address in a designated Address register or memory location prior to jumping off. This can then be moved back into the PC at the end of the subroutine as the return mechanism. This approach breaks down whenever one subroutine wishes to call another. Then the secondary subroutine will overwrite the return address of the first, and the main program can never be regained. To get around this problem, more than one register or memory location could be used to hold a stack of return addresses. This last-in first-out stack structure is shown in Fig. 6.3(a).
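The difference between a single return-address register and a stack can be illustrated with a small model. The following Python sketch is only an illustration of the argument above, not PIC code; the two addresses are hypothetical values chosen for the example.

```python
# Hypothetical addresses, chosen only for illustration.
MAIN_RETURN = 0x005   # instruction after the main program's call to SR1
SR1_RETURN = 0x105    # instruction after SR1's call to SR2

# Scheme 1: a single designated return-address register.
link = MAIN_RETURN        # main program calls SR1
link = SR1_RETURN         # SR1 calls SR2, overwriting the first address
sr2_returns_to = link     # SR2 returns correctly into SR1...
sr1_returns_to = link     # ...but SR1 can no longer find the main program

# Scheme 2: a last-in first-out stack of return addresses.
stack = []
stack.append(MAIN_RETURN)             # main program calls SR1
stack.append(SR1_RETURN)              # SR1 calls SR2
stack_sr2_returns_to = stack.pop()    # SR2 returns into SR1
stack_sr1_returns_to = stack.pop()    # SR1 returns into the main program

print(hex(sr1_returns_to), hex(stack_sr1_returns_to))
```

With the single register, the address of the main program's next instruction is lost as soon as the second call is made; with the stack, each return pulls the most recently stored address.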

The 14-bit core PICs have a stack of eight 13-bit registers which are exclusively used to hold subroutine return addresses. This structure, shown in Fig. 6.3, is known as a hardware stack. This stack is outside the PIC’s normal memory map, so its contents cannot be altered by any normal process.

Associated with this stack is a 3-bit counter which points to the next available register in the stack. This Stack Pointer (SP) cannot be explicitly altered by any instruction but it is automatically incremented each time a call instruction is executed. call is similar to a goto instruction, but before the specified instruction address is put into the Program Counter the current value of PC is pushed into the stack. This is the address of the instruction after the call instruction, as the PC has already been incremented and the PIC is fetching this next instruction into the pipeline at the same time as the call instruction is being executed.

Fig. 6.3 Using the hardware stack to hold return addresses.

In Fig. 6.3(b) the situation is shown after a call to a subroutine labelled DELAY_100MS. The execution sequence of this call DELAY_100MS is:

1. Copy the 13-bit contents of the PC into the stack register pointed to by the Stack Pointer. This will be the address of the instruction following the call instruction.

2. The Stack Pointer is incremented.

3. The destination address DELAY_100MS, that is the location of the entry point instruction of the subroutine, overwrites the original state of the PC. Effectively this causes the program execution to transfer to the subroutine.

Apart from the pushing of the return address into the stack in steps 1 and 2, call acts exactly like a plain goto. Thus call requires two bus cycles for execution, as the pipeline needs to be flushed to remove the next caller instruction which is already in situ. This similarity also applies to the extension of the 11-bit absolute address in the call instruction word to a 13-bit Program store address using bits 3 & 4 of the PCLATH register, as shown in Fig. 5.4. Far calls to subroutines located above 07FFh (that is, from 0800h to 1FFFh) are not required for the PIC16F84 and the majority of other 14-bit core PICs that have Program store capacities of not more than 2048 instructions.
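The address extension can be expressed as simple bit arithmetic. The following Python sketch models how the 13-bit destination is formed from the 11-bit operand and bits 3 and 4 of PCLATH; the function name call_target is our own, chosen only for this illustration.

```python
def call_target(operand_11bit, pclath):
    """Form the 13-bit destination address of a call or goto."""
    if not 0 <= operand_11bit <= 0x7FF:
        raise ValueError("operand must fit in 11 bits")
    page_bits = (pclath >> 3) & 0b11          # PCLATH bits 4 and 3
    return (page_bits << 11) | operand_11bit  # they become PC bits 12 and 11

print(hex(call_target(0x400, 0x00)))   # 0x400
print(hex(call_target(0x400, 0x08)))   # 0xc00
print(hex(call_target(0x7FF, 0x18)))   # 0x1fff
```

With both PCLATH bits at zero, the destination is simply the 11-bit operand, which is why devices with no more than 2048 instructions never need to alter these bits.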

The exit point from the subroutine should be a return instruction. This reverses the push action of call and pulls the return address back from the stack into the PC – as shown in Fig. 6.3(c). The execution sequence of return is:

1. Decrement the Stack Pointer.

2. Copy the address in the stack register pointed to by the Stack Pointer into the Program Counter.

Thus no matter where the subroutine was called from, it will return to the instruction just past the original call instruction when the subroutine has been completed.
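The call and return sequences listed above can be mimicked with a toy model. This Python sketch assumes the Stack Pointer is incremented on a call and decremented on a return, as described in the text; the class name HardwareStack and the caller address 010h are hypothetical, while 400h is the subroutine location used earlier.

```python
class HardwareStack:
    """Toy model of the eight-level return-address stack."""

    def __init__(self):
        self.regs = [0] * 8   # eight 13-bit registers
        self.sp = 0           # 3-bit pointer to the next available register

    def call(self, pc_of_call, target):
        """Push the address following the call, then give the new PC."""
        self.regs[self.sp] = (pc_of_call + 1) & 0x1FFF
        self.sp = (self.sp + 1) & 0b111
        return target

    def ret(self):
        """Pull the most recent return address, which becomes the new PC."""
        self.sp = (self.sp - 1) & 0b111
        return self.regs[self.sp]


stack = HardwareStack()
pc = stack.call(pc_of_call=0x010, target=0x400)   # call DELAY_100MS
print(hex(pc))    # 0x400
pc = stack.ret()  # return
print(hex(pc))    # 0x11
```

After the return, the Program Counter holds the address of the instruction following the call and the Stack Pointer is back where it started.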

retlw (Return Literal value in W) is similar to the plain return instruction but places the specified constant byte in the Working register. Thus retlw -1 could be used to return with W set to FFh (-1 decimal) to, say, indicate an error situation. Both return instructions flush the pipeline and therefore take two bus cycles to execute.


Fig. 6.4 Nested subroutines.

The beauty of the stack mechanism is its handling of nested subroutines. Consider the situation in Fig. 6.4 where the main program calls the first-level subroutine SR1, which in turn calls the second-level subroutine SR2. In order eventually to get back to the main program, the outward progression sequence must be exactly matched by the inward path. This pattern is matched by the last-in first-out (LIFO) structure of the stack mechanism, which can automatically handle any arbitrary nesting sequence to a depth of up to eight subroutines. It can even handle the (painful) situation where a subroutine calls itself! Such a subroutine is known as recursive. As we shall see in topic 7, the stack mechanism is also used to handle interrupts. Thus, in a system using both subroutines and interrupts the nesting depth available to subroutines will be somewhat less. The technique is so useful that virtually all MPU/MCUs support subroutines in this manner.
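The depth limit can be explored with a short simulation. The Python sketch below makes a number of nested calls and then unwinds them, using small integers as stand-ins for return addresses. It assumes that the 3-bit Stack Pointer simply wraps around after the eighth register; the function name nested_returns is hypothetical.

```python
def nested_returns(depth, levels=8):
    """Make `depth` nested calls, then unwind; list the addresses pulled."""
    regs = [None] * levels
    sp = 0
    for level in range(depth):       # calls on the outward path
        regs[sp] = level             # stand-in for a return address
        sp = (sp + 1) % levels
    pulled = []
    for _ in range(depth):           # returns on the inward path
        sp = (sp - 1) % levels
        pulled.append(regs[sp])
    return pulled

print(nested_returns(3))   # [2, 1, 0]
print(nested_returns(9))   # [8, 7, 6, 5, 4, 3, 2, 1, 8]
```

With three levels the returns come back in exactly the reverse order of the calls. With nine levels, under the wrap-around assumption, the ninth call overwrites the first return address, so the final return cannot regain the original caller.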

As the stack and Stack Pointer mechanism is part of the PIC’s hardware and requires no initialization, from the programmer’s perspective only the following points are relevant:

• The subroutine should be invoked using the call instruction.

• The entry point to a subroutine should be labelled, and this label is then the name of that subroutine.

• The exit point from the subroutine should be either return or retlw, with the latter being used when a known constant is to be in the Working register on return – see Program 6.4.

As an example, let us code the 0.1 second (100 ms) delay subroutine of Fig. 6.2. Creating a delay in software is simply a matter of doing nothing for the appropriate duration. A common way of doing this is to count down an initial constant to zero. By choosing an appropriate constant, the delay can be tailored to the desired value. Obviously, this delay will depend on the PIC’s oscillator rate. For the examples in this topic we will assume a clock rate of 4 MHz, giving a bus cycle of 1 µs. Consider the single-byte count code fragment:

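[Listing not reproduced: a two-instruction loop in which decfsz decrements COUNT and a goto branches back to it until COUNT reaches zero.]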

This continually decrements the initial setting N of the register file labelled COUNT, dropping out when it reaches zero. The total delay is contributed by both instructions as:

1. The decfsz takes one cycle to execute, except when N reaches zero, in which case two cycles are needed as the pipeline needs to be flushed. This gives a total execution time of [(N - 1) x 1] + 2.

2. As the loop exits by skipping over the goto instruction, this is only executed N – 1 times; each time taking two bus cycles. This contribution is thus (N – 1) x 2.

The total delay is thus:

[(N - 1) x 1] + 2 + [(N - 1) x 2] = (3 x N) - 1 cycles

The maximum delay is when COUNT is initialized to zero, which gives an effective value for N of 256. In this situation the number of cycles in this code fragment is 767, plus the few cycles for calling the subroutine (2 cycles), clearing COUNT (1 cycle) and returning from the subroutine (2 cycles), giving a total of 772 cycles. With a clock rate of 4 MHz this is a delay of 772 µs.
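The cycle count for the single-byte loop can be checked by stepping through it. The following Python sketch models the decfsz and goto pair using the timings given above (one cycle for decfsz, two when it skips, two for goto); the function name single_loop_cycles is our own.

```python
def single_loop_cycles(initial_count):
    """Count bus cycles for a decfsz COUNT / goto loop."""
    count = initial_count & 0xFF          # COUNT is an 8-bit register
    cycles = 0
    while True:
        count = (count - 1) & 0xFF        # decfsz decrements first
        if count == 0:
            return cycles + 2             # skip taken: pipeline flushed
        cycles += 1                       # decfsz, no skip
        cycles += 2                       # goto back to the decfsz

print(single_loop_cycles(10))             # 29
print(single_loop_cycles(0))              # 767
print(single_loop_cycles(0) + 2 + 1 + 2)  # 772
```

The first result agrees with (3 x N) - 1 for N = 10, and an initial value of zero behaves as N = 256. Adding the call, the clearing of COUNT and the return gives the 772 cycles quoted above.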

Program 6.1 A 100 ms delay subroutine.


Program 6.1 uses two register files to extend this count. Whenever COUNT_L has decremented to zero, COUNT_H is decremented. This inner loop takes (256 x 3) – 1 cycles in the normal way and is repeated H times, where H is the initial setting of COUNT_H. Only when this second byte reaches zero is the outer loop exited and execution returned to the caller. This second count contributes (H x 3) – 1 cycles to the total. We thus need to determine the unknown value of H to give a total of 100,000 – 7 cycles, taking into account the call, return and the three setting up instructions.

(767 x H) + (3 x H) - 1 = 99,993, so 770 x H = 99,994 and H = 129.86, which rounds to 130.

This gives an actual delay of 100.107 ms; an error of less than 0.11%. Each incremental change in H gives an alteration of ±770 µs.

The maximum delay available with this structure is 197 ms; however, adding one or more nop instructions at the beginning of the inner loop will increase the total delay by H x 256 cycles for each nop, and the same technique used in the outer loop adds H cycles for fine tuning. Example 6.3 gives an example of a very long triple count delay subroutine.
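The choice of H can be checked numerically. This Python sketch uses only the cycle contributions stated in the text: (256 x 3) - 1 cycles per inner pass, (H x 3) - 1 cycles for the outer count and 7 cycles of overhead. The names INNER, OVERHEAD and delay_cycles are our own, and the model is an approximation rather than a trace of the actual listing.

```python
INNER = 256 * 3 - 1    # cycles for one full pass of the inner loop
OVERHEAD = 7           # call, return and three setting up instructions
TARGET = 100_000       # 100 ms at 1 us per bus cycle

def delay_cycles(h):
    """Total bus cycles for an initial COUNT_H of h (1 to 256)."""
    return h * INNER + (h * 3 - 1) + OVERHEAD

# Pick the H whose delay lies closest to the target.
best_h = min(range(1, 257), key=lambda h: abs(delay_cycles(h) - TARGET))

print(best_h)                                   # 130
print(delay_cycles(best_h))                     # 100106
print(delay_cycles(131) - delay_cycles(130))    # 770
print(delay_cycles(256))                        # 197126
```

Under this model H = 130 gives 100,106 cycles, within a microsecond of the figure quoted above. Each step in H changes the delay by 770 cycles, and the largest setting gives about 197 ms.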

Our 100 ms delay program is an example of a double-void subroutine, in that no parameters (cf. signals in our hardware analog) are sent to it and nothing is returned – just the side effect of a delay (and the alteration of two register files, W and some Status register flags). Most subroutines process parameters made available at entry time and provide data at return time.

As a simple example, consider the extension of Program 6.1 to give a delay of K x 100 ms, where K is a byte parameter ‘sent’ by the caller. The system view of this function is shown in Fig. 6.5 as a single input signal of range 1 to 256, with no output signal, that is, with a void output. This diagram also documents the location of all local variables used internally by the subroutine. This latter attribute is useful in checking for multiple usage of a file register between different subroutines and callers. Notice the double-line vertical borders commonly used in flow diagrams to denote modules or subroutines.


Fig. 6.5 System view of K x 100 ms delay subroutine.

As there is only one byte-sized input parameter, the most convenient place to put K is in the Working register. Thus to call up a 5-second delay, the caller could use the sequence:

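[Listing not reproduced: the caller loads K = 50 into the Working register with movlw and then calls the subroutine.]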

The actual subroutine itself in Program 6.2 implements the task list:

1. DO:

(a) Delay 100ms.

(b) Decrement K.

2. WHILE (K > 0).

3. End.

The actual coding simply copies the parameter from W into File 32h before entering the following delineated coding, which is identical to Program 6.1 and gives a single 100 ms delay. On completion of the delay K is decremented in situ and the delay block repeated until K reaches zero. Thus the 100 ms block is repeated K times.
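The task list can be modelled to see how many times the 100 ms block runs for a given K. The Python sketch below follows the DO...WHILE structure, with K held in an 8-bit register; the function name block_repeats is our own and the loop's few cycles of overhead are ignored.

```python
def block_repeats(k):
    """Number of times the 100 ms block runs for a byte parameter k."""
    k &= 0xFF                   # K arrives in the 8-bit Working register
    repeats = 0
    while True:
        repeats += 1            # (a) delay 100 ms
        k = (k - 1) & 0xFF      # (b) decrement K in situ
        if k == 0:              # WHILE (K > 0) fails: end
            return repeats

print(block_repeats(50) * 100, "ms nominal")   # 5000 ms nominal
print(block_repeats(1))                        # 1
print(block_repeats(0))                        # 256
```

Because the test comes after the decrement, the block always runs at least once, and a parameter of zero wraps round to give 256 repeats. This matches the input range of 1 to 256 shown in the system view.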

Program 6.2 A K x 100 ms delay subroutine.

