Pipelining: Basic and Intermediate Concepts - Computer Architecture: A Quantitative Approach

A Simple Implementation Of MIPS

In this section we follow the style of Section C.1, showing first a simple unpipelined implementation and then the pipelined implementation. This time, however, our example is specific to the MIPS architecture.

In this subsection, we focus on a pipeline for an integer subset of MIPS that consists of load-store word, branch equal to zero, and integer ALU operations. Later in this appendix we will incorporate the basic floating-point operations. Although we discuss only a subset of MIPS, the basic principles can be extended to handle all the instructions. We initially use a less aggressive implementation of the branch instruction and show how to implement the more aggressive version at the end of this section.

Every MIPS instruction can be implemented in at most 5 clock cycles. The 5 clock cycles are as follows:

1. Instruction fetch cycle (IF):

IR ← Mem[PC];

NPC ← PC + 4;

Operation: Send out the PC and fetch the instruction from memory into the instruction register (IR); increment the PC by 4 to address the next sequential instruction. The IR is used to hold the instruction that will be needed on subsequent clock cycles; likewise, the register NPC is used to hold the next sequential PC.
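As a rough illustration of the two register transfers in the IF cycle, the following Python sketch models the fetch step. The dictionary-based memory, the function name `instruction_fetch`, and the sample instruction word are assumptions made for this example; they do not come from the text.

```python
def instruction_fetch(mem, pc):
    """Model of the IF cycle: IR <- Mem[PC]; NPC <- PC + 4.

    `mem` is a hypothetical dict mapping byte addresses (multiples of 4)
    to 32-bit instruction words.
    """
    ir = mem[pc]                  # IR <- Mem[PC]
    npc = (pc + 4) & 0xFFFFFFFF   # NPC <- PC + 4, wrapped to 32 bits
    return ir, npc


# Hypothetical memory image holding one instruction word at address 0x100.
memory = {0x100: 0x8C220008}      # encodes lw $2, 8($1)
ir, npc = instruction_fetch(memory, 0x100)
```

Here `ir` holds the fetched word and `npc` holds the address of the next sequential instruction, mirroring the roles of the IR and NPC registers.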

2. Instruction decode/register fetch cycle (ID):

A ← Regs[rs];

B ← Regs[rt];

Imm ← sign-extended immediate field of IR;

Operation: Decode the instruction and access the register file to read the registers (rs and rt are the register specifiers). The outputs of the general-purpose registers are read into two temporary registers (A and B) for use in later clock cycles. The lower 16 bits of the IR are also sign extended and stored into the temporary register Imm, for use in the next cycle.

Decoding is done in parallel with reading registers, which is possible because the register specifier fields are at a fixed location in the MIPS instruction format. Because the immediate portion of an instruction is located in an identical place in every MIPS format, the sign-extended immediate is also calculated during this cycle in case it is needed in the next cycle.
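The ID cycle can be sketched in Python as below. The bit positions used (rs in bits 25..21, rt in bits 20..16, immediate in bits 15..0) follow the standard MIPS instruction format, which the text refers to but does not spell out. The names `sign_extend16` and `instruction_decode`, the list-based register file, and the sample values are assumptions for illustration only.

```python
def sign_extend16(value):
    """Sign-extend the low 16 bits of `value` to a signed Python int."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def instruction_decode(ir, regs):
    """Model of the ID cycle:
    A <- Regs[rs]; B <- Regs[rt]; Imm <- sign-extended immediate field of IR.
    """
    rs = (ir >> 21) & 0x1F    # register specifier rs, bits 25..21
    rt = (ir >> 16) & 0x1F    # register specifier rt, bits 20..16
    a = regs[rs]              # A <- Regs[rs]
    b = regs[rt]              # B <- Regs[rt]
    imm = sign_extend16(ir)   # Imm <- sign-extended low 16 bits of IR
    return a, b, imm


# Hypothetical register file contents.
regs = [0] * 32
regs[1] = 0x1000
regs[2] = 7
a, b, imm = instruction_decode(0x8C22FFFC, regs)   # lw $2, -4($1)
```

Because the field positions are fixed, the register reads and the sign extension do not depend on the opcode, which is why the sketch can perform them unconditionally, just as the hardware does before it knows the instruction type.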

3. Execution/effective address cycle (EX):

The ALU operates on the operands prepared in the prior cycle, performing one of four functions depending on the MIPS instruction type:

■ Memory reference:

ALUOutput ← A + Imm;

Operation: The ALU adds the operands to form the effective address and places the result into the register ALUOutput.

■ Register-register ALU instruction:

ALUOutput ← A func B;

Operation: The ALU performs the operation specified by the function code on the value in register A and on the value in register B. The result is placed in the temporary register ALUOutput.
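The memory-reference and register-register cases of the EX cycle can be sketched as follows. The table `ALU_FUNCS` covers only a handful of function codes (those of the standard MIPS `addu`, `subu`, `and`, and `or` instructions) and, like the function names and the 32-bit wraparound, is an assumption made for this example rather than something specified in the text.

```python
import operator

MASK32 = 0xFFFFFFFF

# Hypothetical table mapping a few MIPS function codes to Python operations.
ALU_FUNCS = {
    0x21: operator.add,    # addu
    0x23: operator.sub,    # subu
    0x24: operator.and_,   # and
    0x25: operator.or_,    # or
}


def ex_memory_reference(a, imm):
    """ALUOutput <- A + Imm: form the effective address (32-bit wraparound)."""
    return (a + imm) & MASK32


def ex_register_register(a, b, func):
    """ALUOutput <- A func B: apply the operation selected by the function code."""
    return ALU_FUNCS[func](a, b) & MASK32


addr = ex_memory_reference(0x1000, -4)     # base 0x1000 with offset -4
diff = ex_register_register(5, 7, 0x23)    # subu: 5 - 7 wraps around in 32 bits
```

Both cases take their inputs from the temporaries produced in the ID cycle (A, B, and Imm) and write a single result, which corresponds to ALUOutput.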

■ Register-Immediate ALU instruction:

ALUOutput ← A op Imm;

Operation: The ALU performs the operation specified by the opcode on the value in
