circuitcellar.com
Magazine Support   Digital Library   Products & Services   Suppliers Directory 
 
 





 

April 2000, Issue 117

Building a RISC System In AN FPGA Part 1:
Part 2: Pipeline and Control Unit Design by Jan Gray


by Jan Gray

Start Pipelined Execution Memory Accesses Branching Out Interrupts Control Unit Design Decode Stage The Execute Stage PDF

INTERRUPTS

When an interrupt request occurs, you must jump to the interrupt handler, preserve the interrupt return address, retire the current pipeline, execute the handler, and later return to the interrupted instruction.

When INTREQ is asserted, you simply override the fetched instruction with int, that is, jal r14,10(r0) via the IRMUX. This jumps to the interrupt handler at 0x0010 and leaves the return address in r14, which is reserved for this purpose. When the handler has completed, it executes iret, (i.e, jal r0,0(r14)) and exection resumes with the interrupted instruction.

There are two pipeline issues here. First, you must not interrupt an interlocked instruction sequence (any add, sub, shift, or imm followed by another instruction). If an interlocked instruction is in the DC stage, the interrupt is deferred one cycle.

Secondly, the int must not be inserted in a branch or jump shadow, lest it be annulled. If a branch or jump is in the DC stage, or if a taken branch or jump is in the EX stage, the interrupt is deferred.

The simplicity of the process pays off once again. The time to take an interrupt and then return from a null interrupt handler is only six cycles.

You might be wondering about the interrupt priorities, non-maskable interrupts, nested interrupts, and interrupt vectors. These artifacts of the fixed-pinout era need not be hardwired into our FPGA CPU. They are best done by collaboration with an on-chip interrupt controller and the interrupt handler software.

The last pipeline issue is DMA. The PC/address unit doubles as a DMA engine. Using a 16 × 16 RAM as a PC register file, you can fetch either an instruction (AN ? PC0 += 2) or a DMA word (AN ? PC1 += 2) per memory cycle.

After an instruction is fetched, if DMAREQ has been asserted, you insert one DMA memory cycle.

This PC register file costs eight CLBs for the RAM, but saves 16 CLBs (otherwise necessary for a separate 16-bit DMA address counter and a 16-bit 2-1 address mux), and shaves a couple of nanoseconds from the system’s critical path. It’s a nice example of a problem-specific optimization you can build with a customizable processor.

To recap, each instruction takes three pipeline cycles to move through the instruction fetch, operand fetch and decode, and execute pipeline stages. Each pipeline cycle requires up to three memory access cycles (mandatory instruction fetch, optional DMA, and optional EX stage load or store). Each memory access cycle requires one or more clock cycles.