circuitcellar.com

January 2006, Issue 186

Third-Generation Rabbit
A Look at the Rabbit 4000


SPEEDING UP THE BUS

The maximum clock speed has increased with each new Rabbit, to the point where memory fast enough to run with zero wait states is becoming expensive. One obvious solution to this problem was to increase the width of the data bus.

A 16-bit bus had to be an option and not a requirement, though, because the majority of Rabbit 4000 designs will still use 8-bit memories for cost reasons. This is what makes life interesting for me as a designer.

The 16-bit bus uses one of the parallel ports for the extra 8 bits of data to keep the package pin count low. Supporting byte writes on a 16-bit bus often leads to a bit of external glue logic. I wanted to avoid this, so a separate port pin can be programmed to provide the necessary byte-lane control signal for a glueless memory interface.

Rabbit wanted the option for 16-bit memories on two of the device’s three chip selects, including the chip select used as the default after reset. Because the default is an 8-bit bus, this means that the first few bytes of code in a 16-bit memory connected to this chip select must be capable of switching the processor to 16-bit mode.

An additional complication was that the code must execute identically whether the processor is in 16-bit mode or in 8-bit mode (as it is after a reset). This limited me to using only pairs of 1-byte instructions. The restriction arises because, until the switch to 16-bit mode occurs, the CPU on an 8-bit bus actually fetches each instruction twice (see Listing 1).

The code first builds 0x02, the data value that enables 16-bit operation, and stores it in B. It then builds the I/O address 0x1D in L. Next, I take advantage of the fact that multiple I/O prefixes (IOI) are interpreted as one to do the actual I/O write. The first write goes to an I/O register and the second goes to memory, but this is fine because memory writes are disabled after reset. The two NOPs allow time for the actual switch, and away we go with a 16-bit bus!
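The constraint on the startup code can be illustrated with a small Python model. This is only a sketch under an assumed wiring: a 16-bit memory in which, while the processor is still in 8-bit mode, both byte addresses of a word return the lower byte lane, so each byte is fetched twice. The function names and the opcode values are placeholders invented for the illustration; they are not taken from Listing 1.

```python
def fetch_stream(words, wide_bus, count):
    """Return the first `count` bytes fetched, starting at address 0.

    words    -- list of 16-bit memory words (low byte at the even address)
    wide_bus -- True for 16-bit mode, False for the 8-bit mode after reset
    Assumption: in 8-bit mode only the lower byte lane is visible, so both
    addresses of a word read back the same byte.
    """
    stream = []
    for addr in range(count):
        word = words[addr >> 1]
        if wide_bus and (addr & 1):
            stream.append((word >> 8) & 0xFF)  # odd address: upper byte lane
        else:
            stream.append(word & 0xFF)         # lower byte lane
    return stream


def paired(opcodes):
    """Place each 1-byte opcode in both halves of a 16-bit word."""
    return [(op << 8) | op for op in opcodes]


# Placeholder opcode values, chosen only for illustration.
boot = paired([0x11, 0x22, 0x33])
unpaired = [0x2211, 0x0033]

# Paired code looks the same to the CPU in both modes; unpaired code does not.
same = fetch_stream(boot, False, 6) == fetch_stream(boot, True, 6)
differs = fetch_stream(unpaired, False, 4) != fetch_stream(unpaired, True, 4)
print(same, differs)
```

In this model both flags come out true: the paired program yields the same byte stream before and after the switch, while ordinary packed code would be seen differently in the two modes.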

Just providing a 16-bit bus doesn’t improve performance unless the processor can actually use all 16 bits at once. Although major changes have been made to the register architecture and instruction set, the CPU itself still takes in instructions 1 byte at a time. So along with the 16-bit bus comes a 3-byte prefetch queue.

The prefetch mechanism is coupled with instruction execution, although wait states incurred while prefetching don’t slow down execution. Instead, the prefetch runs semi-autonomously, attempting to always keep at least 1 byte in the prefetch queue. But when the execution unit knows that a write operation is coming up, it tells the prefetch not to start any new reads that might slow down the write. Similarly, when a branch instruction is recognized, the prefetch is told to stop once the branch address has been completely fetched into the queue. All of this leads to a measurable performance gain with a 16-bit bus.
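To see why a wider bus helps a CPU that still consumes 1 byte at a time, here is a toy cycle-count model in Python. It is a rough sketch, not a description of the actual Rabbit 4000 timing: the one-byte-per-cycle execution rate, the rule for starting a read, and the `read_cycles` parameter are all assumptions, and the write hold-off and branch handling described above are left out.

```python
def run_cycles(n_bytes, bus_bytes, read_cycles, queue_depth=3):
    """Count cycles to execute n_bytes of straight-line code (toy model).

    bus_bytes   -- bytes delivered per memory read (1 or 2)
    read_cycles -- cycles per memory read, wait states included (assumed)
    queue_depth -- prefetch queue size; 3 bytes per the article
    """
    queued = 0    # bytes waiting in the prefetch queue
    fetched = 0   # bytes requested from memory so far
    executed = 0  # bytes consumed by the execution unit
    busy = 0      # cycles left on the read in flight
    pending = 0   # bytes the read in flight will deliver
    cycles = 0
    while executed < n_bytes:
        cycles += 1
        # Start a read when the bus is idle and the queue has room for it.
        if busy == 0 and fetched < n_bytes and queued + bus_bytes <= queue_depth:
            pending = min(bus_bytes, n_bytes - fetched)
            fetched += pending
            busy = read_cycles
        # Advance the read in flight; its bytes arrive when it completes.
        if busy > 0:
            busy -= 1
            if busy == 0:
                queued += pending
                pending = 0
        # The execution unit takes one byte per cycle when one is available.
        if queued > 0:
            queued -= 1
            executed += 1
    return cycles


narrow = run_cycles(8, bus_bytes=1, read_cycles=2)
wide = run_cycles(8, bus_bytes=2, read_cycles=2)
print(narrow, wide)
```

With these assumed numbers the 8-bit bus leaves the execution unit waiting on every byte, while the 16-bit bus keeps the queue fed. The gain shrinks as the memory gets faster relative to the CPU.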

The final performance option is support for 16-byte Page mode. Page mode memories can supply data from within the same page much faster than they can for ordinary random accesses. However, to take advantage of the faster access time, Page mode memory requires that both the chip select and the output enable remain active, with only the address changing. The Rabbit 4000 supports this Page mode operation on either an 8-bit bus or a 16-bit bus. It includes separate wait state values for initial and subsequent memory reads. Separate control bits are available for each chip select.
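The effect of separate initial and subsequent wait state values can be sketched with a few lines of Python. The model below assumes that pages are 16-byte aligned and that a read counts as "subsequent" whenever it falls in the same page as the read just before it. The wait state values passed in are made-up numbers for illustration, not Rabbit 4000 register settings.

```python
PAGE_SIZE = 16  # bytes per page, matching the 16-byte Page mode


def total_wait_states(addresses, initial_waits, subsequent_waits):
    """Sum the wait states for a sequence of reads (simplified model)."""
    total = 0
    last_page = None
    for addr in addresses:
        page = addr // PAGE_SIZE
        # Same page as the previous read: the faster in-page access applies.
        total += subsequent_waits if page == last_page else initial_waits
        last_page = page
    return total


sequential = range(0x100, 0x120)  # 32 consecutive reads spanning two pages
with_page_mode = total_wait_states(sequential, initial_waits=2, subsequent_waits=0)
without_page_mode = total_wait_states(sequential, initial_waits=2, subsequent_waits=2)
print(with_page_mode, without_page_mode)
```

In this model, sequential code pays the initial wait states only once per page, whereas reads scattered across pages get no benefit.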