circuitcellar.com
November 1998, Issue 100

Smart Rocket


by Tom Cantrell

ZIPPING IN AND OUT

The secret to SX performance is simple, relying as it does on the traditional technique of pipelining. The four-stage pipe in Figure 2 is a classic, similar to those found in earlier (but typically larger, like 32-bit) machines.


Figure 2: SX performance is obtained via pipelining using a time-honored fetch-decode-execute-writeback design. Since a pipeline can only run as fast as its slowest stage, great attention was paid to the flash-memory design to achieve the 10-ns access time required by the 50-MHz clock rate.

There is one, and only one, reason to use a pipeline and that’s to boost the clock rate, which ultimately is limited by memory access time.

In compatibility mode, the SX reverts to four clocks per instruction (eight for JMPs and CALLs), same as a PIC. Flip the turbo switch, and the pipeline kicks in.

Once filled, the pipe delivers close to one instruction per clock. However, as with all pipelined machines, there are some caveats to be aware of.

The JMP and CALL penalty is relatively worse due to the need to refill the pipe. Where such instructions require two cycles (eight clocks) in compatible mode, they need three cycles (three clocks) in turbo mode, derating the advantage to 2.66× (8 divided by 3) for those instructions versus 4× for most others.
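The arithmetic is easy to check with a few lines of Python. This is only a sketch: the clock counts come from the text, while the function names and the 20% branch mix in the last line are illustrative assumptions, not measured figures. Note that the 2.66 quoted above is 8/3 truncated, so the sketch prints the rounded 2.67.

```python
# Clocks per instruction, as quoted in the text.
COMPAT_CLOCKS = {"normal": 4, "branch": 8}   # compatibility mode (PIC-like)
TURBO_CLOCKS = {"normal": 1, "branch": 3}    # turbo mode (pipeline enabled)


def speedup(kind):
    """Turbo-mode speedup for one instruction class at the same clock rate."""
    return COMPAT_CLOCKS[kind] / TURBO_CLOCKS[kind]


def blended_speedup(branch_fraction):
    """Overall speedup for a hypothetical mix of JMP/CALL and other instructions."""
    def average_clocks(table):
        return ((1 - branch_fraction) * table["normal"]
                + branch_fraction * table["branch"])
    return average_clocks(COMPAT_CLOCKS) / average_clocks(TURBO_CLOCKS)


if __name__ == "__main__":
    print(f"most instructions: {speedup('normal'):.2f}x")
    print(f"JMP and CALL:      {speedup('branch'):.2f}x")
    # Assumed mix: one instruction in five is a JMP or CALL.
    print(f"20% branches:      {blended_speedup(0.2):.2f}x")
```

The blended figure shows why branch-heavy code sees less than the headline 4× gain: the more often the pipe has to be refilled, the closer the overall speedup drifts toward 8/3.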

Another example is IREAD, one of the ten new instructions added by Scenix (see Table 1). IREAD enables a program to read the instruction memory, something that’s nontrivial in a Harvard design (separate program and data memory).

Given the complication involved, IREAD requires the same number of clocks (four) in both compatibility and turbo mode. Even so, it’s faster than previous data-lookup schemes and can access the entire code space.

Pipelined machines are also subject to various hazards that must be obviated by hardware, software, or both (e.g., the problem of trying to read data at the same time it’s being written).

Consider a sequence of instructions involving a back-to-back write followed by a read of the same data. Instruction n is writing data (write stage) even as instruction n + 1 (execute stage) wants to read it.

With on-chip RAM, the SX includes forwarding logic that handles such an obstacle transparently in hardware. Thus, one instruction can write to RAM and the next one can safely read from the same location. However, for I/O ports, there are precautions concerning successive operations.
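To make the RAM hazard concrete, here is a toy Python model of the execute and write stages. It is a sketch of the general forwarding idea under simplifying assumptions (one instruction per cycle, a write that lands in RAM one cycle after it executes), not a description of the actual SX logic; the `run` function and the tuple-based program format are invented for illustration.

```python
def run(program, forwarding):
    """Return the values seen by each read in a toy execute/write pipeline."""
    ram = {}
    in_write_stage = None   # (addr, value) of the previous instruction, not yet in RAM
    seen = []
    for op in program:
        # Execute stage of the current instruction.
        next_write = None
        if op[0] == "read":
            addr = op[1]
            if forwarding and in_write_stage is not None and in_write_stage[0] == addr:
                seen.append(in_write_stage[1])   # bypass RAM, use the in-flight value
            else:
                seen.append(ram.get(addr, 0))    # RAM still holds the old value
        else:  # ("write", addr, value)
            next_write = (op[1], op[2])
        # End of cycle: the previous instruction's write stage commits to RAM.
        if in_write_stage is not None:
            ram[in_write_stage[0]] = in_write_stage[1]
        in_write_stage = next_write
    return seen


# Instruction n writes 5 to address 0x10; instructions n + 1 and n + 2 read it back.
PROGRAM = [("write", 0x10, 5), ("read", 0x10), ("read", 0x10)]

if __name__ == "__main__":
    print("without forwarding:", run(PROGRAM, forwarding=False))
    print("with forwarding:   ", run(PROGRAM, forwarding=True))
```

Without forwarding, the first read in this model returns the stale value because the write has not yet reached RAM. With forwarding, the in-flight value is handed straight to the reader, which is the transparent behavior the text describes for on-chip RAM.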

For pins configured as outputs, the SX reads the actual pin level, not the output latch. I think the SX approach is superior because it enables the detection of external problems such as a shorted or excessively loaded pin.

It’s easy to confirm that the output-pin level is or isn’t what it’s supposed to be. By contrast, reading the output latch, rather than the pin, leaves you blind to outside interference.

As a consequence, a write to a port may not propagate through to the pin in time to be recognized by an immediate read. Depending on the clock rate and pin loading, a non-port instruction should be inserted to split up a back-to-back port write and read. Similarly, the possible difference between output latch and pin level calls for care when using read-modify-write instructions like SETB and CLRB.
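The port precaution can be sketched with a small Python model. Everything here is an assumption made for illustration: the `Port` class, the `settle_ops` delay (how many instructions pass before the pin follows the latch), and the one-operation-per-call timing. Real settling time depends on clock rate and pin loading, as noted above.

```python
class Port:
    """Toy 8-bit output port whose reads return pin levels, not the latch."""

    def __init__(self, settle_ops=2):
        if settle_ops < 1:
            raise ValueError("settle_ops must be at least 1")
        self.latch = 0
        self.pins = 0
        self.settle_ops = settle_ops   # hypothetical propagation delay, in instructions
        self._wait = 0

    def _step(self):
        # One instruction elapses; the pins catch up once the delay runs out.
        if self._wait > 0:
            self._wait -= 1
            if self._wait == 0:
                self.pins = self.latch

    def _drive(self, value):
        self.latch = value & 0xFF
        self._wait = self.settle_ops

    def nop(self):
        self._step()

    def write(self, value):
        self._step()
        self._drive(value)

    def read(self):
        self._step()
        return self.pins

    def setb(self, bit):
        # Read-modify-write: the read half samples the pins.
        self._drive(self.read() | (1 << bit))


if __name__ == "__main__":
    hasty = Port()
    hasty.write(0x01)
    hasty.setb(1)                 # pins still show 0x00, so bit 0 is lost
    print(f"back-to-back: latch = {hasty.latch:#04x}")

    careful = Port()
    careful.write(0x01)
    careful.nop()                 # non-port instruction gives the pin time to settle
    careful.setb(1)
    print(f"with a NOP:   latch = {careful.latch:#04x}")
```

In this model the back-to-back sequence silently clears bit 0, because SETB reads the stale pin level and writes it back. A single intervening non-port instruction is enough here, though only because of the assumed two-instruction delay.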

The I/O pins themselves (4-bit Port A, 8-bit Port B, and, for 28-pin devices, 8-bit Port C) are versatile. Each pin is individually programmable as input or output, with or without an internal pull-up resistor. All inputs are selectable as TTL or CMOS levels, and Port B and Port C inputs can be individually defined as Schmitt triggered.

Outputs can sink and source 30 mA (subject to overall device power limit), with those on Port A featuring symmetrical drive (i.e., centered about VDD/2 under any load). This feature is useful for driving speakers and other pseudoanalog functions such as using a PWM to implement a DAC.
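As a rough illustration of the PWM-as-DAC idea, the average level of a PWM waveform is simply the duty cycle times the supply voltage. The Python sketch below assumes an ideal rail-to-rail output, an ideal low-pass filter, a 256-tick period, and a 5-V supply; none of those figures come from the SX documentation, and `pwm_average` is a made-up name.

```python
def pwm_average(duty_count, period=256, vdd=5.0):
    """Mean voltage of one PWM period: high for duty_count ticks, low for the rest."""
    if not 0 <= duty_count <= period:
        raise ValueError("duty_count must lie between 0 and period")
    samples = [vdd if tick < duty_count else 0.0 for tick in range(period)]
    return sum(samples) / period


if __name__ == "__main__":
    for duty in (0, 64, 128, 256):
        print(f"duty {duty:3d}/256 -> {pwm_average(duty):.2f} V average")
```

A 50% duty cycle lands at VDD/2, which is where symmetrical drive matters: if the high and low sides of the output were unequal, the filtered level would be skewed away from the ideal value.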

As inputs, pins of Port B can be individually enabled to act as wakeups (with programmable edge selection) from low-power sleep mode. Or, three pins of Port B can be configured as an analog comparator. Two inputs (RB1 and RB2) are compared, and the result (greater than or less than) is reflected on output RB3.

Besides general-purpose I/O, the SX includes an 8-bit timer/counter (RTCC) and watchdog timer, either of which (but not both at the same time) can be mated with an 8-bit prescaler.
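For a feel of the time scales involved, the overflow interval of an 8-bit timer fed through a prescaler is 256 counts times the divide ratio, divided by the clock rate. The Python sketch below assumes the RTCC counts once per clock at 50 MHz and that the prescaler is assigned to it; the specific divide ratios shown and the `rtcc_overflow_period` helper are illustrative assumptions, so check the data sheet for the ratios the part actually offers.

```python
def rtcc_overflow_period(clock_hz, prescale=1, timer_bits=8):
    """Seconds between overflows of a free-running timer behind a prescaler."""
    if clock_hz <= 0 or prescale < 1:
        raise ValueError("clock_hz must be positive and prescale at least 1")
    return (2 ** timer_bits) * prescale / clock_hz


if __name__ == "__main__":
    CLOCK_HZ = 50e6   # assumed: timer clocked at the 50-MHz rate quoted earlier
    for ratio in (1, 16, 256):   # hypothetical prescaler settings
        period_us = rtcc_overflow_period(CLOCK_HZ, ratio) * 1e6
        print(f"prescale 1:{ratio:<3d} -> overflow every {period_us:.2f} us")
```

Under these assumptions, the timer rolls over every few microseconds with no prescaling and stretches to a little over a millisecond at the largest ratio shown.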