Figure 4. These days, all hot chips employ SIMD techniques. Motorola’s AltiVec scheme
goes beyond the usual intra-element operations (e.g., vmsum instruction) and adds
inter-element operations (e.g., vsum instruction). The result: an inner loop that
requires 36 instructions and 18 cycles on a regular PowerPC is cut to two instructions
and two cycles.
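
To make the intra-element versus inter-element distinction concrete, here is a minimal scalar sketch in C of what the two instruction types compute. It is a model, not AltiVec code. It assumes the loop in question is a dot-product-style multiply-accumulate over 16-bit data, and that the variants meant are a signed-halfword multiply-sum with wrapping arithmetic and a signed-word sum-across with saturation. The names `model_vmsum`, `model_vsum`, and `dot_product` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Intra-element step (vmsum-style, assumed wrapping variant): multiply eight
   pairs of 16-bit elements lane by lane, then add each adjacent pair of
   products into one of four 32-bit accumulators. */
static void model_vmsum(const int16_t a[8], const int16_t b[8],
                        const int32_t acc_in[4], int32_t acc_out[4])
{
    for (int w = 0; w < 4; ++w) {
        int32_t p0 = (int32_t)a[2 * w] * b[2 * w];
        int32_t p1 = (int32_t)a[2 * w + 1] * b[2 * w + 1];
        /* Add in unsigned arithmetic so overflow wraps instead of being undefined. */
        acc_out[w] = (int32_t)((uint32_t)acc_in[w] + (uint32_t)p0 + (uint32_t)p1);
    }
}

/* Inter-element step (vsum-style, assumed saturating variant): collapse the
   four 32-bit lanes of one vector, plus a base value, into a single result. */
static int32_t model_vsum(const int32_t v[4], int32_t base)
{
    int64_t s = base;
    for (int w = 0; w < 4; ++w) {
        s += v[w];
    }
    if (s > INT32_MAX) s = INT32_MAX;
    if (s < INT32_MIN) s = INT32_MIN;
    return (int32_t)s;
}

/* Dot product of n 16-bit elements; elements past the last full group of
   eight are ignored in this sketch. */
static int32_t dot_product(const int16_t *x, const int16_t *y, size_t n)
{
    int32_t acc[4] = {0, 0, 0, 0};
    for (size_t i = 0; i + 8 <= n; i += 8) {
        model_vmsum(x + i, y + i, acc, acc);   /* one vector op per 8 elements */
    }
    return model_vsum(acc, 0);                 /* one final cross-lane sum */
}
```

In this model, each call to `model_vmsum` stands in for a single vector instruction that replaces eight scalar multiplies and their adds, and `model_vsum` stands in for the single cross-lane reduction at the end of the loop.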
