February 1998, Issue 91
Low-Cost Voice Recognition
TINY VOICE
My system, Tiny Voice, is based on a low-cost, 20-pin single-chip controller. It’s a speaker-dependent, template-based, isolated-word recognizer. You train it to recognize your voice.
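To make "template-based" concrete, here is a minimal C sketch of nearest-template matching. It is an illustration under stated assumptions, not the Tiny Voice firmware: the feature-vector length (`FEATURE_LEN`), the sum-of-absolute-differences distance, and the names `sad_distance` and `match_template` are all hypothetical choices made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define FEATURE_LEN 32   /* assumed feature-vector length, not from the article */

/* Sum of absolute differences between two feature vectors. */
static uint16_t sad_distance(const uint8_t *a, const uint8_t *b)
{
    uint16_t d = 0;      /* worst case 32 * 255 = 8160, fits in 16 bits */
    for (size_t i = 0; i < FEATURE_LEN; i++)
        d += (uint16_t)abs((int)a[i] - (int)b[i]);
    return d;
}

/* Return the index of the closest stored template, or -1 when no
   template is closer than reject_threshold (the input is rejected). */
int match_template(const uint8_t *input,
                   const uint8_t templates[][FEATURE_LEN],
                   size_t count, uint16_t reject_threshold)
{
    assert(input != NULL && templates != NULL);

    int best = -1;
    uint16_t best_d = reject_threshold;
    for (size_t t = 0; t < count; t++) {
        uint16_t d = sad_distance(input, templates[t]);
        if (d < best_d) {
            best_d = d;
            best = (int)t;
        }
    }
    return best;
}
```

Training, in this picture, amounts to storing one feature vector per command. The reject threshold keeps an unrelated sound from being forced onto the nearest stored word.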
Up to 16 voice patterns are stored in a nonvolatile 512-byte serial EEPROM. Five push buttons enable programming and operation, and seven LEDs give status.
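The storage figures imply a tight per-pattern budget. The C sketch below works it out, assuming the 512 bytes are divided evenly among the 16 patterns with nothing reserved for headers or checksums. That even split, and the helper name `template_addr`, are assumptions for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Figures from the text: 512-byte serial memory, up to 16 patterns. */
#define EEPROM_BYTES   512u
#define MAX_TEMPLATES  16u

/* Assumption: an even split with no header or checksum area. */
#define TEMPLATE_BYTES (EEPROM_BYTES / MAX_TEMPLATES)   /* 32 bytes per slot */

/* Byte address of the first byte of a template slot (hypothetical helper). */
uint16_t template_addr(uint8_t slot)
{
    assert(slot < MAX_TEMPLATES);
    return (uint16_t)(slot * TEMPLATE_BYTES);
}
```

Under that assumption each pattern gets 32 bytes. Slot 15 starts at byte 480, so the last pattern ends at byte 511.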
For embedded systems, Tiny Voice can be controlled over a parallel or serial protocol from a host microcontroller, or it can run stand-alone. The source code may be modified to fit your requirements.
At under $5, Tiny Voice won’t do dictation. But it’s good for applications like toys, repertory phone dialers, voice-activated padlocks, security systems, remote controls, and other low-cost consumer products.
A voice command can be one or several words, with a total length between 0.2 s and 1.6 s. Response time is typically <100 ms. With a carefully selected vocabulary and context, over 95% recognition accuracy is possible.
The heart of the system is Motorola’s 8-bit 68HC705J1A processor. I chose this part over comparable ones from Zilog or Microchip for several reasons.
There’s sufficient RAM (64 bytes) to buffer the input waveforms and hold template structures, and its 1240 bytes of ROM provide enough program storage. Interrupts are also supported, including interrupts triggered by changes on the I/O lines.
This system is inexpensive (<$2) in high volume. The development kit is cheap, too, at $99.
Shown in Photo 1, the Tiny Voice system was built on a 3″ × 3″ breadboard and is powered off a 9-V battery. Standby current consumption is ~2 mA, which is primarily due to the op-amp and electret microphone bias.
Photo 1: My prototype was built on a 3″ × 3″ breadboard and is powered off a 9-V battery. The only ICs are the 68HC705J1A processor, an LM358 dual operational amplifier, the 4096-bit FM24C04 FRAM serial memory, and a 78L05 5-V regulator.
With some added power management, standby current could be reduced to a few microamps. Operating current while sampling and analyzing speech is ~10 mA.
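These currents translate directly into battery life. The C sketch below is a back-of-the-envelope estimate, not a measurement: the helper names are hypothetical, and the 500-mAh capacity used in the note after it is an assumed figure (real 9-V battery capacity varies with chemistry and load). Regulator quiescent current is ignored.

```c
#include <assert.h>

/* Hours of operation for a battery of capacity_mah drained at current_ma. */
double battery_hours(double capacity_mah, double current_ma)
{
    assert(capacity_mah >= 0.0 && current_ma > 0.0);
    return capacity_mah / current_ma;
}

/* Average current when the recognizer is active for a fraction
   `duty` (0.0 to 1.0) of the time and in standby otherwise. */
double average_current_ma(double standby_ma, double active_ma, double duty)
{
    assert(duty >= 0.0 && duty <= 1.0);
    return standby_ma + (active_ma - standby_ma) * duty;
}
```

With the assumed 500-mAh battery, `battery_hours(500.0, 2.0)` gives 250 hours at the ~2-mA standby current, which is why the added power management matters for a battery-powered product.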