February 1998, Issue 91
Low-Cost Voice Recognition
TINY HARDWARE
Figure 2 shows a schematic of the system. An electret condenser microphone is biased to 5 V via R4. The signal is then amplified by U2a.
Figure 2: An electret condenser microphone (not shown) is biased to 5 V via R4. The signal is then amplified by U2a. C2 and R6 (along with C3 and R10) form a high-pass filter. The output is fed to the second op-amp, which is configured as a comparator whose output is connected to PB4 of the 68HC705J1. The EEPROM has a two-wire I2C interface, which is connected to PB1 and PB0. The remaining pins of the processor are connected to LEDs and push buttons.
C2 and R6 (along with C3 and R10) form a high-pass filter with a cut-off frequency of 1600 Hz and an added zero at 800 Hz. This setup provides a pre-emphasis function.
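As a rough illustration of what pre-emphasis does, the sketch below evaluates a first-order shelving response with a zero at 800 Hz and a pole at 1600 Hz. This transfer function, H(s) = (s + wz) / (s + wp), is an assumed simplification of the text's description and is not derived from the actual component values in Figure 2. The function names are hypothetical.

```python
import math

F_ZERO_HZ = 800.0    # zero frequency quoted in the text
F_POLE_HZ = 1600.0   # cut-off (pole) frequency quoted in the text


def shelf_gain(freq_hz: float) -> float:
    """Magnitude of the assumed model H(s) = (s + wz) / (s + wp).

    With s = j*2*pi*f the 2*pi factors cancel, leaving
    |H| = sqrt(f^2 + fz^2) / sqrt(f^2 + fp^2).
    """
    return math.hypot(freq_hz, F_ZERO_HZ) / math.hypot(freq_hz, F_POLE_HZ)


def gain_db(freq_hz: float) -> float:
    """Gain in decibels relative to the high-frequency plateau (0 dB)."""
    return 20.0 * math.log10(shelf_gain(freq_hz))


if __name__ == "__main__":
    for f in (100, 400, 800, 1600, 3200):
        print(f"{f:5d} Hz  {gain_db(f):6.2f} dB")
```

Under this assumption, low frequencies sit about 6 dB below the high-frequency plateau, which is the kind of tilt a pre-emphasis stage is meant to give speech before it reaches the comparator.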
C1 serves as a mild antialiasing low-pass filter. The output is fed to the second op-amp, which is configured as a comparator with some hysteresis. R8 sets the threshold of the comparator.
The comparator’s output is a square wave that’s applied to an input pin of the processor. The threshold defines the beginning and end of a speech utterance. With no signal present, the second op-amp’s output is at a DC level.
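The behavior of a comparator with hysteresis can be modeled in a few lines. The following is a minimal sketch on sampled data, not a model of the actual op-amp stage. The threshold and hysteresis values, the test signals, and the function names are all assumptions chosen for illustration.

```python
import math


def comparator(samples, threshold, hysteresis):
    """Return a list of 0/1 outputs for a comparator with hysteresis.

    The output goes high when the input rises above
    threshold + hysteresis/2 and returns low only when the input
    falls below threshold - hysteresis/2.
    """
    upper = threshold + hysteresis / 2.0
    lower = threshold - hysteresis / 2.0
    state = 0
    out = []
    for x in samples:
        if state == 0 and x > upper:
            state = 1
        elif state == 1 and x < lower:
            state = 0
        out.append(state)
    return out


def count_edges(bits):
    """Count 0->1 and 1->0 transitions in the comparator output."""
    return sum(1 for a, b in zip(bits, bits[1:]) if a != b)


def sine(amplitude, periods, samples_per_period=20):
    """Hypothetical test signal: a sampled sine wave centered on zero."""
    n = periods * samples_per_period
    return [amplitude * math.sin(2 * math.pi * k / samples_per_period)
            for k in range(n)]


if __name__ == "__main__":
    quiet = sine(0.05, 10)   # below the threshold: output stays constant
    loud = sine(1.0, 10)     # above the threshold: output toggles
    print(count_edges(comparator(quiet, 0.2, 0.1)))
    print(count_edges(comparator(loud, 0.2, 0.1)))
```

A signal that never crosses the upper threshold leaves the output at a constant level, while a louder signal produces a square wave whose edges the processor can time. The hysteresis band keeps small noise near the threshold from producing spurious edges.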
Voice pattern data is stored in a nonvolatile EEPROM. For this project, I selected Ramtron’s FM24C04, which uses ferroelectric cells.
It has several advantages over a more generic part. For one thing, the FRAM part can be written to over 10 billion times, compared to about 10k cycles with a generic EEPROM. This feature is important here because the first 128 bytes are used for scratch-pad memory and are constantly written to.
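To put the two endurance figures in perspective, here is a back-of-the-envelope sketch. The endurance numbers come from the text. The write rate of 1000 scratch-pad rewrites per day is purely a hypothetical figure for illustration, as is the function name.

```python
GENERIC_EEPROM_CYCLES = 10_000        # "about 10k cycles" from the text
FRAM_CYCLES = 10_000_000_000          # "over 10 billion" from the text

ASSUMED_WRITES_PER_DAY = 1000         # hypothetical usage, not from the article


def days_until_worn(endurance_cycles: int, writes_per_day: float) -> float:
    """Days until a cell rewritten writes_per_day times reaches its rated endurance."""
    return endurance_cycles / writes_per_day


if __name__ == "__main__":
    for name, cycles in (("generic EEPROM", GENERIC_EEPROM_CYCLES),
                         ("FRAM", FRAM_CYCLES)):
        days = days_until_worn(cycles, ASSUMED_WRITES_PER_DAY)
        print(f"{name}: {days:,.0f} days ({days / 365:,.1f} years)")
```

At that assumed rate, a generic part would reach its rated limit within days, while the FRAM's rating would outlast any practical product lifetime.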
Also, it has a deep write buffer. So, once the starting address is specified, the memory address is autoincremented and additional writes can be performed with no further intervention. As a result, writing to the device is very fast.
Generic parts, however, require you to set up the address every other byte before you write data. This task creates additional time overhead that may cause a bottleneck in the software flow, a major concern in a real-time system.
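The cost of repeatedly setting up the address can be estimated by counting bits on the I2C bus. The sketch below is a simplified model: it assumes each write transaction carries one slave-address byte and one word-address byte ahead of the data, counts nine clock periods per byte (eight data bits plus the acknowledge), and ignores start and stop conditions as well as any internal write-cycle delay a generic EEPROM may impose. The function name and the 24-byte example length are illustrative.

```python
import math

CLOCKS_PER_BYTE = 9              # 8 data bits plus the acknowledge bit
OVERHEAD_BYTES_PER_WRITE = 2     # slave address + word address (assumed framing)


def write_bus_clocks(n_data: int, bytes_per_transaction: int) -> int:
    """Bus clocks needed to write n_data bytes.

    bytes_per_transaction is how many data bytes can follow one
    address setup before the address must be sent again.
    """
    transactions = math.ceil(n_data / bytes_per_transaction)
    total_bytes = OVERHEAD_BYTES_PER_WRITE * transactions + n_data
    return total_bytes * CLOCKS_PER_BYTE


if __name__ == "__main__":
    n = 24  # e.g., one 24-byte feature vector
    print("single sequential write:", write_bus_clocks(n, n))
    print("address every 2 bytes:  ", write_bus_clocks(n, 2))
    print("address every byte:     ", write_bus_clocks(n, 1))
```

Even in this simplified model, re-sending the address every byte or two roughly doubles or triples the bus traffic compared with a single autoincrementing write.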
The FM24C04 has a low standby current of 25 µA as well as a low operating current of 100 µA. So, it’s well suited for battery operation.
The EEPROM’s first 128 bytes hold the transformed input utterance to be recognized or trained. Locations 128–511 store the feature vectors of previously trained utterances. Each vector occupies 24 bytes, so the remaining 384 bytes hold a maximum of 16 templates.
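The memory layout described here reduces to simple address arithmetic. The following sketch assumes the full 512 bytes of the 4-Kbit device are available and that templates are packed back to back after the scratch pad. The constant and function names are hypothetical.

```python
EEPROM_SIZE = 512      # 4 Kbit device = 512 bytes
SCRATCH_SIZE = 128     # first 128 bytes: input utterance scratch pad
TEMPLATE_SIZE = 24     # bytes per stored feature vector

# Number of whole templates that fit after the scratch pad.
MAX_TEMPLATES = (EEPROM_SIZE - SCRATCH_SIZE) // TEMPLATE_SIZE


def template_address(index: int) -> int:
    """Start address of template number index (0-based)."""
    if not 0 <= index < MAX_TEMPLATES:
        raise IndexError(f"template index {index} out of range")
    return SCRATCH_SIZE + index * TEMPLATE_SIZE


if __name__ == "__main__":
    print("templates:", MAX_TEMPLATES)
    for i in (0, 1, MAX_TEMPLATES - 1):
        start = template_address(i)
        print(f"template {i}: bytes {start}..{start + TEMPLATE_SIZE - 1}")
```

With these figures the templates fill the device exactly: the last one ends at the final byte of the memory.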
The rest of the circuit comprises a 5-V regulator, switches, and LEDs.