February 1998, Issue 91
Low-Cost Voice Recognition
TINY VOICE USER INTERFACE
Before discussing the voice-recognition software, I want to
describe the interface and how the system works from
the user’s point of view.
Seven LEDs and four switches make up the Tiny Voice user interface.
LEDs D2, D3, D4, and D5 form a four-bit binary number
that gives Tiny Voice’s status: either the index of a
voice command or an error code.
When power is connected or the Reset switch is pressed,
the system enters Stop mode. Pressing one of the push buttons
then activates the system and performs that button’s function.
Pressing Select displays a binary number from 0 to 15 on the four
LEDs. This number selects the template to be trained
or untrained. Each time Select is pressed, the number
increments by one, wrapping from 15 back to 0.
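The Select behavior can be sketched in a few lines of C. This is only an illustration: the function names are hypothetical, and the article does not say which of D2 through D5 carries which bit, so the bit order here is an assumption.

```c
#include <stdint.h>

#define NUM_TEMPLATES 16u  /* template indices 0..15, per the article */

/* Advance the selected template index, wrapping from 15 back to 0. */
uint8_t next_template(uint8_t current)
{
    return (uint8_t)((current + 1u) % NUM_TEMPLATES);
}

/* Return the state (0 or 1) of one LED for a four-bit status value.
 * Assumption: bit 0 drives the least significant LED; the real
 * assignment of bits to D2..D5 is not given in the article. */
uint8_t led_bit(uint8_t value, uint8_t bit)
{
    return (uint8_t)((value >> bit) & 1u);
}
```

Calling next_template() once per Select press and led_bit() once per LED would reproduce the counting display described above.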
Pressing Train starts the Training mode. The On LED is activated,
and the user is prompted to say the command to be trained.
While the system is sampling, the Sampling LED is lit during
periods of speech and off during periods of silence.
If the training is successful, the template is stored
in EEPROM at the selected template location and the
system enters the Stop mode.
Untrain modifies the data in the stored template so the pattern-matching
algorithm skips over this template and does not consider
it as a possible candidate.
This is useful for context switching of vocabularies. For
example, out of the 16 templates, you may need
to scan for only two words (e.g., "yes" and "no"),
while ignoring the remaining 14.
To enable a template that was previously untrained, press
the Train button and then press another button (e.g.,
Select) before speaking.
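As a rough sketch of how an untrained template can be skipped during matching, consider the C fragment below. Everything in it is assumed for illustration: the article says Untrain modifies the stored template data, whereas this sketch uses a separate enabled flag, and the template length, the sum-of-absolute-differences distance, and the threshold are all hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

#define NUM_TEMPLATES 16
#define TEMPLATE_LEN  8     /* hypothetical number of features per template */
#define NO_MATCH      (-1)

typedef struct {
    uint8_t enabled;                 /* 0 = untrained, skipped by the matcher */
    uint8_t features[TEMPLATE_LEN];  /* hypothetical feature data */
} voice_template_t;

/* Sum of absolute differences between two feature vectors
 * (a stand-in metric, not the article's algorithm). */
static unsigned distance(const uint8_t *a, const uint8_t *b)
{
    unsigned d = 0;
    for (int i = 0; i < TEMPLATE_LEN; i++)
        d += (unsigned)abs((int)a[i] - (int)b[i]);
    return d;
}

/* Return the index of the closest enabled template whose distance is
 * within the threshold, or NO_MATCH if there is none. */
int best_match(const voice_template_t *t, const uint8_t *input,
               unsigned threshold)
{
    int best = NO_MATCH;
    unsigned best_d = 0;
    for (int i = 0; i < NUM_TEMPLATES; i++) {
        if (!t[i].enabled)
            continue;                /* untrained: never a candidate */
        unsigned d = distance(t[i].features, input);
        if (d <= threshold && (best == NO_MATCH || d < best_d)) {
            best_d = d;
            best = i;
        }
    }
    return best;
}
```

With only the "yes" and "no" templates enabled, the loop compares the input against two candidates and ignores the other 14, which is the context-switching effect described above.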
In Recognition mode, the speech is sampled and analyzed.
The On LED is activated, and the user is prompted to
say a previously trained command. As before, the Sampling
LED is lit during speech and off during periods of silence.
The input is compared to the templates in memory and a decision
is made. If recognition is successful, the result is displayed
on the four LEDs in binary.
When Reset is pressed, Stop mode is entered and the system
is ready to accept a push-button command. Previously
trained commands are not erased.
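The mode changes described so far can be summarized as a small state machine. The C sketch below is an illustration only: the type and event names are hypothetical, and it assumes that a successful recognition returns the system to Stop mode, which the article states explicitly only for training and for errors.

```c
/* Operating modes named in the article. */
typedef enum { MODE_STOP, MODE_TRAIN, MODE_RECOGNIZE } tv_mode_t;

/* Hypothetical events: button presses plus completion and error signals. */
typedef enum { EV_RESET, EV_TRAIN, EV_RECOGNIZE, EV_DONE, EV_ERROR } tv_event_t;

/* Return the mode that follows the given event. */
tv_mode_t next_mode(tv_mode_t mode, tv_event_t ev)
{
    if (ev == EV_RESET)
        return MODE_STOP;            /* Reset always returns to Stop */

    switch (mode) {
    case MODE_STOP:
        if (ev == EV_TRAIN)
            return MODE_TRAIN;
        if (ev == EV_RECOGNIZE)
            return MODE_RECOGNIZE;
        return MODE_STOP;
    case MODE_TRAIN:
    case MODE_RECOGNIZE:
        /* Assumption: both success and error end the current attempt. */
        if (ev == EV_DONE || ev == EV_ERROR)
            return MODE_STOP;
        return mode;
    }
    return MODE_STOP;
}
```

Stored templates are not part of this state, which matches the note that Reset does not erase previously trained commands.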
When an error occurs, the Error LED (D1) is lit and the error
code is displayed in binary using the same four LEDs
that display the template index number. After ~2 s,
the LEDs go off and the system enters Stop mode.
The error codes (Time Out, Buffer Full, and Not Recognized) are
defined in the header file.
After Train or Recognize is pressed, the system waits for
valid speech input. If no input occurs within ~6.5 s,
the system enters Stop mode and the Time Out
error code is displayed.
On the other hand, if the utterance is longer
than 1.6 s, the system enters the Stop mode and the
Buffer Full error is displayed.
The Not Recognized error code is displayed if the input
utterance doesn’t match a stored template. The system
then enters Stop mode and waits for new input.
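The three error conditions can be expressed as a single check, sketched below in C. The numeric code values, the names, and the function signature are assumptions made for illustration; the actual codes are defined in the project's header file and are not reproduced in this section. Only the 6.5-s and 1.6-s limits come from the text, and both are approximate.

```c
#include <stdint.h>

/* Hypothetical status codes; the real values live in the header file. */
typedef enum {
    TV_OK = 0,
    TV_ERR_TIME_OUT = 1,
    TV_ERR_BUFFER_FULL = 2,
    TV_ERR_NOT_RECOGNIZED = 3
} tv_status_t;

#define WAIT_LIMIT_MS      6500u  /* ~6.5 s wait for speech to begin */
#define UTTERANCE_LIMIT_MS 1600u  /* 1.6 s maximum utterance length */

/* Classify one recognition attempt.
 * wait_ms:     time spent waiting before speech started
 * speech_ms:   length of the captured utterance
 * match_index: matched template index, or a negative value for no match */
tv_status_t classify_input(uint32_t wait_ms, uint32_t speech_ms,
                           int match_index)
{
    if (wait_ms >= WAIT_LIMIT_MS)
        return TV_ERR_TIME_OUT;        /* no speech arrived in time */
    if (speech_ms > UTTERANCE_LIMIT_MS)
        return TV_ERR_BUFFER_FULL;     /* utterance too long to store */
    if (match_index < 0)
        return TV_ERR_NOT_RECOGNIZED;  /* no stored template matched */
    return TV_OK;
}
```

In Training mode only the first two checks would apply, since Not Recognized is meaningful only when the input is compared against stored templates.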