September 1999, Issue 110
Taking Orders: A Speech Recognition Module

SPEAKER DEPENDENCE

Speech recognition is classified into two processing categories:
speaker dependent and speaker independent. Speaker-dependent
speech-recognition systems are trained by the person
who will be using the system. These systems can handle
a large number of commands and achieve better than 99%
accuracy for word recognition.

One drawback, however, is that the system responds accurately
only to the individual who trained it. An important advantage,
on the other hand, is that the circuit may be trained
in any language.

Actually, language isn't even necessary. A series of grunts
and whistles (as long as they can be repeated accurately)
can be used in place of words. This is helpful to people
who, through accident or illness, have lost the ability
to verbalize words.
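
One way to see why the trained sounds need not be words is a toy template matcher. The sketch below is an illustration only, not a description of how the VoiceDirect module works internally. It assumes each utterance has already been reduced to a short, fixed-length list of numbers (a hypothetical feature vector), and the class name, the `reject_distance` threshold, and the sample values are all invented for this example.

```python
import math


class TemplateRecognizer:
    """Toy speaker-dependent recognizer: one stored template per command."""

    def __init__(self, reject_distance):
        # Hypothetical threshold: inputs farther than this from every
        # template are rejected rather than matched.
        self.reject_distance = reject_distance
        self.templates = {}

    def train(self, label, examples):
        # Average the user's repetitions of one sound into a single template.
        n = len(examples)
        self.templates[label] = [sum(col) / n for col in zip(*examples)]

    def recognize(self, features):
        # Return the label of the closest template, or None if no template
        # is within the rejection distance.
        best_label, best_dist = None, float("inf")
        for label, template in self.templates.items():
            dist = math.dist(features, template)
            if dist < best_dist:
                best_label, best_dist = label, dist
        return best_label if best_dist <= self.reject_distance else None


rec = TemplateRecognizer(reject_distance=1.0)
# The labels are arbitrary: a spoken phrase and a whistle are trained the same way.
rec.train("lights on", [[1.0, 0.0, 0.2], [0.8, 0.2, 0.0]])
rec.train("whistle", [[0.0, 1.0, 1.0], [0.0, 0.8, 1.2]])

print(rec.recognize([0.1, 0.9, 1.0]))  # close to the trained whistle
print(rec.recognize([5.0, 5.0, 5.0]))  # far from every template, so rejected
```

Because the matcher only compares an input against what one user recorded, any sound that user can repeat consistently works as a command, while a different speaker's version of the same word may land far from the stored template.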

The VoiceDirect module is speaker-dependent, which is the
most common approach employed in software for PCs. Sensory
also offers other chips for use in speaker-independent
modes.

A speaker-independent system is trained to respond to
a word regardless of the speaker. This system must respond
accurately to a large variety of speech patterns, inflections,
and enunciations of each command word.

The command-word count is typically much lower than in
speaker-dependent systems, but high accuracy can be
maintained when system demands are constrained by a
limited number of commands. Industrial applications
more often require speaker-independent voice recognition
systems, such as the systems used by AT&T and other
telephone companies.