Issue 133, August 2001
Listening Chips
Delving into voice recognition and chips that listen, Tom takes
a look at the current state of development. With pioneer
Sensory leading the way, he discovers there's potential
for designing unique applications.
Years back, in May of 1993, I wrote an article called "Talking
Chips" (Circuit Cellar 34) describing the then-emerging
digital voice recorder ICs. Besides offering a high-tech
replacement for the bulky, balky mechanical voice recorders
of yore, the innovation spawned entirely novel applications,
such as greeting cards that speak your own personally
recorded message.
As you might guess, this month I'm covering a chip that can listen.
There's definitely the potential for inspiring a
lot of exciting and unique applications, some more
obvious than others.
I think voice recognition technology has gotten a bad reputation because
it's stereotyped as a magic bullet supposedly designed to
put that inspired hack of the typewriter age, the crusty
but lovable QWERTY keyboard, out of its misery. Through
the brute force application of MIPS and megahertz, progress
has been made, but chips and software can't yet achieve
the accuracy and speed required for transcribing natural
gab.
On reflection, replacing keyboards may be one of those situations where
if it can be done, it will be done, and then you'll
see if it should have been done. As someone who types
a lot, I have a few observations.
First, when writing an article, typing is the least of my worries.
The real work is studying datasheets, fooling with boards,
trying experiments, and so forth. The hardest of all is
giving creative birth to the words I want to say, not
just typing them.
Even imagining a perfect voice recognition system for my PC, I'm
not convinced. Try this experiment. Think of a sentence
or phrase and then type it while saying it aloud. As someone
who can type at a decent rate, I can key in the words
at nearly a normal speaking cadence. Only by slurring
the words together in a blur does speaking demonstrate
more top-end throughput. The human brain demonstrates
its formidable skill by being able to parse such frenetic
blabber, but it drives automated recognition systems nuts.
Besides, have you ever given a long speech or talked vociferously at
a party for hours on end? It's tiring. I presume
it wouldn't be long before folks would get up in
arms over the other CTS, carpal tonsil syndrome.
Overlooked in the dubious quest to kill QWERTY is the fact that there
are less glamorous (but eminently practical) voice recognition
applications that do become feasible with incremental
advances in technology. Besides such likely candidates
as car phones, automated phone systems, and toys, I can
imagine a lot of handy (make that "no hands") products.
For example, when using a scope or logic analyzer, I invariably end
up needing to punch a switch or twist a dial even as both
hands are frozen probing the rat's nest. It would
be great if I could just say, for example, "External
Trigger Channel 2" instead of the more flowery phrases
I find myself using in that situation.