circuitcellar.com

Issue 133 August 2001
Listening Chips


by Tom Cantrell

SOFT SOUNDS

The ’364 strategy is to deliver a credible solution for sub-QWERTY applications at a practical price. Part of the equation is the new Voice Extreme IDE (see Photo 3), the long-awaited upgrade to the earlier DOS-based tools. It provides a modern and friendly environment for developing speech recognition applications.

Photo 3—Although lacking much in the way of debug capability at this time, the Voice Extreme IDE is a big improvement over the earlier generation DOS tools.

Another intriguing aspect of VE is the proprietary C-like programming language included in the kit.

The bad news is that the VE version of C departs from standard C in quite a few ways. The good news is that, as a practical matter, you won’t be porting gobs of existing code over to the ’364.

Remember, the chip needs to devote a lot of attention to the recognition task in order to deliver the best possible results. Also, although it seems like a lot, even the 2-MB external flash memory chip can easily get overrun with templates, weights, music, and recordings, not to mention your program.

So, do not try to port an RTOS, run a web server, or stuff megabytes of existing application code down the ’364’s throat. Variables are allocated statically, which means every call to a function shares the same storage for its locals, so feed VE C some recursive code and you’re in for quite a debugging session. Sure, the ’364 can handle some simple tasks on the side, but I suspect that it’s all too easy for the chip to bite off more than it can chew.
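To see why static allocation and recursion don't mix, here is a minimal sketch in standard C (not VE C, whose exact behavior I'm not claiming to reproduce). It assumes that "allocated statically" means each local gets one fixed storage slot, which the `static` keyword emulates here. The function names are made up for illustration.

```c
#include <stdint.h>

/* Recursive sum 1..n with locals forced into fixed storage, emulating
 * a compiler that allocates every variable statically (an assumption
 * about what the article describes, not verified VE C behavior). */
uint32_t sum_to_n_static(uint32_t n_arg)
{
    static uint32_t n;     /* one slot shared by every active call */
    static uint32_t rest;  /* likewise shared */

    n = n_arg;
    if (n == 0) {
        return 0;
    }
    rest = sum_to_n_static(n - 1);  /* inner calls overwrite n */
    return n + rest;                /* n no longer holds this call's value */
}

/* The safe rewrite: a loop needs no per-call storage at all. */
uint32_t sum_to_n_iterative(uint32_t n)
{
    uint32_t total = 0;
    while (n > 0) {
        total += n;
        n--;
    }
    return total;
}
```

With these definitions, `sum_to_n_static(3)` returns 0 instead of 6, because the innermost call leaves `n` at zero for everyone above it, while `sum_to_n_iterative(3)` returns 6. On a statically allocated target, turning recursion into a loop is usually the simplest way out.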

If you’re in a hurry (and who isn’t these days?) and aren’t building a zillion units, it’s probably better to use the ’364 module (see Photo 4) as a recognition coprocessor working in conjunction with another controller. This is especially true for those who are retrofitting voice recognition features onto an existing product (i.e., most first-time Sensory customers). Use the twin 8-bit parallel ports or software RS-232 to establish an unobtrusive link with the host and have at it.
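On the host side, the coprocessor arrangement boils down to reading recognition results off the link. The sketch below is standard C for the host controller and is purely illustrative: the four-byte frame (start marker, word index, confidence, additive checksum), the `0xA5` marker, and every name in it are hypothetical, not anything defined by Sensory. The real message format is whatever you program the ’364 to send.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FRAME_START 0xA5u  /* hypothetical start-of-frame marker */
#define FRAME_LEN   4u     /* start, word index, confidence, checksum */

typedef struct {
    uint8_t word_index;    /* which vocabulary word was recognized */
    uint8_t confidence;    /* hypothetical 0..255 match score */
} reco_result_t;

typedef struct {
    uint8_t buf[FRAME_LEN];
    size_t  count;         /* bytes collected for the current frame */
} reco_parser_t;

void reco_parser_init(reco_parser_t *p)
{
    assert(p != NULL);
    p->count = 0;
}

/* Feed one byte from the serial or parallel link. Returns true and
 * fills *out only when a complete frame with a good checksum arrives. */
bool reco_parser_feed(reco_parser_t *p, uint8_t byte, reco_result_t *out)
{
    assert(p != NULL && out != NULL);

    if (p->count == 0 && byte != FRAME_START) {
        return false;      /* still hunting for the start marker */
    }
    p->buf[p->count++] = byte;
    if (p->count < FRAME_LEN) {
        return false;      /* frame not complete yet */
    }
    p->count = 0;          /* ready for the next frame either way */

    uint8_t sum = (uint8_t)(p->buf[0] + p->buf[1] + p->buf[2]);
    if (sum != p->buf[3]) {
        return false;      /* corrupted frame, drop it */
    }
    out->word_index = p->buf[1];
    out->confidence = p->buf[2];
    return true;
}
```

The host calls `reco_parser_feed()` from its receive routine for each incoming byte and acts on the result when it returns true. Keeping the parser byte-at-a-time means the existing product's main loop never has to block waiting on the voice module.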

Photo 4—An alternative to starting from scratch, the Voice Extreme module combines the ’364 chip with flash memory and analog front-end components.

When it comes to writing actual voice recognition code with VE C, I think the VE features more than make up for weaknesses on the C side. The language has a full quiver of built-in voice processing routines and special functions that deal with the ’364 on-chip hardware. For example, in addition to the usual INTs and CHARs, VE C knows about data types like templates, weights, notes, tunes, and speech.

The ’364 makes hardware design a snap, as you can see in Figure 2. And, thanks to the VE C add-ons, recognition programs are easy to write. Debugging them, however, is a bit complicated for now. Although the ’364 has on-chip debug hardware (monitor, breakpoints, and such), the beta version of the software I received didn’t take advantage of it. That means reverting to the old days of inserting print statements. In a whimsical twist on the scheme, the VE C DEBUG statement has options that either spell (i.e., RS-232 output) or speak to you.
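For readers who haven't had to debug this way lately, the print-statement approach with a selectable output can be sketched in standard C as follows. This is not the VE C DEBUG statement, whose syntax isn't shown here. The sink functions, the log buffers standing in for the RS-232 and speech outputs, and all the names are hypothetical.

```c
#include <stdio.h>
#include <string.h>

typedef void (*trace_sink_fn)(const char *text);

/* Stand-ins for the two outputs; real code would transmit over
 * RS-232 or queue speech instead of appending to a buffer. */
char serial_log[128];
char speech_log[128];

void sink_serial(const char *text)
{
    strncat(serial_log, text, sizeof serial_log - strlen(serial_log) - 1);
}

void sink_speech(const char *text)
{
    strncat(speech_log, text, sizeof speech_log - strlen(speech_log) - 1);
}

/* The currently selected output; serial ("spell") by default. */
trace_sink_fn active_sink = sink_serial;

void trace_set_sink(trace_sink_fn sink)
{
    active_sink = sink;
}

/* Format "label=value;" and hand it to whichever sink is active. */
void trace_value(const char *label, int value)
{
    char line[48];
    snprintf(line, sizeof line, "%s=%d;", label, value);
    active_sink(line);
}
```

Sprinkle `trace_value("score", score)` at the points of interest, and call `trace_set_sink(sink_speech)` when there's no terminal hooked up. Routing everything through one function pointer keeps the trace calls themselves unchanged when you switch outputs.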

Figure 2—Thanks to the built-in microphone preamplifier, PWM speaker driver and direct memory connection, upgrading your hardware design with voice recognition is easy.