Feature Article



Issue #207 October 2007
Embedded Speech
Speech Synthesis for Small Applications

by Nicusor Birsan & Ionut Tarsa


THE LM3S811 SPEAKS

After all the pieces were written and tested on the desktop, it was time to move the application to the LM3S811. This was easy to do by taking the skeleton of play_phoneme, making a few minor modifications, and adding mcu_translate.c.

For demonstration purposes, a short piece of text from the Design Stellaris Contest announcement web page was copied into the main source file of the project as a constant string: “Circuit Cellar and Luminary Micro are pleased to offer design engineers an incredible contest opportunity called Design Stellaris 2006. And with Circuit Cellar magazine, they also have the #1 venue for peer recognition of their winning applications.”

After all the peripherals are initialized in the main routine, several cooperative tasks are called from within a while loop (see Listing 2).

Periodically, the text being spoken is displayed on an OLED using a pointer returned from a GetSourceGen() function. This feedback is made possible by storing, in the phoneme list, a pointer to the beginning of the word from which the current phoneme was translated. Because the software delays in the OLED library may cause blocking, the display functions are not called often.
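The cooperative loop and the infrequent display update might be arranged along the following lines. This is only a minimal sketch, not the authors' Listing 2: the tick counter g_ticks, the DISPLAY_PERIOD_TICKS value, and the task names are all assumptions made for illustration, and the task bodies are stubs that merely count their calls.

```c
#include <stdint.h>

/* Hypothetical tick counter, assumed to be advanced by a timer interrupt. */
static volatile uint32_t g_ticks;

/* Assumed display refresh period, in ticks (illustrative value). */
#define DISPLAY_PERIOD_TICKS 50u

static uint32_t last_display;   /* tick of the last display update */
static unsigned speak_calls;    /* call counters used by the stubs */
static unsigned display_calls;

/* Stub tasks: a real system would synthesize audio and drive the OLED here. */
static void speak_task(void)   { speak_calls++; }
static void display_task(void) { display_calls++; }

/* One pass of the main loop: the speak task runs on every pass, while the
   slow, possibly blocking display task runs only once per period. */
static void run_once(void)
{
    speak_task();

    /* Unsigned subtraction stays correct when the counter wraps around. */
    if ((uint32_t)(g_ticks - last_display) >= DISPLAY_PERIOD_TICKS) {
        last_display = g_ticks;
        display_task();
    }
}
```

In a real main routine, run_once() would simply be called from within the while loop after peripheral initialization.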

The DSP part is called from the speak task in order to synthesize all of the phonemes from the list. If the list is empty, the remaining text is translated and a new list of phonemes is obtained for passing to the low-level synthesizer (see Listing 3). This is done until all the phonemes from the current text are synthesized.
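The alternation between synthesizing and translating can be sketched as a small state machine. The code below is an illustration under stated assumptions, not the authors' Listing 3: translate_next_word() and synth_phoneme() are hypothetical stand-ins, and the "phonemes" are just the letters of each word so that the control flow can be followed without the real translator or DSP code.

```c
#include <stddef.h>

#define MAX_PHONEMES 16

static const char *p_text;            /* remaining text; NULL when idle */
static char phonemes[MAX_PHONEMES];   /* stand-in phoneme codes */
static size_t n_phonemes;             /* number of phonemes in the list */
static size_t next_phoneme;           /* index of the next one to play */
static size_t synthesized;            /* counter kept by the stub synthesizer */

/* Stub translator: the letters of the next word become the "phonemes".
   Returns a pointer to the text that has not been translated yet. */
static const char *translate_next_word(const char *text)
{
    n_phonemes = 0;
    next_phoneme = 0;
    while (*text == ' ')
        text++;
    while (*text != '\0' && *text != ' ') {
        if (n_phonemes < MAX_PHONEMES)
            phonemes[n_phonemes++] = *text;
        text++;
    }
    return text;
}

/* Stub for the low-level synthesizer (the DSP part). */
static void synth_phoneme(char code)
{
    (void)code;
    synthesized++;
}

/* One step of the speak task. Returns 1 while there is work left, 0 when
   all phonemes of the current text have been synthesized. */
static int speak_task(void)
{
    if (next_phoneme < n_phonemes) {          /* list not empty: play one */
        synth_phoneme(phonemes[next_phoneme++]);
        return 1;
    }
    if (p_text != NULL && *p_text != '\0') {  /* list empty: translate more */
        p_text = translate_next_word(p_text);
        return 1;
    }
    p_text = NULL;                            /* nothing left: go idle */
    return 0;
}
```

Doing one small unit of work per call keeps each pass short, which suits a cooperative loop in which other tasks must also get their turn.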

Playing another phrase or sentence from application tasks can be done asynchronously by calling the SpeakLumi() function, which has only one parameter: a pointer to the text to be spoken (see Listing 4).

If the synthesizer is busy (the p_cgen pointer is not NULL), the function returns a failure, so the calling task is informed that something else “is speaking” now. The task can then preserve its state and speak a little later.
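The busy check and a task that retries later could look roughly like this. It is a sketch rather than the authors' Listing 4: the int return convention (0 for accepted, -1 for busy), the alarm_task() function, and the pending_msg variable are assumptions. Only the names SpeakLumi() and p_cgen come from the article.

```c
#include <stddef.h>

/* Text currently being spoken; NULL when the synthesizer is idle. */
static const char *p_cgen;

/* Assumed return convention: 0 if the text was accepted, -1 if busy. */
static int SpeakLumi(const char *text)
{
    if (p_cgen != NULL)
        return -1;        /* something else "is speaking" now */
    p_cgen = text;        /* the speak task picks the text up from here */
    return 0;
}

/* Hypothetical application task: it keeps its message pending and tries
   again on a later pass of the main loop until the text is accepted. */
static const char *pending_msg;

static void alarm_task(void)
{
    if (pending_msg != NULL && SpeakLumi(pending_msg) == 0)
        pending_msg = NULL;   /* accepted: nothing left to say */
}
```

In this sketch the speak task is assumed to reset p_cgen to NULL once the last phoneme has been played, which is what lets a pending request through.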

The complete application fit into 7,856 bytes of SRAM and 57,436 bytes of flash memory (43,032 going to tables). If more memory space is required, the dictionary may be reduced to just the words needed in the application. Also, the WAV generator buffer could be smaller, but this would result in some overhead because the sound driver must be called more frequently to fill it.
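The buffer-size trade-off can be made concrete with a little arithmetic. The helper below is hypothetical, and the 8-kHz sample rate used in the example is an assumption, since the article does not state the rate here.

```c
#include <stdint.h>

/* Time, in microseconds, that a buffer of n_samples lasts when it is
   played back at sample_rate_hz. The sound driver must refill the
   buffer at least this often. */
static uint32_t refill_period_us(uint32_t n_samples, uint32_t sample_rate_hz)
{
    return (uint32_t)(((uint64_t)n_samples * 1000000u) / sample_rate_hz);
}
```

At an assumed 8 kHz, a 256-sample buffer lasts 32 ms, while a 64-sample buffer lasts only 8 ms, so the driver would have to run four times as often.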

Because it only repeats the same text stored in flash memory, our application may not appear very relevant, but it is enough to test the quality of the text-to-speech synthesizer. From now on, only your imagination is the limit. The Stellaris application can report anything with its “new voice” (alarms, port states, analog values, etc.).
