Feature Article



Issue #207 October 2007
Embedded Speech
Speech Synthesis for Small Applications

by Nicusor Birsan & Ionut Tarsa


SYNTHESIZER

The synthesizer takes a list of phonemes and speech variations as input and converts them into commands for the waveform generator. The commands are stored in a queue. Each queue entry holds pointers to formant and wave-sound data, as well as pitch and amplitude, variation, and a formant modifier.
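As an illustration of the queue described above, here is a minimal sketch of a fixed-length command queue in C. It is not the project's actual code: the type and function names (synth_cmd_t, cmd_queue_push, and so on), the field widths, and the capacity of 16 entries are all assumptions chosen for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capacity; a power of two keeps the index math wrap-safe. */
#define CMD_QUEUE_LEN 16u

/* One command for the waveform generator (field names are illustrative). */
typedef struct {
    const uint8_t *formants;   /* pointer to formant data */
    const uint8_t *wave;       /* pointer to wave-sound data, or NULL */
    uint16_t pitch;
    uint16_t amplitude;
    uint8_t  variation;
    uint8_t  formant_modifier;
} synth_cmd_t;

/* Fixed-length ring buffer: no heap allocation is involved. */
typedef struct {
    synth_cmd_t items[CMD_QUEUE_LEN];
    unsigned head;             /* count of commands read so far */
    unsigned tail;             /* count of commands written so far */
} cmd_queue_t;

static void cmd_queue_init(cmd_queue_t *q)
{
    q->head = 0;
    q->tail = 0;
}

static unsigned cmd_queue_count(const cmd_queue_t *q)
{
    return q->tail - q->head;  /* unsigned arithmetic handles wraparound */
}

/* Returns 1 on success, 0 if the queue is full. */
static int cmd_queue_push(cmd_queue_t *q, const synth_cmd_t *cmd)
{
    if (cmd_queue_count(q) == CMD_QUEUE_LEN)
        return 0;
    q->items[q->tail % CMD_QUEUE_LEN] = *cmd;
    q->tail++;
    return 1;
}

/* Returns 1 and fills *out on success, 0 if the queue is empty. */
static int cmd_queue_pop(cmd_queue_t *q, synth_cmd_t *out)
{
    if (q->head == q->tail)
        return 0;
    *out = q->items[q->head % CMD_QUEUE_LEN];
    q->head++;
    return 1;
}
```

In a design like this, the translator side would call cmd_queue_push and the waveform generator would call cmd_queue_pop, with the whole queue occupying a fixed, known amount of SRAM.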

For more natural speech, sound parameters are modified in short frames of about 3 to 8 ms, depending on the sample rate. Because of intonation and prosody rules, the formant parameters used during synthesis are not identical to those stored in the data files: each frame may be multiplied, copied several times with frequency modifications, and so on. One of the main reasons for moving the code from C++ to C was to avoid the complications and overhead of dynamic allocation in a small SRAM. So, instead of using heap memory to allocate space for queues or frames, we declared a fixed-length queue and wrote a simple frame allocator that returns pointers into a fixed-size buffer.
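The allocator idea can be sketched as follows. This is a hedged example rather than the project's implementation: the frame layout, the pool size of eight frames, the recycling policy (oldest slot reused first), and the names frame_alloc and frame_copy_scaled are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define N_FORMANTS     6u   /* hypothetical number of formants per frame */
#define FRAME_POOL_LEN 8u   /* hypothetical pool size */

/* Illustrative frame layout; the real data format may differ. */
typedef struct {
    int16_t freq[N_FORMANTS];    /* formant frequencies */
    uint8_t height[N_FORMANTS];  /* formant amplitudes */
    uint8_t width[N_FORMANTS];   /* formant bandwidths */
} frame_t;

/* Statically allocated pool: its SRAM cost is fixed at link time. */
static frame_t frame_pool[FRAME_POOL_LEN];
static unsigned frame_next;

/* Hand out the next slot in the pool, recycling the oldest slot on wrap.
   The pool must be large enough that a frame is no longer in use by the
   time its slot comes around again. */
static frame_t *frame_alloc(void)
{
    frame_t *f = &frame_pool[frame_next];
    frame_next = (frame_next + 1u) % FRAME_POOL_LEN;
    return f;
}

/* Copy a frame into the pool with its formant frequencies scaled by
   `percent` (100 leaves them unchanged). The source frame, typically
   read-only data, is left untouched. */
static frame_t *frame_copy_scaled(const frame_t *src, int percent)
{
    frame_t *f = frame_alloc();
    *f = *src;
    for (size_t i = 0; i < N_FORMANTS; i++)
        f->freq[i] = (int16_t)(((int32_t)f->freq[i] * percent) / 100);
    return f;
}
```

A caller would obtain modified copies with something like frame_copy_scaled(&stored_frame, 110) and never call free. The trade-off is that the pool size has to be chosen up front for the worst case.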

After all the source files were tested in Windows, we quickly developed a short application (play_phoneme) in Keil µVision to synthesize phonemes on the LM3S811. The input to the synthesizer is a list of phonemes generated by espeakdev and stored in a source file named play_list.c. Although this may look like child's play, it accomplished an important task: it tested the DSP part of the synthesizer.

 
