Feature Article



Issue #202 May 2007
The Witness Camera
Build a Self-Recording Surveillance Camera
Grand Prize Atmel AVR Design Contest 2006
by Alberto Ricci Bitti


SPEECH PREPARATION
The camera’s speech synthesizer plays a set of speech files to build its messages. You can record the files with your own voice if you want, but I was reluctant to do that, so I used computer-generated speech instead. Text-to-speech tools let you type in any text and listen to a warm, natural voice read it.

AT&T Research Laboratories provided me with a fantastic online demo capable of producing a set of WAV files of my words. To reduce the high-frequency content, I selected a male voice (Mike). The camera’s dynamic range is 8 bits and its sample rate is 11,250 Hz, so I used a sound-editing program (Cool Edit Pro) for downsampling. To compensate for the reduced bit resolution, I compressed the voice waveforms using the dynamic compressor tool. I then normalized the amplitudes and saved each speech segment in a separate file as 8-bit “unsigned” samples (i.e., raw numbers where silence corresponds to a value of 128).
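As a rough illustration of the target file format only, the C sketch below peak-normalizes a buffer of 16-bit signed samples and maps it to 8-bit unsigned samples with silence at 128. It is not the author's workflow, which used Cool Edit Pro, and it does not perform the downsampling or dynamic compression steps. The function names `peak_abs16` and `pcm16_to_u8_normalized` are hypothetical, and the 16-bit signed input format is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Largest absolute sample value in the buffer (0 for pure silence). */
static int32_t peak_abs16(const int16_t *src, size_t n)
{
    int32_t peak = 0;
    for (size_t i = 0; i < n; i++) {
        int32_t a = src[i];          /* widen first: -(-32768) overflows int16_t */
        if (a < 0)
            a = -a;
        if (a > peak)
            peak = a;
    }
    return peak;
}

/* Scale so the loudest sample reaches full 8-bit swing, then shift to
   unsigned form so that silence is stored as 128. */
static void pcm16_to_u8_normalized(const int16_t *src, uint8_t *dst, size_t n)
{
    assert(src != NULL && dst != NULL);
    int32_t peak = peak_abs16(src, n);
    for (size_t i = 0; i < n; i++) {
        int32_t v = 0;
        if (peak > 0)
            v = ((int32_t)src[i] * 127) / peak;   /* -127..127 */
        dst[i] = (uint8_t)(v + 128);              /* 1..255, silence = 128 */
    }
}
```

Scaling to ±127 rather than ±128 keeps the positive and negative peaks symmetric around the silence level, at the cost of never using the value 0.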

At run time, the camera uses file names to select which files to play. It expects to find them under a common folder named “speech.” The sound playback technique is brutally tricky. The samples are banged straight to the PWM without buffering, with the only timing coming from a busy-wait loop. Ironically, the unpredictable delays caused by disk access add a pleasant chorus-like effect, which makes the voice warmer and fuller.
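The playback scheme described above can be sketched in C roughly as follows. This is a minimal host-side illustration, not the camera's firmware: the callback types, `speech_path`, `play_unbuffered`, the `/` path separator, and the file name used in the example are all assumptions. On real hardware the three hooks would be a file read, a write to the PWM duty register, and a calibrated busy-wait of about 89 µs (one sample period at 11,250 Hz).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define SILENCE_LEVEL 128u   /* midpoint of the 8-bit unsigned samples */

/* Hypothetical hooks standing in for the camera's file and PWM access. */
typedef int  (*read_byte_fn)(void *ctx);     /* returns 0..255, or -1 at end */
typedef void (*pwm_write_fn)(uint8_t duty);
typedef void (*wait_fn)(void);               /* busy-wait for one sample period */

/* Build "speech/<name>" in buf. Returns 0 on success, -1 if it does not fit. */
static int speech_path(char *buf, size_t size, const char *name)
{
    assert(buf != NULL && name != NULL);
    int n = snprintf(buf, size, "speech/%s", name);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Send each sample to the PWM as soon as it is read, with no buffering.
   Any delay inside next() (e.g., disk access) stretches that sample. */
static size_t play_unbuffered(read_byte_fn next, void *ctx,
                              pwm_write_fn pwm_write, wait_fn wait_sample)
{
    size_t played = 0;
    int s;
    while ((s = next(ctx)) >= 0) {
        pwm_write((uint8_t)s);
        wait_sample();
        played++;
    }
    pwm_write(SILENCE_LEVEL);   /* leave the output at the silence level */
    return played;
}

/* Host-side stand-ins so the sketch can run on a PC. */
struct mem_file { const uint8_t *data; size_t len, pos; };

static int mem_next(void *ctx)
{
    struct mem_file *f = ctx;
    return f->pos < f->len ? f->data[f->pos++] : -1;
}

static uint8_t last_duty;    /* last value "written" to the PWM */
static size_t  wait_calls;   /* number of sample periods waited */

static void fake_pwm(uint8_t duty) { last_duty = duty; }
static void fake_wait(void)        { wait_calls++; }
```

Because the wait is a fixed loop and the read time varies, the effective sample period is the wait plus whatever the read took, which is the source of the timing jitter mentioned above.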

The circuit amplifying the PWM signal is even more brutal. It uses just one transistor and relies on the speaker’s elastic properties to filter out the higher frequencies and move the cone back in the positive direction. These shortcuts limit the size of the speaker you can use. If it’s too small, the Nyquist products will be audible. If it’s too large, it won’t jump back in time for the next audio peak. I experimented with various sizes from my junk box, and I found that 75-mm (3") speakers work well, surprisingly well. It’s a clear sign that speech is like no other sound, because the brain is capable of reconstructing a voice from very little information.

 
