DIGITAL MEDIA BASICS
Digital media technology is constantly evolving. To
implement it, you need a good grasp of the terminology
used in the field. There are several useful
publications and web resources for broadening
your knowledge of digital media technology.
David Katz and Rick Gentile’s Embedded Media
Processing is one of my favorite resources.
In this section, I’ll briefly describe some
basic terms.
Sound is energy that propagates as a wave through
air or some other medium. Amplitude and frequency
are the two attributes that define it. Music
is an art form built on structured, audible
sound. That definition is subjective, but it’s
the most common one: what’s music to one person
can be noise to another.
An audio signal can come from a transducer (e.g.,
a microphone) that converts sound waves into
electrical signals, or it can be created (synthesized)
electronically. At the other end, an analog
audio signal driving a speaker or an earphone
is converted back into pressure changes (a sound
wave).
To process sound digitally, you need to convert
in both directions between analog and digital
signals. An analog-to-digital converter (ADC)
and a digital-to-analog converter (DAC) do this.
Devices combining one or more of each are called
codecs. Note that the word codec is used for
both these hardware devices and software algorithms
that encode and decode audio data.
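
As a rough illustration of the two conversions, the sketch below models an idealized ADC and DAC in software. The function names `adc` and `dac`, the signed 16-bit code range, and the 1.0 V full-scale value are assumptions made for this example. They do not describe any particular codec chip.

```python
def adc(voltage, bits=16, full_scale=1.0):
    """Quantize a voltage in [-full_scale, +full_scale] to a signed integer code."""
    max_code = 2 ** (bits - 1) - 1      # e.g. 32767 for 16 bits
    min_code = -(2 ** (bits - 1))       # e.g. -32768 for 16 bits
    code = round(voltage / full_scale * max_code)
    # Clip inputs that exceed the converter's range, as a real ADC would saturate.
    return max(min_code, min(max_code, code))


def dac(code, bits=16, full_scale=1.0):
    """Map a signed integer code back to an (idealized) output voltage."""
    max_code = 2 ** (bits - 1) - 1
    return code / max_code * full_scale


if __name__ == "__main__":
    sample = 0.3
    code = adc(sample)
    restored = dac(code)
    print("input:", sample, "code:", code, "output:", restored)
    print("quantization error:", abs(restored - sample))
```

The round trip is not exact: the restored value differs from the input by up to half a quantization step, which is the source of quantization noise.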
To represent a signal correctly in digital form,
refer to the Nyquist theorem: the sampling frequency
must be more than twice the highest frequency
in the signal. Humans can hear sounds between
roughly 20 and 20,000 Hz, so the sampling rate
has to exceed 40 kHz. The commonly used sampling
rate for CD-quality audio is 44.1 kHz. The amplitude
of sound is measured in decibels of sound pressure
level (dB SPL). The range for the human ear
is from 0 dB SPL (the threshold of hearing)
to about 120 dB SPL (the threshold of pain).
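
To see why the sampling rate matters, here is a minimal sketch of what happens to a tone that violates the Nyquist limit: it "folds" back and shows up at a lower, false frequency (an alias). The helper names `min_sample_rate` and `alias_frequency` are made up for this example.

```python
def min_sample_rate(highest_freq_hz):
    """Nyquist lower bound: the sample rate must exceed this value."""
    return 2 * highest_freq_hz


def alias_frequency(signal_hz, sample_rate_hz):
    """Apparent frequency of a sampled tone, folded into 0..sample_rate/2."""
    folded = signal_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


if __name__ == "__main__":
    print(min_sample_rate(20000))           # bound for the audible band
    print(alias_frequency(1000, 44100))     # below Nyquist: unchanged
    print(alias_frequency(30000, 44100))    # above Nyquist: folds down
```

A 1-kHz tone sampled at 44.1 kHz is captured as 1 kHz, but a 30-kHz tone sampled at the same rate would appear at 14.1 kHz. This is why converters normally filter out content above half the sampling rate before sampling.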
Dynamic range is the ratio of the maximum signal
level to the minimum signal level (or the noise
floor). For digital audio systems, the dynamic
range depends on precision (the number of bits
representing the signal), at roughly 6 dB per
bit. For 16-bit codecs, the dynamic range is
about 96 dB.
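
The 96-dB figure follows from the ratio between the largest value and the smallest step an N-bit word can represent, 2^N, expressed in decibels. The sketch below works through that arithmetic. The function name `dynamic_range_db` is an assumption for this example, and the result is a theoretical ceiling rather than what a real converter achieves.

```python
import math


def dynamic_range_db(bits):
    """Ideal dynamic range of an N-bit linear PCM word: 20 * log10(2**N)."""
    return 20 * math.log10(2 ** bits)


if __name__ == "__main__":
    for bits in (8, 16, 24):
        print(bits, "bits:", round(dynamic_range_db(bits), 2), "dB")
```

Each extra bit doubles the number of levels and so adds about 6.02 dB, which gives roughly 96 dB for 16 bits and 144 dB for 24 bits.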
There are standards for connecting physical devices
in digital audio systems. The most common are
I2S and AC97. The former, which I used for my
project, is a three-wire serial interface for
the digital transmission of audio signals. It
has a bit clock line, a data line, and a left/right
synchronization (word select) line. Intel created
the popular AC97 standard for PC audio.
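
To make the three I2S lines more concrete, the sketch below lists what the word-select and data lines would carry on each bit clock for one stereo frame. It assumes the usual I2S conventions: word select low for the left channel, two's complement samples, and the most significant bit first. It is a simplification, since it ignores the one-bit-clock delay between a word-select change and the first data bit. The function name `i2s_frame` is made up for this example, and none of it is taken from the project's actual code.

```python
def i2s_frame(left, right, bits=16):
    """Return a list of (ws, sd) pairs, one per bit clock, for one stereo frame.

    ws is the left/right (word select) line: 0 = left, 1 = right (assumed).
    sd is the serial data line, sent MSB first.
    """
    mask = (1 << bits) - 1
    frame = []
    for ws, sample in ((0, left), (1, right)):
        word = sample & mask                 # two's complement bit pattern
        for i in range(bits - 1, -1, -1):    # MSB first
            frame.append((ws, (word >> i) & 1))
    return frame


if __name__ == "__main__":
    for clock, (ws, sd) in enumerate(i2s_frame(1, -1, bits=8)):
        print("clock", clock, "ws", ws, "sd", sd)
```

With 16-bit samples, one stereo frame takes 32 bit clocks, so a 44.1-kHz stream would need a bit clock of about 1.41 MHz.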
The popular audio file formats are WAV (wave)
and MP3. The latter uses one of the most popular
lossy audio compression codecs, which can achieve
a compression ratio of up to 12:1. Lossy codecs
use a technique called perceptual encoding,
which takes advantage of your ears’ physiology
by discarding detail you are unlikely to hear.
Although data is lost, the decoded sound is
close to the original.
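
To put the 12:1 ratio in perspective, the sketch below computes the bit rate of uncompressed CD-quality audio and the rate left after compression. It assumes two-channel (stereo) 16-bit audio at 44.1 kHz, and the function names are made up for this example.

```python
def pcm_bit_rate(sample_rate_hz=44100, bits=16, channels=2):
    """Bit rate of uncompressed linear PCM audio, in bits per second."""
    return sample_rate_hz * bits * channels


def compressed_bit_rate(pcm_bps, ratio=12):
    """Bit rate after compressing by the given ratio (e.g. 12 for 12:1)."""
    return pcm_bps / ratio


if __name__ == "__main__":
    pcm = pcm_bit_rate()
    print("uncompressed:", pcm / 1000, "kbps")
    print("at 12:1:", compressed_bit_rate(pcm) / 1000, "kbps")
```

Uncompressed CD-quality stereo runs at 1,411.2 kbps, so a 12:1 ratio brings it down to about 117.6 kbps, close to the 128-kbps rate often used for MP3 files.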