[H-GEN] Re: converting from speex to ogg vorbis
Byron Ellacott
bje at apnic.net
Tue Sep 14 20:53:21 EDT 2004
Clinton Roy wrote:
> speexdec --stereo foo.spx foo.wav
> oggenc foo.wav
> Like the problem, the solution doesn't make any sense, enlightenment
> appreciated :)
Digital audio is a signal level sampled at a given frequency. When
people talk about 44kHz audio, that refers to the sampling frequency:
roughly 44,000 samples per second. CD audio is 44.1kHz, telephones
operate somewhere around 8kHz, FM radio is 22kHz or so IIRC.
On top of that, there's the size of each sample. Usually these days
that's 16 bits, though it can also be 8 bits or some weird arrangement.
When doing signal processing, you usually expand it up to 32 or 64
bits, or even more, to reduce error accumulation.
And finally, there's the number of audio channels being sampled. Mono
data is one channel. Stereo data is two channels, left and right. You
can get more complex than that, if you really want. Dolby 5.1 refers to
5.1 channels: front left, front right, center, rear left, rear right,
and bass. The bass (low-frequency effects) channel is the point one,
since it only carries a narrow band of low frequencies. I don't know
how Dolby audio is encoded, though.
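To put numbers on those three parameters, here is a minimal Python
sketch of the raw PCM data rate. The function name
pcm_bytes_per_second is made up for illustration, and it assumes the
sample size is a whole number of bytes.

```python
def pcm_bytes_per_second(sample_rate_hz, bits_per_sample, channels):
    """Raw PCM data rate: samples per second * bytes per sample * channels."""
    # Assumes bits_per_sample is a multiple of 8 (8, 16, 32, ...).
    return sample_rate_hz * (bits_per_sample // 8) * channels


# CD audio: 44.1kHz, 16 bit, stereo.
cd_rate = pcm_bytes_per_second(44100, 16, 2)

# A 22.05kHz, 16 bit, mono stream.
mono_rate = pcm_bytes_per_second(22050, 16, 1)

print(cd_rate)    # 176400
print(mono_rate)  # 44100
```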
In raw PCM audio, which is essentially the WAV file minus its header,
you have all the audio data in a time based stream. That means, for a
16 bit stereo file, you have 16 bits for the left channel, 16 bits for
the right channel, 16 left, 16 right, etc. In a 16 bit mono file, you
have 16 bits, 16 bits, 16 bits, ... In other words, exactly half the
data of a 16 bit stereo file of the same duration and sample rate.
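The layout can be sketched in Python with the standard struct module.
This is only an illustration: the helper names interleave_stereo and
pack_mono are made up, and it assumes signed little-endian 16 bit
samples, which is what 16 bit PCM WAV files use.

```python
import struct


def interleave_stereo(left, right):
    """Pack two equal-length sample lists as L, R, L, R, ... (16 bit each)."""
    if len(left) != len(right):
        raise ValueError("left and right must have the same number of samples")
    frames = []
    for l_sample, r_sample in zip(left, right):
        # "<hh" = two signed 16 bit integers, little-endian.
        frames.append(struct.pack("<hh", l_sample, r_sample))
    return b"".join(frames)


def pack_mono(samples):
    """Pack one channel of 16 bit samples back to back."""
    return struct.pack("<%dh" % len(samples), *samples)


stereo = interleave_stereo([1, 2, 3], [-1, -2, -3])
mono = pack_mono([1, 2, 3])

print(len(stereo), len(mono))  # 12 6
```

Three stereo frames take 12 bytes and three mono samples take 6, which
is the factor of two described above.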
One of the properties of audio is that if you decrease the period of a
signal (ie, bring peaks and troughs closer together, or increase the
decode/playback frequency) you raise the pitch.
So, if you play a 22kHz, 16 bit mono audio file as a 22kHz, 16 bit
stereo file, you are effectively playing the mono data at 44kHz, since
you're consuming the data at twice the expected rate. Doubling the
playback frequency raises the pitch by an octave and halves the
duration, which gives you a chipmunk sound.
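The arithmetic can be checked with a short Python sketch. The function
name playback_seconds and the ten second, 440Hz example are
assumptions chosen for illustration, not anything taken from the
original files.

```python
def playback_seconds(num_bytes, sample_rate_hz, bits_per_sample, channels):
    """How long a block of raw PCM lasts when interpreted with these settings."""
    frame_size = (bits_per_sample // 8) * channels  # bytes consumed per tick
    return num_bytes / frame_size / sample_rate_hz


# Ten seconds of 22.05kHz, 16 bit mono data (hypothetical example).
data_bytes = 22050 * 2 * 10

as_mono = playback_seconds(data_bytes, 22050, 16, 1)
as_stereo = playback_seconds(data_bytes, 22050, 16, 2)

# The same bytes run out twice as fast when treated as stereo.
speedup = as_mono / as_stereo

# A 440Hz tone in the mono data would be heard at this frequency.
heard_hz = 440 * speedup

print(as_mono, as_stereo, speedup, heard_hz)  # 10.0 5.0 2.0 880.0
```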
Ok?
--
bje