[H-GEN] Re: converting from speex to ogg vorbis

Paul Gearon pag at tucanatech.com
Tue Sep 14 21:14:23 EDT 2004


On 15/09/2004, at 10:53 AM, Byron Ellacott wrote:

> Digital audio is a signal level sampled at a given frequency.  When 
> people talk about 44KHz audio, that refers to the frequency: 44,000 
> samples per second.  CD audio is 44.1KHz, telephones operate somewhere 
> around 8KHz, FM radio is 22KHz or so IIRC.

Not sure why you'd include FM radio here, as it is not digital (at
least in this country).  I'm guessing that the 8KHz figure refers to
the digital telephone system.  It's not really useful to compare
analogue signal frequencies with digital sampling frequencies, though
there are a few similarities.

> On top of that, there's the size of each sample.  Usually these days 
> that's 16 bits, though it can also be 8 bits or some weird 
> arrangement.

DVD-Audio discs can use 20 or 24 bit samples IIRC.

>  When doing signal processing, you usually expand it up to 32 or 64 
> bits, or even more, to reduce error accumulation.
>
> And finally, there's the number of audio channels being sampled.  Mono 
> data is one channel.  Stereo data is two channels, left and right.  
> You can get more complex than that, if you really want.  Dolby 5.1 
> refers to 5.1 channels: front left, front right, rear left, rear 
> right, bass and center.  The center is the point one.

No, the 5 "main" channels include the centre.  The ".1" is the "low
frequency effects" channel, which carries the bass.
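
Putting the three parameters discussed so far together (sample rate,
sample size, channel count), the raw PCM data rate can be sketched as
below.  The function name and the telephone figures (8 bit mono at
8KHz) are assumptions chosen for illustration, and samples are assumed
to be tightly packed.

```python
def pcm_bytes_per_second(sample_rate_hz, bits_per_sample, channels):
    """Raw (uncompressed) PCM data rate, assuming tightly packed samples."""
    return (sample_rate_hz * bits_per_sample * channels) // 8


# CD audio: 44.1KHz, 16 bit, stereo
cd_rate = pcm_bytes_per_second(44100, 16, 2)

# Hypothetical telephone-style stream: 8KHz, 8 bit, mono
phone_rate = pcm_bytes_per_second(8000, 8, 1)

print(cd_rate)     # 176400
print(phone_rate)  # 8000
```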

> I don't know how Dolby audio is encoded, though.

Like MP3, Dolby Digital uses a modified discrete cosine transform
(similar to a Fourier transform) for frequency encoding, and includes
psychoacoustic compression (e.g. soft sounds within 20ms of a loud
sound are not audible, and can be dropped).
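
For the curious, here is a minimal sketch of the transform itself,
written directly from the textbook MDCT formula.  It is not how any
real codec implements it: real encoders window the input, overlap
consecutive blocks and use a fast algorithm.  The names `mdct`, `N` and
`K0` are made up for this example.

```python
import math


def mdct(block):
    """Direct O(N^2) MDCT: 2N time samples in, N frequency coefficients out."""
    n_out = len(block) // 2
    coeffs = []
    for k in range(n_out):
        total = 0.0
        for n, x in enumerate(block):
            angle = math.pi / n_out * (n + 0.5 + n_out / 2) * (k + 0.5)
            total += x * math.cos(angle)
        coeffs.append(total)
    return coeffs


N = 8    # number of output coefficients
K0 = 3   # which basis function to use as the test tone

# A tone that matches basis function K0 exactly, 2N samples long.
tone = [math.cos(math.pi / N * (n + 0.5 + N / 2) * (K0 + 0.5))
        for n in range(2 * N)]

spectrum = mdct(tone)
# All the energy lands in coefficient K0; the others are (numerically) zero.
print([round(c, 6) for c in spectrum])
```

A psychoacoustic encoder would then decide how coarsely each
coefficient can be quantised, or whether it can be dropped altogether.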

> In raw PCM audio, which is essentially the WAV file minus its header, 
> you have all the audio data in a time based stream.  That means, for a 
> 16 bit stereo file, you have 16 bits for the left channel, 16 bits for 
> the right channel, 16 left, 16 right, etc.  In a 16 bit mono file, you 
> have 16 bits, 16 bits, 16 bits, ...  In other words, exactly half the 
> data of a 16 bit stereo file.
>
> One of the properties of audio is that if you decrease the period of a 
> signal (ie, bring peaks and troughs closer together, or increase the 
> decode/playback frequency) you raise the pitch.
>
> So, if you play a 22KHz, 16bit mono audio file as a 22KHz, 16bit 
> stereo file, you are playing the mono data at 44KHz, since you're 
> consuming the data at twice the expected rate.  Doubling the playback 
> frequency raises the pitch substantially, and gives you a chipmunk 
> sound.

Doh!  I wouldn't have thought of that.  :-)
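
The effect is easy to see with a little arithmetic on a raw PCM
buffer.  This is only a sketch: `duration_seconds` is a hypothetical
helper, and the buffer is one second of silence standing in for real
22KHz, 16 bit mono data.

```python
def duration_seconds(pcm_bytes, sample_rate_hz, channels, bytes_per_sample=2):
    """Playback time of a raw PCM buffer under a given interpretation."""
    frame_size = channels * bytes_per_sample   # one sample per channel
    frames = len(pcm_bytes) // frame_size
    return frames / sample_rate_hz


# One second of 22050Hz, 16 bit mono audio: 22050 samples of 2 bytes each.
mono_data = bytes(22050 * 2)

as_mono = duration_seconds(mono_data, 22050, channels=1)
as_stereo = duration_seconds(mono_data, 22050, channels=2)

print(as_mono)    # 1.0
print(as_stereo)  # 0.5, the same data is used up twice as fast

# Read as stereo, consecutive mono samples are dealt out alternately
# to the left and right channels.
samples = [0, 1, 2, 3, 4, 5]
left = samples[0::2]
right = samples[1::2]
print(left, right)  # [0, 2, 4] [1, 3, 5]
```

Half the playing time for the same waveform means every frequency in
it is doubled, hence the chipmunks.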

Regards,
Paul Gearon

Software Engineer
Tucana Technologies
http://www.tucanatech.com

Catapultam habeo. Nisi pecuniam omnem mihi dabis, ad caput tuum saxum
immane mittam.
(Translation from latin: "I have a catapult. Give me all the money,
or I will fling an enormous rock at your head.")




