Audio Formats Introduction
Audio Formats
Every data line has an audio format associated with its data stream. The audio format of a source (playback) data line indicates what kind of data the data line expects to receive for output. For a target (capture) data line, the audio format specifies the kind of data that can be read from the line. Sound files also have audio formats, of course.
In addition to the encoding, the audio format includes other properties that further specify the exact arrangement of the data. These include the number of channels, sample rate, sample size, byte order, frame rate, and frame size.

Sounds may have different numbers of audio channels: one for mono, two for stereo. The sample rate measures how many "snapshots" (samples) of the sound pressure are taken per second, per channel. (If the sound is stereo rather than mono, two samples are actually measured at each instant of time: one for the left channel, and another for the right channel; however, the sample rate still measures the number per channel, so the rate is the same regardless of the number of channels. This is the standard use of the term.) The sample size indicates how many bits are used to store each snapshot; 8 and 16 are typical values. For 16-bit samples (or any other sample size larger than a byte), byte order is important; the bytes in each sample are arranged in either the "little-endian" or "big-endian" style.

For encodings like PCM, a frame consists of the set of samples for all channels at a given point in time, so the size of a frame (in bytes) is always equal to the size of a sample (in bytes) times the number of channels. However, with some other sorts of encodings a frame can contain a bundle of compressed data for a whole series of samples, as well as additional, non-sample data. For such encodings, the sample rate and sample size refer to the data after it is decoded into PCM, and so they are completely different from the frame rate and frame size.
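As a concrete illustration, here is a minimal sketch using the javax.sound.sampled.AudioFormat class, with CD-quality parameters chosen purely as an example. It constructs a linear PCM format and shows how, for PCM, the frame size and frame rate follow directly from the other properties:

    import javax.sound.sampled.AudioFormat;

    public class FormatExample {
        public static void main(String[] args) {
            // Linear PCM, CD-quality parameters: 44.1 kHz sample rate,
            // 16-bit samples, 2 channels (stereo), signed, little-endian.
            AudioFormat format = new AudioFormat(
                    44100.0f,  // sample rate in Hz
                    16,        // sample size in bits
                    2,         // number of channels
                    true,      // signed samples
                    false);    // big-endian? false = little-endian

            // For PCM, frame size = sample size in bytes * number of channels,
            // and the frame rate equals the sample rate.
            System.out.println("Frame size (bytes): " + format.getFrameSize()); // 4
            System.out.println("Frame rate (Hz): " + format.getFrameRate());    // 44100.0
        }
    }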
MP3 Formats
MPEG Layer-3 format, a very popular format for storing music.
Development of the MP3 algorithm started in 1987 as a joint effort of the Fraunhofer IIS-A and the University of Erlangen. It is standardized as ISO-MPEG Audio Layer 3. It soon became the de facto standard for lossy audio encoding, thanks to its high compression rates (about 1/12 of the original size while retaining considerable quality), the wide availability of decoders, and the low CPU requirements for playback (a 486 DX2-66 is enough for real-time decoding). It supports multichannel files (although there is no implementation yet) and sampling frequencies from 16 kHz to 24 kHz (MPEG-2 Layer 3) and from 32 kHz to 48 kHz (MPEG-1 Layer 3). Formal and informal listening tests have shown that MP3 in the 192-256 kbps range produces encoded results indistinguishable from the original material in most cases.
MP3 uses the following techniques for compression:
- Huffman coding;
- quantization;
- M/S matrixing;
- intensity stereo;
- channel coupling;
- modified discrete cosine transform (MDCT), sketched in code after this list;
- polyphase filter bank.
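To illustrate one item on this list, here is a minimal, unoptimized sketch of the MDCT in Java: a direct O(N^2) implementation of the textbook definition, whereas real encoders window overlapping blocks and use fast algorithms.

    public class MdctSketch {
        // Direct MDCT: maps a block of 2N input samples to N coefficients.
        // X[k] = sum over n of x[n] * cos( (pi/N) * (n + 1/2 + N/2) * (k + 1/2) )
        static double[] mdct(double[] block) {
            int n = block.length / 2;
            double[] out = new double[n];
            for (int k = 0; k < n; k++) {
                double sum = 0.0;
                for (int i = 0; i < 2 * n; i++) {
                    sum += block[i]
                            * Math.cos(Math.PI / n * (i + 0.5 + n / 2.0) * (k + 0.5));
                }
                out[k] = sum;
            }
            return out;
        }

        public static void main(String[] args) {
            // Transform a 36-sample example block into 18 coefficients.
            double[] block = new double[36];
            for (int i = 0; i < block.length; i++) {
                block[i] = Math.sin(2 * Math.PI * i / block.length);
            }
            double[] coeffs = mdct(block);
            System.out.println("First coefficient: " + coeffs[0]);
        }
    }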
A compression ratio of 1:10 to 1:12 corresponds to roughly 128 to 112 kbps for a stereo signal.
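To see where these figures come from, assume a CD-quality source (44.1 kHz, 16-bit, stereo); the uncompressed bitrate and the resulting ratios can then be worked out directly:

    public class BitrateCheck {
        public static void main(String[] args) {
            int sampleRate = 44100;  // samples per second, per channel
            int sampleSize = 16;     // bits per sample
            int channels = 2;        // stereo

            // Uncompressed PCM rate: about 1411 kbps for CD-quality stereo.
            int uncompressedKbps = sampleRate * sampleSize * channels / 1000;
            System.out.println("Uncompressed: " + uncompressedKbps + " kbps");

            // Dividing by the MP3 target bitrate gives the compression ratio,
            // which lands close to the 1:10-1:12 range quoted above.
            for (int mp3Kbps : new int[] {128, 112}) {
                System.out.printf("%d kbps -> about 1:%.1f%n",
                        mp3Kbps, uncompressedKbps / (double) mp3Kbps);
            }
        }
    }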
MPEG Version 2.5 was added later to the MPEG 2 standard. It is an extension used for very low bitrate files, allowing the use of lower sampling frequencies. If your decoder does not support this extension, it is recommended that you use 12 bits for synchronization instead of 11 bits.
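The synchronization detail above can be made concrete with a short sketch (hypothetical helper code, not taken from any particular decoder) that inspects the first two bytes of an MPEG audio frame header:

    public class FrameHeaderCheck {
        // Header layout, most significant bit first:
        //   11 bits frame sync (all 1s)
        //    2 bits version ID: 00 = MPEG 2.5, 01 = reserved, 10 = MPEG 2, 11 = MPEG 1
        //    2 bits layer:      01 = Layer III, 10 = Layer II, 11 = Layer I
        //    ... (bitrate index, sampling rate index, and so on)
        static String describe(int byte0, int byte1) {
            int header = ((byte0 & 0xFF) << 8) | (byte1 & 0xFF);
            // A decoder without MPEG 2.5 support checks 12 sync bits (mask 0xFFF0);
            // one with MPEG 2.5 support checks only 11 and treats the 12th bit
            // as the high bit of the version ID.
            if ((header & 0xFFE0) != 0xFFE0) {
                return "no 11-bit frame sync";
            }
            switch ((header >> 3) & 0x3) {
                case 0:  return "MPEG 2.5 (low-sampling-frequency extension)";
                case 1:  return "reserved";
                case 2:  return "MPEG 2";
                default: return "MPEG 1";
            }
        }

        public static void main(String[] args) {
            System.out.println(describe(0xFF, 0xFB)); // typical MPEG 1, Layer III header
            System.out.println(describe(0xFF, 0xE3)); // typical MPEG 2.5, Layer III header
        }
    }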
Ogg Vorbis Formats
Ogg Vorbis format. Ogg Vorbis is an audio compression format roughly comparable to other formats used to store and play digital music, such as MP3, VQF, and AAC.
Ogg Vorbis is a fully open, non-proprietary, patent-and-royalty-free, general-purpose compressed audio format for mid to high quality (8 kHz-48 kHz, 16+ bit, polyphonic) audio and music at fixed and variable bitrates from 16 to 128 kbps/channel.
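Tying this back to the audio formats discussed above, the following sketch opens an Ogg Vorbis file through the Java Sound API and prints the format reported for it. This assumes a third-party Vorbis service provider (for example, the JOrbis/Tritonus vorbisspi) is on the classpath, since the core Java Sound API does not ship a Vorbis decoder; the file name is a placeholder.

    import java.io.File;
    import javax.sound.sampled.AudioFormat;
    import javax.sound.sampled.AudioInputStream;
    import javax.sound.sampled.AudioSystem;

    public class OggInfo {
        public static void main(String[] args) throws Exception {
            // "music.ogg" is a placeholder path; decoding requires a Vorbis
            // service provider registered with the Java Sound API.
            AudioInputStream in = AudioSystem.getAudioInputStream(new File("music.ogg"));
            AudioFormat format = in.getFormat();
            System.out.println("Encoding:    " + format.getEncoding());
            System.out.println("Sample rate: " + format.getSampleRate());
            System.out.println("Channels:    " + format.getChannels());
            in.close();
        }
    }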