MPEG Audio FAQ

MPEG-2: coded transmission/storage of sampled sound waves

Overview of MPEG-2

What does MPEG-2 audio standardise (in comparison to MPEG-1)?

The first phase, MPEG-1, dealt with mono and two-channel stereo sound coding, at sampling frequencies commonly used for high quality audio (48, 44.1 and 32 kHz).
The second phase, MPEG-2, contained three different work items:

the extension of MPEG-1 to lower sampling frequencies (16 kHz, 22.05 kHz and 24 kHz), providing better sound quality at very low bit rates (below 64 kbit/s for a mono channel). This extension is easily added to an MPEG-1 audio decoder because it mainly requires the inclusion of some additional tables.
the backward-compatible extension of MPEG-1 to multichannel sound. MPEG-2 BC supports up to 5 full bandwidth channels plus one low frequency enhancement channel (such an ensemble of channels is referred to as '5.1'). This multichannel extension is both forward and backward compatible with MPEG-1. An MPEG-2 BC stream adheres to the structure of an MPEG-1 bitstream such that an MPEG-2 BC stream can be read and interpreted by an MPEG-1 audio decoder.
a new coding scheme called "Advanced Audio Coding" (AAC). An AAC bitstream is not backward compatible, i.e. cannot be read and interpreted by an MPEG-1 audio decoder.

Both MPEG-1 and the first two work items of MPEG-2 have the three-layer structure. The original MPEG-2 audio standard contained only the first two work items and was finalized in 1994. In order to improve coding efficiency for the 5-channel case, a non-backward-compatible audio coding scheme (AAC) was defined and finalized in 1997.

What are the applications of MPEG-2 audio (in comparison to MPEG-1)?

The main application area of MPEG-2 is digital television. It produces the video quality needed in HDTV. MPEG-2 audio supports the same applications as MPEG-1 (see the question on typical applications of MPEG-1), extending the MPEG-1 audio capabilities to applications that require very low bitrates and to applications that require more than two channels (e.g. for professional sound or for multi-lingual channels).

MPEG-2 language and abbreviations

What do the abbreviations used within MPEG-2 audio stand for?

AAC = Advanced Audio Coding
BC = Backward Compatible
LFE = Low Frequency Enhancement
NBC = Non Backward Compatible
LC = Low Complexity profile
SSR = Scalable Sampling Rate profile
ADIF = Audio Data Interchange Format
ADTS = Audio Data Transport Stream

What is a profile in MPEG-2 AAC?

MPEG-2 AAC comes in 3 different "flavors", called profiles, which are coder configurations with different complexity and performance. They are derived from each other by different parametrization of certain algorithmic parts of the AAC coder.

What is MPEG-2.5?

MPEG-2.5 is the name of an extension to MPEG-1/2 Layer III which is proprietary to the Fraunhofer Institute for Integrated Circuits (FhG IIS), Germany. It enables coding at even lower sampling frequencies (8 kHz, 11.025 kHz, and 12 kHz).
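The three families of sampling frequencies can be summarised as a lookup table. The sketch below assumes the 2-bit version field and 2-bit sampling-frequency index as they are commonly documented for Layer III frame headers; the names SAMPLE_RATES and sample_rate are illustrative and do not come from any standard.

```python
# Sampling frequencies in Hz, keyed by the 2-bit version field of the frame
# header and indexed by the 2-bit sampling-frequency field.
# Assumption: version code 0b00 marks the MPEG-2.5 extension, 0b01 is reserved.
SAMPLE_RATES = {
    0b11: ("MPEG-1", (44100, 48000, 32000)),
    0b10: ("MPEG-2", (22050, 24000, 16000)),
    0b00: ("MPEG-2.5", (11025, 12000, 8000)),
}


def sample_rate(version_bits, index):
    """Return (version name, sampling frequency in Hz) for the header fields."""
    if version_bits not in SAMPLE_RATES:
        raise ValueError("reserved version code")
    name, rates = SAMPLE_RATES[version_bits]
    if not 0 <= index < len(rates):
        raise ValueError("reserved sampling-frequency index")
    return name, rates[index]
```

Each row holds exactly half the frequencies of the row above it, which is why a decoder mainly needs additional tables rather than new algorithms to handle the lower rates.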

Technicalities of MPEG-2

See also:
Technicalities of MPEG-1

What does the compatibility in MPEG-2 BC multichannel mean?

The core of the MPEG-2 bitstream is an MPEG-1 bitstream. This enables fully compatible decoding with an MPEG-1 decoder. In addition, the need to transfer two separate bitstreams, called simulcast (one for two-channel stereo and another one for the multichannel audio programme) is avoided, at some cost in coding efficiency for the multichannel audio signal, compared to AAC which is a Non Backward Compatible (NBC) coding algorithm.

How will an MPEG-1 decoder get information from all channels when receiving an MPEG-2 BC audio bitstream?

An MPEG-1 decoder will be supplied with an appropriate two-channel downmix of all the channels in the multichannel ensemble, contained in the MPEG-1 core of the MPEG-2 bitstream. The left and right channel of the downmix together contain components of all the channels, according to the equations in the compatibility matrix.
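As a rough illustration of such a downmix, the sketch below mixes one sample of a five-channel signal (left, centre, right, left surround, right surround) into a compatible stereo pair. The weights of 1/sqrt(2) and the normalisation factor are assumptions chosen for the example; the normative matrices and coefficients are those defined in ISO/IEC 13818-3. The function name downmix is hypothetical.

```python
import math


def downmix(l, c, r, ls, rs,
            centre_gain=1 / math.sqrt(2),
            surround_gain=1 / math.sqrt(2)):
    """Return (Lo, Ro), the two-channel downmix of one multichannel sample."""
    # Scale so that full-scale inputs on every channel cannot overload Lo or Ro.
    norm = 1.0 / (1.0 + centre_gain + surround_gain)
    lo = norm * (l + centre_gain * c + surround_gain * ls)
    ro = norm * (r + centre_gain * c + surround_gain * rs)
    return lo, ro
```

The centre channel contributes equally to both outputs, while each surround channel is folded into its own side only, so an MPEG-1 decoder that plays Lo and Ro still reproduces components of all five channels.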

Do I have to use MPEG-1 audio with MPEG-1 Video and MPEG-2 Audio BC with MPEG-2 Video?

No, due to the compatibility, MPEG-2 BC audio can be used with MPEG-1 Video as well. The other way around, MPEG-1 Audio can be used with MPEG-2 Video without any restrictions. Any combination of MPEG-1 and MPEG-2 BC Audio and Video can be handled by the system as specified by the MPEG-Systems standard ISO/IEC 11172-1 for MPEG-1 and ISO/IEC 13818-1 for MPEG-2.

I heard that a Second Edition of MPEG-2 BC has been approved. What are the reasons behind this revision?

While implementing the MPEG-2 Audio standard, as published in 1995, it was discovered that a certain combination of functionalities could not function properly. Although this combination was not considered to be of great practical importance, it was felt necessary to correct the standard in this respect. Since this necessitated a revision of the document, the opportunity was then taken to improve the standard in some other fields as well.

The technical changes in the Second Edition compared to the first publication of ISO/IEC 13818-3 (1995) are:

In the first publication, certain combinations of dynamic crosstalk and prediction were not prohibited but could not be implemented in practice. In the Second Edition, these combinations are explicitly prohibited.
In the first publication, a low-pass filter was to be applied to the monophonic surround signal in matrix mode 2 (analogue surround mode). This filter is omitted in the Second Edition, greatly simplifying the decoder and improving coding efficiency.
The description of the syntax of the LFE channel was ambiguous. This description has been clarified.

In addition to these technical changes, many editorial changes have been made, improving readability and clarity. Also, an amendment concerning copyright registration has been incorporated in the standard.

What are the impacts of the technical changes in the revision to the Technical Report and Conformance documents?

There are no impacts on the Conformance document. There is only a minor impact on the Technical Report: one possible embodiment of a lowpass filter was implemented in the Technical Report. This filter has to be removed and the dematrix operations adapted. An amendment to the Technical Report was prepared.

What is MPEG-2 AAC?

The MPEG-2 AAC standard is a new, state of the art audio standard that provides very high audio quality at a rate of 64 kb/s/channel for multichannel operation. It provides a capability of up to 48 main audio channels, 16 low frequency effects channels, 16 overdub/multilingual channels, and 16 data streams. Up to 16 programs can be described, each consisting of any number of the audio and data elements. AAC adheres to the same basic coding paradigm as MPEG-1/2 Layer-3, but adds new coding tools and improves on details. Some of the improvements implemented by AAC are a filter bank with a higher frequency resolution, better entropy coding and better stereo coding. Two new coding tools are an optional backward prediction (used only in the Main Profile) and noise shaping in the time domain which mainly improves quality of encoded speech at low bit-rates. As a result, AAC is approximately 30% more bit rate efficient than MPEG-1 Layer 3.

What profiles are standardized for MPEG-2 AAC?

There are three profiles for the AAC standard, called Main Profile, Low Complexity Profile, and Scalable Sampling Rate Profile. The Main profile is intended for use when processing power, and especially memory, are not at a premium. The Low Complexity profile is intended for use when cycles and memory use are constrained, and the SSR profile when a scalable decoder is required. The Main and LC profiles have been tested at 320 kb/s for 5-channel audio programmes, and both have demonstrated better quality than competing audio coding algorithms running at 640 kb/s for the 5-channel program.
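The 320 kb/s test rate corresponds to five channels at the 64 kb/s per channel quoted for AAC multichannel operation. The short calculation below makes the arithmetic explicit and converts both test rates into storage per minute; the function names are illustrative, and the LFE channel is ignored.

```python
def total_bitrate_kbps(channels, per_channel_kbps=64):
    """Total bit rate in kb/s for a number of full-bandwidth channels."""
    return channels * per_channel_kbps


def megabytes_per_minute(kbps):
    """Storage needed for one minute of audio, in units of 10**6 bytes."""
    bits_per_minute = kbps * 1000 * 60
    return bits_per_minute / 8 / 1e6


aac_rate = total_bitrate_kbps(5)      # 5 channels at 64 kb/s each
competitor_rate = 2 * aac_rate        # the 640 kb/s comparison point
print(aac_rate, megabytes_per_minute(aac_rate))
print(competitor_rate, megabytes_per_minute(competitor_rate))
```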

I heard that a Corrigendum of MPEG-2 AAC has been approved. What are the reasons behind this revision and what are the technical changes?

After the basic MPEG-2 AAC standard was issued in 1997, a so-called corrigendum followed which addresses some minor glitches in the standard's text without jeopardizing compatibility. The most important content of the corrigendum is the introduction of a transmission format for Dynamic Range Control (DRC), which allows the listener to adapt the dynamic range of the sound playback to the capabilities and restrictions of the reproduction system. Again, the transmission of the DRC data was defined in a way that is compatible with the original bitstream format.

Is the AAC transport format similar to MPEG-1?

Each layer in MPEG-1 standardizes its format for the bitstream containing the encoded sound data. All three have the same basic layout, consisting of a sequence of audio frames with a header and sound data. The frame rate is constant.
MPEG-2 AAC, by contrast, leaves the choice of audio transport syntax to the application, standardizing only the format of the encoded audio data. In addition, two typical examples of transport syntax have also been standardized:

ADIF = Audio Data Interchange Format
The audio bitstream contains one single header with all information necessary to control the decoder such as the bitrate, the sampling frequency or the stereo mode. The main application of ADIF is exchange of audio files.
ADTS = Audio Data Transport Stream
The audio bitstream consists of a sequence of frames with headers similar to MPEG-1 audio frame headers. The encoded audio data of one frame is always contained between two sync words. The number of bits in a frame however can be variable.
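To make the ADTS frame layout concrete, here is a minimal sketch of a parser for the fixed 7-byte part of an ADTS header. The bit positions and the sampling-frequency table follow the layout as it is commonly documented for ADTS; they should be checked against ISO/IEC 13818-7 before being relied on. The function name and the dictionary keys are illustrative.

```python
# Sampling frequencies in Hz selected by the 4-bit sampling_frequency_index.
AAC_SAMPLE_RATES = (96000, 88200, 64000, 48000, 44100, 32000,
                    24000, 22050, 16000, 12000, 11025, 8000)
AAC_PROFILES = ("Main", "LC", "SSR", "reserved")


def parse_adts_header(data):
    """Parse the first 7 bytes of an ADTS frame into a dict of header fields."""
    if len(data) < 7:
        raise ValueError("an ADTS header needs at least 7 bytes")
    bits = int.from_bytes(data[:7], "big")  # 56 header bits, MSB first

    def field(offset, width):
        # Extract `width` bits starting `offset` bits from the first bit.
        return (bits >> (56 - offset - width)) & ((1 << width) - 1)

    if field(0, 12) != 0xFFF:
        raise ValueError("sync word not found")
    sf_index = field(18, 4)
    return {
        "id": field(12, 1),                  # 1 signals MPEG-2 AAC
        "layer": field(13, 2),               # always 0 for AAC
        "protection_absent": field(15, 1),   # 1 means no CRC follows
        "profile": AAC_PROFILES[field(16, 2)],
        "sampling_frequency": (AAC_SAMPLE_RATES[sf_index]
                               if sf_index < len(AAC_SAMPLE_RATES) else None),
        "channel_configuration": field(23, 3),
        "frame_length": field(30, 13),       # bytes, header included
        "raw_data_blocks": field(54, 2) + 1,
    }
```

Because the frame length is carried in every header, a reader can step from one sync word to the next even though the number of bits per frame varies.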

Is stream splicing or "break-in" supported in MPEG-2 AAC?

When using ADIF as transport syntax for the MPEG-2 AAC stream, "break-in" - i.e. start of decoding at any bit position - is not possible because there is no sync word to synchronize with the beginning of the audio frames. When using ADTS as transport syntax, break-in is enabled. The complexity of break-in support depends on the profile.
In the MPEG-2 AAC Low Complexity and MPEG-2 AAC SSR profiles, the prediction tools are not used, so break-in support is the same as that for MPEG-1 audio. For MPEG-2 AAC Main Profile, when prediction is enabled, break-ins are trickier, as a clean break-in can only occur when there is a predictor reset across all frequency bands. This only happens in the case of "attacks", when the bitstream switches from long to short windows, so the easiest way to break into a Main Profile bitstream is to start with a short block. For long windows the predictors are reset in a frequency-cyclic way, which may require up to 240 frames before all predictors are reset. So if you break in with long windows, some distortions might appear until all predictors have been reset. The encoder can be set up to reset the predictors more frequently, which reduces the number of frames needed before all predictors are reset.
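A break-in on an ADTS stream can be sketched as a search for a sync word that is confirmed by a second sync word one frame length later. The code below is a simplified illustration, assuming the commonly documented ADTS header layout (12-bit sync word 0xFFF, layer bits 00, 13-bit frame length) and 1024 samples per AAC frame; the function names are hypothetical.

```python
def _looks_like_sync(stream, pos):
    # 0xFFF sync word followed by layer bits equal to 00.
    return stream[pos] == 0xFF and (stream[pos + 1] & 0xF6) == 0xF0


def find_break_in(stream, start=0):
    """Return the byte offset of the first confirmed ADTS frame, or -1."""
    pos = start
    while pos + 7 <= len(stream):
        if _looks_like_sync(stream, pos):
            length = (((stream[pos + 3] & 0x03) << 11)
                      | (stream[pos + 4] << 3)
                      | (stream[pos + 5] >> 5))
            nxt = pos + length
            # Accept only if the next frame also starts with a sync word,
            # which filters out sync patterns that occur by chance in the data.
            if (length >= 7 and nxt + 2 <= len(stream)
                    and _looks_like_sync(stream, nxt)):
                return pos
        pos += 1
    return -1


def worst_case_reset_seconds(sample_rate, frames=240, samples_per_frame=1024):
    """Time until all predictors are reset when breaking in on long windows."""
    return frames * samples_per_frame / sample_rate
```

Under these assumptions, a worst case of 240 frames at 48 kHz corresponds to 240 * 1024 / 48000, or roughly 5 seconds, which shows why a more frequent predictor reset matters for Main Profile streams that must support break-in.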

Implementing MPEG-2 audio software

Is there any reference software for MPEG-2 AAC?

Yes. There is reference software for both an AAC example encoder and reference decoder.
The decoder source is complete and fully compliant and is capable of decoding all three AAC profiles: Main, Low Complexity and Scalable Sampling Rate. It is a general multi-channel decoder capable of decoding up to 48 audio channels, 15 auxiliary low frequency enhancement channels and 15 data streams. Furthermore, it is quite efficient in that the compiled reference source code decodes a stereo bitstream in real-time on a 100 MHz Pentium.
The encoder software supports multi-channel encoding, implementing essentially all of the AAC coding tools. As usual, the encoder code is designed to demonstrate how to generate compliant AAC bitstreams rather than achieving optimum audio quality.

Where can I find the MPEG-2 AAC encoder reference software sources?

There is no public MPEG-2 AAC encoder reference source from MPEG. When purchasing the standard, you will get access to the MPEG reference software source for encoder and decoder. However, the encoder will not be optimized for quality or speed. To get state-of-the-art encoder source, you need to contact one of the companies which works on AAC.

Market position

How many MPEG-2 AAC Audio encoders and decoders are already in the market-place?

Since many of the services based on MPEG-2 AAC are currently being introduced into the market, numbers are changing constantly. The main areas of application are:

Internet Audio
Audio for digital television and radio (both AM and FM radio successors)
Portable playback devices

A current estimate is several million decoders (both software and hardware-based) and a smaller number of encoders.

What are the chances of AAC replacing MP3?

MP3 is the current choice for near-CD quality digital audio. However, AAC is its designated successor as it is able to provide the same sound quality at a higher compression ratio. In addition, it enables higher quality encoding and playback for high definition audio (at 96 kHz sampling rate). So AAC is the most promising candidate, e.g. for new portable playback devices using solid state memory.

Relation to other standards / methods

Why should I use MPEG-2 AAC rather than Dolby AC-3?

AAC is a state-of-the-art audio compression algorithm that provides compression superior to that provided by older algorithms such as AC-3. AAC and AC-3 are both transform coders, but AAC uses a filterbank with a finer frequency resolution that enables superior signal compression. AAC also uses a number of new tools such as temporal noise shaping, backward adaptive linear prediction, joint stereo coding techniques and Huffman coding of quantized components, each of which provides additional audio compression capability. Furthermore, AAC is much more flexible than AC-3, in that AAC supports a wide range of sampling rates and bitrates, from one to 48 audio channels, up to 15 low frequency enhancement channels, multilanguage capability and up to 15 embedded data streams.

When should I use AAC rather than MPEG-2 BC?

Both provide 5-channel audio coding capability. However, AAC provides a factor of two better audio compression relative to MPEG-2 BC, and is appropriate in all situations in which backward compatibility is not required or can be accomplished with simulcast. An MPEG-1 two-channel decoder can decode an MPEG-2 BC 5-channel bitstream. AAC has no such backward compatibility requirement and, for 5-channel audio signals, has been shown in MPEG formal listening tests to provide slightly better audio quality at 320 kb/s than MPEG-2 BC can provide at 640 kb/s.

I have heard about Lucent Technologies' Perceptual Audio Coder (PAC). How does it compare to MPEG-2 AAC?

AAC and PAC are similar audio coding technologies. However, AAC has a number of new coding tools, such as Temporal Noise Shaping (TNS), that permit AAC to offer performance superior to that of PAC. This was shown in an independent and impartial test conducted by the Communications Research Centre (G. Soulodre, T. Grusec, M. Lavoie and L. Thibault, "Subjective Evaluation of State-of-the-Art Two-Channel Audio Codecs," Journal of the Audio Engineering Society, March 1998, pp. 164-177). This test showed that when coding stereo signals, the quality of AAC at 96 kb/s was comparable to the quality of PAC at 128 kb/s and that AAC at 128 kb/s was significantly better than PAC at 160 kb/s.
There is another test, conducted by Moulton Laboratories, that claims to compare PAC and AAC. However the system claimed to be AAC was not the same coding system tested by the Communications Research Centre, and did not use a state-of-the-art AAC encoder. Therefore, the results of this test do not indicate the actual performance of a commercial AAC system.

Bibliographic references

Can you suggest literature with more detailed information?

ISO/IEC 13818-3. "Information Technology: Generic coding of Moving pictures and associated audio - Audio Part". International Standard, 1994.
ISO/IEC 13818-7. "MPEG-2 advanced audio coding, AAC". International Standard, 1997.
M. Bosi, K. Brandenburg, S. Quackenbush, L. Fielder, K. Akagiri, H. Fuchs, M. Dietz, J. Herre, G. Davidson, and Y. Oikawa: "ISO/IEC MPEG-2 Advanced Audio Coding". Proc. of the 101st AES-Convention, 1996.
M. Bosi, K. Brandenburg, S. Quackenbush, L. Fielder, K. Akagiri, H. Fuchs, M. Dietz, J. Herre, G. Davidson, Y. Oikawa: "ISO/IEC MPEG-2 Advanced Audio Coding", Journal of the AES, Vol. 45, No. 10, October 1997, pp. 789-814
Karlheinz Brandenburg: "MP3 and AAC explained". Proc. of the AES 17th International Conference on High Quality Audio Coding, Florence, Italy, 1999.
G.A. Soulodre et al.: "Subjective Evaluation of State-of-the-Art Two-Channel Audio Codecs", J. Audio Eng. Soc., Vol. 46, No. 3, pp. 164-177, March 1998.
etc.

Organisational details

What is the status of the standardisation process?

MPEG-2 was finalised in 1994 and resulted in the International Standard ISO/IEC 13818-3 which was published in 1995. MPEG-2 AAC was finalised in 1997 and published as the International Standard ISO/IEC 13818-7.

Where can I find information on MPEG-2 licensing?

Information on licensing of MPEG-2 Layer I and Layer II can be found at:

http://www.audiompeg.com/ (SISVEL, Italy)
http://www.licensing.philips.com/information/mpeg/

Information on licensing of MPEG-2 Layer III can be found at:

http://www.mp3licensing.com/
http://www.iis.fhg.de/amm/legal/index.html

Information on licensing of MPEG-2 AAC can be found at:

http://www.aac-audio.com/ (e-mail: AACLA@dolby.com)

The information above is provided for convenience of the reader. MPEG, however, is not in a position to guarantee the validity of any claim made by a party with respect to IPR ownership.

Heiko Purnhagen 07-Nov-2001