INTERNATIONAL ORGANISATION FOR STANDARDISATION
ORGANISATION INTERNATIONALE DE NORMALISATION
ISO/IEC JTC1/SC29/WG11
CODING OF MOVING PICTURES AND AUDIO

ISO/IEC JTC1/SC29/WG11 N2670
March 1999 / Seoul, Korea


MPEG-4 Audio Version 2
(Committee Draft 14496-3 AMD1)


This web page is intended to provide an overview of the "MPEG-4 Audio Version 2 Committee Draft 14496-3" and is based on excerpts from this document. The complete document is available by ftp.


Introduction

MPEG-4 version 2 is an amendment to MPEG-4 version 1. This document describes the bitstream and decoder extensions related to the new tools defined within MPEG-4 version 2. Unless stated otherwise, the description given in MPEG-4 version 1 is not changed but only extended.

Overview of MPEG-4 Audio Amd 1

ISO/IEC 14496-3 (MPEG-4 Audio) is a new kind of audio standard that integrates many different types of audio coding: natural sound with synthetic sound, low bitrate delivery with high-quality delivery, speech with music, complex soundtracks with simple ones, and traditional content with interactive and virtual-reality content. By standardizing individually sophisticated coding tools as well as a novel, flexible framework for audio synchronization, mixing, and downloaded post-production, the developers of the MPEG-4 Audio standard have created new technology for a new, interactive world of digital audio.

MPEG-4, unlike previous audio standards created by ISO/IEC and other groups, does not target a single application such as real-time telephony or high-quality audio compression. Rather, MPEG-4 Audio is a standard that applies to every application requiring the use of advanced sound compression, synthesis, manipulation, or playback. The subparts that follow specify the state-of-the-art coding tools in several domains; however, MPEG-4 Audio is more than just the sum of its parts. As the tools described here are integrated with the rest of the MPEG-4 standard, they enable exciting new possibilities for object-based audio coding, interactive presentation, dynamic soundtracks, and other sorts of new media.

Since a single set of tools is used to cover the needs of a broad range of applications, interoperability is a natural feature of systems that depend on the MPEG-4 Audio standard. A system that uses a particular coder - for example, a real-time voice communication system making use of the MPEG-4 speech coding toolset - can easily share data and development tools with other systems, even in different domains, that use the same tool - for example, a voicemail indexing and retrieval system making use of MPEG-4 speech coding. A multimedia terminal that can decode the Main Profile of MPEG-4 Audio has audio capabilities that cover the entire spectrum of audio functionality available today and in the future.

The remainder of this clause gives a more detailed overview of the capabilities and functioning of MPEG-4 Audio extended tools.

MPEG-4 Audio Amd 1 Capabilities

Error resilience tools for AAC

Several tools are provided to increase the error resilience of AAC. These tools improve the perceptual quality of the decoded audio signal in the presence of noisy transmission channels.

Low delay

The low delay coding functionality provides the ability to extend the usage of generic low bitrate audio coding to applications requiring a very low delay of the encoding / decoding chain (e.g. full-duplex real-time communications).

Back channel

To allow a user at the remote side to dynamically control the streaming from the server, backchannel streams carrying user interaction information are defined.

Fine grain scalability

The BSAC (Bit-Sliced Arithmetic Coding) tool provides fine grain scalability. This tool takes information from the bitstream demultiplexer, parses that information, decodes the arithmetic-coded bit-sliced data, and reconstructs the quantized spectra and the scalefactors.
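
The idea behind bit-sliced scalability can be illustrated with a small Python sketch. It is only an illustration under simplifying assumptions: the arithmetic coding stage, sign handling, and scalefactors are left out, and the function names `bit_slices` and `reconstruct` are hypothetical, not taken from the standard. It shows how quantized magnitudes split into bit planes can be rebuilt at reduced precision when only the leading planes arrive.

```python
def bit_slices(values, num_bits):
    """Split non-negative quantized magnitudes into bit planes, MSB first."""
    return [[(v >> b) & 1 for v in values] for b in range(num_bits - 1, -1, -1)]


def reconstruct(slices, num_bits):
    """Rebuild magnitudes from however many leading bit planes were received."""
    values = [0] * len(slices[0]) if slices else []
    for i, plane in enumerate(slices):
        shift = num_bits - 1 - i  # plane 0 carries the most significant bit
        values = [v | (bit << shift) for v, bit in zip(values, plane)]
    return values


# Hypothetical quantized spectral magnitudes, 4 bits each.
quantized = [5, 3, 12, 0, 9]
planes = bit_slices(quantized, 4)

full = reconstruct(planes, 4)        # all four planes: exact values
coarse = reconstruct(planes[:2], 4)  # top two planes only: coarser values
```

Dropping trailing planes lowers the bit rate in small steps while the decoder still produces a usable, coarser spectrum, which is the sense in which the scalability is fine grained.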

HILN: Harmonic and Individual Lines plus Noise (parametric audio coding)

MPEG-4 parametric audio coding uses the HILN technique (Harmonic and Individual Lines plus Noise) to code non-speech signals such as music at bit rates of 4 kbit/s and higher using a parametric representation of the audio signal. HILN allows speed and pitch to be changed independently during decoding. Furthermore, HILN can be combined with MPEG-4 parametric speech coding (HVXC) to form an integrated parametric coder covering a wider range of signals and bit rates.
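
Why a parametric representation permits independent speed and pitch changes can be sketched in Python. This is a minimal illustration, not the HILN decoder: it renders only individual sinusoidal lines (no harmonic grouping, no noise component), and the function name `synthesize`, the 8 kHz sample rate, and the 32 ms frame length are assumptions chosen for the example.

```python
import math


def synthesize(frames, sample_rate=8000, frame_len=0.032, speed=1.0, pitch=1.0):
    """Render frames of (frequency_hz, amplitude) lines to a list of samples.

    speed > 1 shortens each frame (faster playback); pitch scales every
    frequency. Neither factor affects the other.
    """
    samples_per_frame = int(round(sample_rate * frame_len / speed))
    phases = {}  # running phase per line index, kept continuous across frames
    out = []
    for frame in frames:
        for _ in range(samples_per_frame):
            sample = 0.0
            for idx, (freq, amp) in enumerate(frame):
                phase = phases.get(idx, 0.0)
                sample += amp * math.sin(phase)
                phases[idx] = phase + 2.0 * math.pi * freq * pitch / sample_rate
            out.append(sample)
    return out


# Four identical frames, each with two hypothetical sinusoidal lines.
frames = [[(440.0, 0.5), (880.0, 0.25)]] * 4

normal = synthesize(frames)
faster = synthesize(frames, speed=2.0)  # half the duration, same pitch
higher = synthesize(frames, pitch=2.0)  # same duration, frequencies doubled
```

Because the decoder generates the waveform from frequencies and amplitudes rather than from stored samples, duration is set by how many samples are rendered per frame and pitch by a scale factor on the frequencies.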

Error protection

The EP tool provides unequal error protection. This tool receives several classes of bits from the audio coding tools, and then applies forward error correction (FEC) codes and/or a cyclic redundancy check (CRC) to each class, according to its error sensitivity.
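
The principle of protecting each class according to its sensitivity can be sketched in Python. This is a toy example under stated assumptions: a byte-wise repetition code with majority voting stands in for the FEC codes, CRC-32 from Python's `zlib` module stands in for the CRC, and the class names and the functions `protect` and `recover` are hypothetical. None of these are the codes or parameters defined by the standard.

```python
import zlib

# Hypothetical protection settings per error-sensitivity class.
CLASSES = {
    "high_sensitivity": {"repeat": 3, "use_crc": True},
    "low_sensitivity": {"repeat": 1, "use_crc": False},
}


def protect(payload: bytes, repeat: int, use_crc: bool) -> bytes:
    """Append an optional CRC-32, then repeat each byte `repeat` times."""
    data = payload
    if use_crc:
        data = payload + zlib.crc32(payload).to_bytes(4, "big")
    return bytes(b for b in data for _ in range(repeat))


def recover(coded: bytes, repeat: int, use_crc: bool):
    """Majority-vote each byte group, then verify the CRC if present."""
    groups = [coded[i:i + repeat] for i in range(0, len(coded), repeat)]
    data = bytes(max(set(g), key=g.count) for g in groups)
    if not use_crc:
        return data, True  # nothing to check for unprotected classes
    payload, crc = data[:-4], data[-4:]
    return payload, zlib.crc32(payload).to_bytes(4, "big") == crc


# Protect a sensitive class, corrupt one transmitted byte, and recover it.
cfg = CLASSES["high_sensitivity"]
coded = bytearray(protect(b"\x12\x34", **cfg))
coded[0] ^= 0xFF
recovered, crc_ok = recover(bytes(coded), **cfg)

# A low-sensitivity class is sent without added redundancy.
plain = protect(b"\xab", **CLASSES["low_sensitivity"])
```

Spending redundancy only on the sensitive classes keeps the overhead lower than protecting every bit at the strongest level.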

Error resilient bitstream reordering

To be provided




Contents of N2670

Location      Notes
w2670.pdf     Cover page of ISO/IEC 14496-3 PDAM document.
w2670_n.pdf   Normative part of ISO/IEC 14496-3 PDAM document.
w2670_i.pdf   Informative part of ISO/IEC 14496-3 PDAM document.


Heiko Purnhagen 23-Jul-1999