MPEG-H 3D Audio, specified as ISO/IEC 23008-3 (MPEG-H Part 3), is an audio coding standard developed by the ISO/IEC Moving Picture Experts Group (MPEG) to support coding audio as audio channels, audio objects, or higher order ambisonics (HOA). MPEG-H 3D Audio can support up to 64 loudspeaker channels and 128 codec core channels.
Objects may be used alone or in combination with channels or HOA components. The use of audio objects allows for interactivity or personalization of a program by adjusting the gain or position of the objects during rendering in the MPEG-H decoder. Audio is encoded using an improved modified discrete cosine transform (MDCT) algorithm.[1]
Channels, objects, and HOA components may be used to transmit immersive sound as well as mono, stereo, or surround sound. The MPEG-H 3D Audio decoder renders the bitstream to a number of standard speaker configurations as well as to setups in which speakers are misplaced relative to their standard positions. Binaural rendering of sound for headphone listening is also supported.
These are the ISO standards relating to MPEG-H 3D Audio:
ISO/IEC 23008-3:2022 - Part 3: 3D audio
ISO/IEC 23008-6:2021/Amd 1:2024 - Part 6: 3D audio reference software
ISO/IEC 23008-9:2023 - Part 9: 3D Audio conformance testing
In January 2013, the requirements for MPEG-H 3D Audio were released. They called for increased audio immersion and for support of a greater number of loudspeakers for audio localization.[2] The allowed audio types would be audio channels, audio objects, and HOA.[2]
On September 10, 2014, Fraunhofer IIS demonstrated a real-time MPEG-H 3D Audio encoder.[3]
In February 2015, MPEG announced that MPEG-H 3D Audio would be published as an International Standard.[4]
On March 10, 2015, the Advanced Television Systems Committee announced that MPEG-H 3D Audio was one of the three standards proposed for the audio system of ATSC 3.0.[5]
On April 10, 2015, Fraunhofer, Technicolor, and Qualcomm demonstrated a live broadcast signal chain consisting of all the elements needed to implement MPEG-H-based audio in broadcast television. The demonstration featured a simulated remote truck at a sports event, a network control center, a local affiliate station, and a consumer living room. The audio was produced and encoded through an MPEG-H audio monitoring and authoring unit, MPEG-H real-time broadcast encoders, and real-time professional and consumer MPEG-H decoders. The audio was decoded in the consumer living room on a Technicolor set-top box.[6][7]
In April 2015, the Advanced Television Systems Committee announced that systems from Dolby Laboratories and the MPEG-H Audio Alliance (Fraunhofer, Technicolor, and Qualcomm) would be tested in the coming months for use as the audio layer for the ATSC 3.0 signal.[8]
In August 2015, the Advanced Television Systems Committee announced that systems from Dolby Laboratories and the MPEG-H Audio Alliance had been demonstrated to the ATSC, showing how they would work in both professional broadcast facilities and consumer home environments.[9][10]
On April 18, 2016, South Korean broadcast equipment manufacturers Kai Media and DS Broadcast announced the availability of MPEG-H 3D Audio in their latest 4K broadcast encoders.[11]
On May 2, 2016, the Advanced Television Systems Committee elevated the A/342 audio standard for ATSC 3.0 to the status of a Candidate Standard. The MPEG-H Audio Alliance TV audio system and Dolby AC-4 are part of the A/342 standard.[12]
On June 24, 2016, the South Korean standardization organization Telecommunications Technology Association (TTA) published the standard "Transmission and Reception of Terrestrial UHD TV Broadcasting Service" for the South Korean terrestrial UHD TV broadcasting service to be launched in February 2017. The TTA standard is based on ATSC 3.0 and specifies MPEG-H 3D Audio as the sole audio codec for the 4K TV system.[13][14][15]
On January 3, 2017, Fraunhofer IIS announced a trademark program to identify interoperable products that include MPEG-H.[16]
On January 8, 2019, Sony announced an immersive music service "360 Reality Audio" that uses MPEG-H.[17][18][19]
The Main profile of MPEG-H 3D Audio has five levels.[20]
Level | Maximum number of core channels | Maximum number of loudspeaker channels |
---|---|---|
1 | 8 | 8 |
2 | 16 | 16 |
3 | 32 | 24 |
4 | 64 | 24 |
5 | 128 | 64 |
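The level limits in the table above can be treated as a simple lookup. The following is a minimal sketch in Python, not part of the standard or of any MPEG-H software: the names `MAIN_PROFILE_LEVELS` and `minimum_level` are hypothetical, and the only data used are the five rows of the table. It returns the lowest Main profile level whose limits cover a requested number of core channels and loudspeaker channels.

```python
from typing import Optional

# (level, max core channels, max loudspeaker channels), copied from the table above.
# The list name and layout are illustrative assumptions, not an official API.
MAIN_PROFILE_LEVELS = [
    (1, 8, 8),
    (2, 16, 16),
    (3, 32, 24),
    (4, 64, 24),
    (5, 128, 64),
]


def minimum_level(core_channels: int, loudspeaker_channels: int) -> Optional[int]:
    """Return the lowest level that fits both counts, or None if no level does."""
    if core_channels < 1 or loudspeaker_channels < 1:
        raise ValueError("channel counts must be positive")
    # Levels are listed in ascending order, so the first match is the lowest one.
    for level, max_core, max_speakers in MAIN_PROFILE_LEVELS:
        if core_channels <= max_core and loudspeaker_channels <= max_speakers:
            return level
    return None


if __name__ == "__main__":
    print(minimum_level(16, 22))   # 22 loudspeakers exceed Level 2, so Level 3 is needed
    print(minimum_level(129, 2))   # exceeds the 128 core channel maximum
```

Because both limits must be satisfied at once, a configuration with few core channels can still require a high level when it targets many loudspeakers.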
MPEG announced that Amendment 3 to MPEG-H 3D Audio would be available in late 2016. This amendment defines the Low Complexity Profile, which includes technology that increases coding efficiency and adds features designed for use in the broadcast industry.[21]