Immersive Audio Coding for Virtual Reality Using a Metadata-assisted Extension of the 3GPP EVS Codec
PubDate: April 2019
Teams: Dolby Australia Pty
Writers: D. McGrath; S. Bruhn; H. Purnhagen; M. Eckert; J. Torres; S. Brown; D. Darcy
Virtual Reality (VR) audio scenes may be composed of a very large number of audio elements, including dynamic audio objects, fixed audio channels and scene-based audio elements such as Higher Order Ambisonics (HOA). The subjective listening experience could potentially be replicated using a compact spatial format with a fixed number of dynamic objects and scene-based elements, retaining only the perceptual essence of the audio scene. Such a compact format would also reduce the complexity of subsequent compression and rendering. This paper investigates these hypotheses by exploring a compact format that consists of up to four dynamic objects and nine HOA channels, with the Enhanced Voice Services (EVS) codec applied to a 4-channel down-mix of the compact format.
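To make the channel counts in the abstract concrete, the following is a minimal sketch of the arithmetic only, not the authors' implementation. It relies on the standard relation that a full-sphere ambisonic signal set of order N has (N + 1)^2 channels, so nine HOA channels correspond to second order. The function names, the constants, and the assumption that each dynamic object carries exactly one waveform are illustrative and do not come from the paper.

```python
import math

# Figures taken from the abstract.
MAX_OBJECTS = 4        # "up to four dynamic objects"
HOA_CHANNELS = 9       # "nine HOA channels"
DOWNMIX_CHANNELS = 4   # "4-channel down-mix" passed to the EVS codec


def hoa_channel_count(order: int) -> int:
    """Channels in a full-sphere ambisonic signal set of the given order."""
    if order < 0:
        raise ValueError("order must be non-negative")
    return (order + 1) ** 2


def hoa_order_for_channels(channels: int) -> int:
    """Ambisonic order whose full-sphere channel count equals `channels`."""
    if channels < 1:
        raise ValueError("channel count must be positive")
    order = math.isqrt(channels) - 1
    if hoa_channel_count(order) != channels:
        raise ValueError(f"{channels} is not a full-sphere HOA channel count")
    return order


def compact_signal_count(num_objects: int) -> int:
    """Waveforms in the compact format, ASSUMING one waveform per object."""
    if not 0 <= num_objects <= MAX_OBJECTS:
        raise ValueError(f"object count must be between 0 and {MAX_OBJECTS}")
    return num_objects + HOA_CHANNELS


if __name__ == "__main__":
    print("HOA order:", hoa_order_for_channels(HOA_CHANNELS))
    print("Compact-format waveforms:", compact_signal_count(MAX_OBJECTS))
    print("Coded down-mix channels:", DOWNMIX_CHANNELS)
```

Under these assumptions the compact format holds at most 13 waveforms, of which only a 4-channel down-mix is coded by EVS, which suggests why accompanying metadata is needed to recover the spatial scene at the decoder.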