Visually Informed Binaural Audio Generation without Binaural Audios

Editor: 映维 | Category: CV / XR | June 24, 2021

Note: We do not have the ability to review papers.

PubDate: Apr 2021

Teams: CUHK - SenseTime Joint Lab, The Chinese University of Hong Kong; S-Lab, Nanyang Technological University

Authors: Xudong Xu, Hang Zhou, Ziwei Liu, Bo Dai, Xiaogang Wang, Dahua Lin

Abstract

Stereophonic audio, especially binaural audio, plays an essential role in immersive viewing environments. Recent research has explored generating visually guided stereophonic audios supervised by multi-channel audio collections. However, due to the requirement of professional recording devices, existing datasets are limited in scale and variety, which impedes the generalization of supervised methods in real-world scenarios. In this work, we propose PseudoBinaural, an effective pipeline that is free of binaural recordings. The key insight is to carefully build pseudo visual-stereo pairs with mono data for training. Specifically, we leverage spherical harmonic decomposition and head-related impulse response (HRIR) to identify the relationship between spatial locations and received binaural audios. Then in the visual modality, corresponding visual cues of the mono data are manually placed at sound source positions to form the pairs. Compared to fully-supervised paradigms, our binaural-recording-free pipeline shows great stability in cross-dataset evaluation and achieves comparable performance under subjective preference. Moreover, combined with binaural recordings, our method is able to further boost the performance of binaural audio generation under supervised settings.
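
The abstract's core idea, rendering a mono source at a chosen spatial position to obtain a pseudo binaural signal, can be illustrated with a minimal sketch. This is not the authors' pipeline: it omits the spherical harmonic decomposition entirely, and `toy_hrir` is a hypothetical stand-in for measured HRIRs that models only an interaural time delay (a Woodworth-style approximation) and a crude level difference. The head radius, sample rate, and gain values are assumptions chosen for illustration.

```python
import numpy as np

def toy_hrir(azimuth_deg, sample_rate=16000, length=64):
    """Hypothetical stand-in for a measured HRIR pair.

    Each ear gets a single impulse; the far ear's impulse is delayed
    and attenuated. Returns (left_ir, right_ir).
    """
    if not -90 <= azimuth_deg <= 90:
        raise ValueError("this toy model only covers azimuths in [-90, 90]")
    az = np.deg2rad(azimuth_deg)
    head_radius = 0.0875      # metres (assumed)
    speed_of_sound = 343.0    # metres per second
    # Woodworth-style interaural time difference, converted to samples
    itd = head_radius / speed_of_sound * (az + np.sin(az))
    delay = int(round(abs(itd) * sample_rate))
    far_gain = 1.0 - 0.5 * abs(np.sin(az))  # crude level difference (assumed)
    near = np.zeros(length)
    near[0] = 1.0
    far = np.zeros(length)
    far[delay] = far_gain
    # Positive azimuth means the source is to the right,
    # so the right ear is the near ear.
    return (far, near) if azimuth_deg >= 0 else (near, far)

def pseudo_binaural(mono, azimuth_deg, sample_rate=16000):
    """Convolve a mono signal with the per-ear impulse responses."""
    left_ir, right_ir = toy_hrir(azimuth_deg, sample_rate)
    left = np.convolve(mono, left_ir)
    right = np.convolve(mono, right_ir)
    return np.stack([left, right])  # shape: (2, len(mono) + length - 1)

# Usage: place a short click to the listener's right.
click = np.zeros(100)
click[0] = 1.0
stereo = pseudo_binaural(click, azimuth_deg=90)
```

In a training setup like the one the abstract describes, the output of such a rendering step would be paired with a visual frame in which the source has been placed at the matching position; a real system would substitute measured HRIRs for the toy filter.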

Permalink: https://paper.nweon.com/10308
