空 挡 广 告 位 | 空 挡 广 告 位

Neural Synthesis of Binaural Speech from Mono Audio

Note: We don't have the ability to review paper

PubDate: May 4, 2021

Teams: Facebook Reality Labs

Writers: Alexander Richard, Dejan Markovic, Israel D. Gebru, Steven Krenn, Gladstone Butler, Fernando De la Torre, Yaser Sheikh

PDF: Neural Synthesis of Binaural Speech from Mono Audio

Abstract

We present a neural rendering approach for binaural sound synthesis that can produce realistic and spatially accurate binaural sound in realtime. The network takes, as input, a single-channel audio source and synthesizes, as output, two-channel binaural sound, conditioned on the relative position and orientation of the listener with respect to the source. We investigate deficiencies of the ℓ2-loss on raw wave-forms in a theoretical analysis and introduce an improved loss that overcomes these limitations. In an empirical evaluation, we establish that our approach is the first to generate spatially accurate waveform outputs (as measured by real recordings) and outperforms existing approaches by a considerable margin, both quantitatively and in a perceptual study. Dataset and code are available online: https://github.com/facebookresearch/BinauralSpeechSynthesis.

您可能还喜欢...

Paper