Learning Audio-Visual Dereverberation

Note: We don't have the ability to review papers.

PubDate: June 2022

Teams: UT Austin, Facebook AI Research

Writers: Changan Chen, Wei Sun, David Harwath, Kristen Grauman

PDF: Learning Audio-Visual Dereverberation

Project: Learning Audio-Visual Dereverberation

Abstract

Reverberation from audio reflecting off surfaces and objects in the environment not only degrades the quality of speech for human perception, but also severely impacts the accuracy of automatic speech recognition. Prior work attempts to remove reverberation based on the audio modality only. Our idea is to learn to dereverberate speech from audio-visual observations. The visual environment surrounding a human speaker reveals important cues about the room geometry, materials, and speaker location, all of which influence the precise reverberation effects in the audio stream. We introduce Visually-Informed Dereverberation of Audio (VIDA), an end-to-end approach that learns to remove reverberation based on both the observed sounds and visual scene. In support of this new task, we develop a large-scale dataset that uses realistic acoustic renderings of speech in real-world 3D scans of homes offering a variety of room acoustics. Demonstrating our approach on both simulated and real imagery for speech enhancement, speech recognition, and speaker identification, we show it achieves state-of-the-art performance and substantially improves over traditional audio-only methods.
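The abstract describes VIDA only at a high level: an end-to-end model that takes reverberant speech together with an image of the surrounding environment and outputs dereverberated speech. The sketch below illustrates that general idea in PyTorch; the module names, layer choices, feature sizes, and fusion-by-tiling scheme are illustrative assumptions, not the authors' actual architecture.

```python
# Minimal sketch (not the authors' VIDA implementation): an audio-visual
# dereverberation network that fuses a reverberant speech spectrogram with an
# RGB view of the room and predicts a clean magnitude spectrogram.
# All names and dimensions below are assumptions for illustration only.
import torch
import torch.nn as nn


class AudioVisualDereverb(nn.Module):
    def __init__(self, vis_dim=128):
        super().__init__()
        # Audio encoder: 2D convolutions over the (1, freq, time) spectrogram.
        self.audio_enc = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
        )
        # Visual encoder: convolutions over the RGB frame, pooled to a single
        # vector summarizing room geometry / materials / speaker location cues.
        self.visual_enc = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=7, stride=4, padding=3), nn.ReLU(),
            nn.Conv2d(32, vis_dim, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        # Decoder: upsample the fused features back to a clean-spectrogram estimate.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64 + vis_dim, 32, kernel_size=4, stride=2, padding=1),
            nn.ReLU(),
            nn.ConvTranspose2d(32, 1, kernel_size=4, stride=2, padding=1),
        )

    def forward(self, reverb_spec, rgb_frame):
        # reverb_spec: (B, 1, F, T) magnitude spectrogram of reverberant speech
        # rgb_frame:   (B, 3, H, W) view of the environment
        a = self.audio_enc(reverb_spec)                # (B, 64, F/4, T/4)
        v = self.visual_enc(rgb_frame)                 # (B, vis_dim, 1, 1)
        v = v.expand(-1, -1, a.shape[2], a.shape[3])   # tile over time-frequency
        fused = torch.cat([a, v], dim=1)
        return self.decoder(fused)                     # (B, 1, F, T) clean estimate


if __name__ == "__main__":
    model = AudioVisualDereverb()
    spec = torch.rand(2, 1, 256, 128)   # dummy reverberant spectrograms
    img = torch.rand(2, 3, 224, 224)    # dummy room images
    clean = model(spec, img)
    print(clean.shape)                  # torch.Size([2, 1, 256, 128])
```

In this kind of setup the predicted clean spectrogram (or a mask applied to the input) would be trained against ground-truth anechoic speech and then converted back to a waveform, which is consistent with the abstract's downstream evaluation on speech enhancement, recognition, and speaker identification.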
