Machine Learning-Based Room Classification for Selecting Binaural Room Impulse Responses in Augmented Reality Applications

Note: We are not able to review papers.

PubDate: November 2021

Teams: TH Köln

Writers: Damian Dziwis; Simon Zimmermann; Tim Lübeck; Johannes M. Arend; David Bau; Christoph Pörschmann

Abstract

A key attribute of augmented reality (AR) applications is matching the reverberation of virtual sounds to the room acoustics of the real environment. However, especially in real-time scenarios where the properties of rapidly changing surroundings are unknown, creating a persistently coherent sound field synthesis within a real space is a challenging problem. While AR devices and their sensors can usually provide depth information within the user's field of view, retrieving a complete geometric model requires significant time and user activity. Prior acoustic measurements or scans of the deployment area also severely limit many use cases, especially in the consumer sector. In this paper, we propose an automatic system that provides a fast selection of room categories and their corresponding binaural reverberation using only monoscopic images as input. The proposed system combines existing approaches to machine learning (ML)-based room classification and parametric synthesis of binaural room impulse responses (BRIRs) to provide room reverberation for arbitrary indoor environments. As a proof of concept, we present a demonstrator developed in Cycling '74's Max linked to a Python-based ML model. For the ML model, we use the GoogLeNet convolutional neural network (CNN) architecture trained on a subset of the Places365 dataset. This subset contains 20 custom indoor room categories, each composed of original categories that share similar acoustic properties. The demonstrator captures images and automatically selects binaural reverberation based on the predictions of the ML classifier. Monophonic stimuli are reverberated and presented using dynamic headphone-based binauralization.
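
The selection step described in the abstract (original scene labels grouped into custom room categories, then a reverberation chosen from the classifier's prediction) might be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the grouping in MERGED_CATEGORIES, the preset names, the RT60 values in BRIR_PRESETS, and the confidence threshold are all hypothetical placeholders, and only three of the 20 categories are shown.

```python
# Hypothetical grouping of original scene labels into merged room categories
# that are assumed to share similar acoustics (not the paper's actual grouping).
MERGED_CATEGORIES = {
    "office": "small_office",
    "home_office": "small_office",
    "lecture_room": "lecture_hall",
    "auditorium": "lecture_hall",
    "church/indoor": "church",
}

# Hypothetical reverberation presets per merged category.
# The RT60 values are illustrative placeholders, not measured data.
BRIR_PRESETS = {
    "small_office": {"rt60_s": 0.4},
    "lecture_hall": {"rt60_s": 1.0},
    "church": {"rt60_s": 3.0},
}


def merge_label(original_label):
    """Map an original scene label to its merged room category."""
    return MERGED_CATEGORIES[original_label]


def select_category(probabilities, current=None, min_confidence=0.5):
    """Pick the most probable merged category.

    probabilities: dict mapping merged category name -> classifier probability.
    If the top probability is below min_confidence and a category is already
    active, keep the active one (an assumed safeguard against flickering
    between reverbs; the abstract does not describe such a mechanism).
    """
    category, confidence = max(probabilities.items(), key=lambda kv: kv[1])
    if confidence < min_confidence and current is not None:
        return current
    return category


def preset_for(category):
    """Look up the reverberation preset for a merged category."""
    return BRIR_PRESETS[category]
```

For example, a prediction of `{"small_office": 0.7, "lecture_hall": 0.2, "church": 0.1}` would select `small_office`, while a low-confidence prediction would leave the currently active category unchanged. In a real system the preset would parameterize the BRIR synthesis rather than hold a single number.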