Singer Identification for Metaverse with Timbral and Middle-Level Perceptual Features

Editor: Nweon (映维) | Category: XR | May 25, 2022

Note: We are not able to review papers.

PubDate:

Teams: Ping An Technology

Writers: Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao

PDF: Singer Identification for Metaverse with Timbral and Middle-Level Perceptual Features

Abstract

The metaverse is an interactive world that combines reality and virtuality, where participants can appear as virtual avatars. Anyone can hold a concert in a virtual concert hall, and singer identification lets users quickly identify the real singer behind a virtual idol. Most singer identification methods operate on frame-level features. However, besides the singer’s timbre, a music frame also carries musical information such as melodiousness, rhythm, and tonality, and this information acts as noise when frame-level features alone are used to identify singers. In this paper, instead of relying only on frame-level features, we propose two additional features to address this problem. The middle-level feature represents the music’s melodiousness, rhythmic stability, and tonal stability, and captures perceptual properties of the music. The timbre feature, which is used in speaker identification, represents the singer’s voice characteristics. Furthermore, we propose a convolutional recurrent neural network (CRNN) that combines the three features for singer identification. The model first fuses the frame-level feature with the timbre feature and then combines the middle-level feature with the fused features. In experiments, the proposed method achieves an average F1 score of 0.81 on the Artist20 benchmark dataset, significantly improving on related work.
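The two-stage fusion order described in the abstract can be illustrated with a toy sketch. This is not the authors' CRNN: the function names, the feature dimensions, the use of concatenation as the fusion operator, and the mean pooling that stands in for the learned convolutional and recurrent layers are all assumptions made for illustration. Only the ordering (frame-level plus timbre first, middle-level afterwards) comes from the abstract.

```python
from statistics import fmean

def fuse_early(frame_feats, timbre_vec):
    """Stage 1 (assumed operator): append the timbre vector to every frame-level vector."""
    return [list(frame) + list(timbre_vec) for frame in frame_feats]

def pool_over_time(frames):
    """Average each dimension across frames.

    Hypothetical stand-in for the learned CRNN layers, which the abstract
    does not specify in detail.
    """
    return [fmean(column) for column in zip(*frames)]

def fuse_late(pooled_mix, mid_level_vec):
    """Stage 2 (assumed operator): append the middle-level vector to the pooled mix."""
    return list(pooled_mix) + list(mid_level_vec)

# Toy inputs with made-up values: 3 frames of 2-dim frame-level features,
# a 2-dim timbre vector, and a 3-dim middle-level vector
# (melodiousness, rhythmic stability, tonal stability).
frame_feats = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
timbre_vec = [0.5, 0.25]
mid_level_vec = [0.9, 0.7, 0.8]

mix = fuse_early(frame_feats, timbre_vec)                 # 3 frames x 4 dims
clip_vec = fuse_late(pool_over_time(mix), mid_level_vec)  # 7-dim clip vector
print(clip_vec)
```

In a real model the resulting clip-level vector would feed a classifier over singer identities; that step is omitted here because the abstract gives no details about it.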

Article link: https://paper.nweon.com/12288