Robust Speech Emotion Recognition System Through Novel ER-CNN and Spectral Features

Note: We don't have the ability to review papers

PubDate: January 2022

Teams: University of Engineering and Technology

Writers: Muhammad Zeeshan; Huma Qayoom; Farman Hassan

PDF: Robust Speech Emotion Recognition System Through Novel ER-CNN and Spectral Features

Abstract

Speech is the most fundamental mode of communication among humans and an important channel for human-computer interaction (HCI) via the microphone. Recognizing emotion from the speech signal captured by a microphone is an emerging and interesting area of research in HCI, with applications such as human-robot interaction, healthcare, virtual reality, emergency calls, and behavior assessment. In this paper, we propose a novel integration of spectral features comprising mel-frequency cepstral coefficients (MFCC), root mean square energy (RMSE), and zero crossing rate (ZCR) to represent the complex audio signal. For classification, we designed a novel convolutional neural network, called the emotion recognition neural network (ER-CNN), to classify emotions such as angry, disgust, fear, happy, neutral, and sad. The proposed speech emotion recognition method (SER-CNN) obtained an equal error rate (EER) of 1.34%, an accuracy of 94.99%, a precision of 94.96%, a recall of 94.98%, and an F1-score of 94.96%. We evaluated the performance of the proposed SER-CNN system on the standard crowd-sourced emotional multimodal actors dataset (CREMA-D). Experimental results and a comparative analysis against existing methods show that our method has superior performance and can be used reliably for emotion detection.
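As a rough illustration of two of the spectral features named in the abstract, the sketch below computes frame-wise RMSE and ZCR with NumPy. The frame length, hop size, and function names are illustrative assumptions, not the paper's actual configuration; MFCC extraction, which typically relies on a dedicated audio library such as librosa, is omitted for brevity.

```python
import numpy as np

def frame_signal(x, frame_len=2048, hop=512):
    # Split a 1-D signal into overlapping frames of length frame_len.
    n_frames = 1 + max(0, (len(x) - frame_len) // hop)
    return np.stack([x[i * hop : i * hop + frame_len] for i in range(n_frames)])

def rmse(x, frame_len=2048, hop=512):
    # Root mean square energy of each frame.
    frames = frame_signal(x, frame_len, hop)
    return np.sqrt(np.mean(frames ** 2, axis=1))

def zcr(x, frame_len=2048, hop=512):
    # Fraction of adjacent-sample sign changes in each frame.
    frames = frame_signal(x, frame_len, hop)
    signs = np.sign(frames)
    signs[signs == 0] = 1  # treat exact zeros as positive
    return np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

# Toy input: one second of a 440 Hz tone at a 22050 Hz sample rate.
sr = 22050
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)
print(rmse(tone).shape, zcr(tone).shape)  # → (40,) (40,)
```

Per-frame features like these are commonly stacked with MFCCs into a single feature matrix before being fed to a CNN classifier; how the paper fuses them exactly is not described in this summary.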
