Speech Emotion Recognition Using Attention Model

Singh, Jagjeet; Saheer, Lakshmi B.; Faust, Oliver

journal contribution

posted on 2023-07-26, 16:10 authored by Jagjeet Singh, Lakshmi B. Saheer, Oliver Faust

Speech emotion recognition is an important research topic that can help to maintain and improve public health and contribute to the ongoing progress of healthcare technology. The field has advanced in several ways, including the use of deep learning models and new acoustic and temporal features. This paper proposes a self-attention-based deep learning model that combines a two-dimensional Convolutional Neural Network (CNN) with a long short-term memory (LSTM) network. The research builds on the existing literature to identify the best-performing features through extensive experiments on different combinations of spectral and rhythmic information. Mel Frequency Cepstral Coefficients (MFCCs) emerged as the best-performing features for this task. The experiments were performed on a customised dataset that combines the RAVDESS, SAVEE, and TESS datasets. Eight emotional states (happy, sad, angry, surprise, disgust, calm, fearful, and neutral) were detected. The proposed attention-based deep learning model achieved an average test accuracy of 90%, which is a substantial improvement over established models. Hence, this emotion detection model has the potential to improve automated mental health monitoring.
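The abstract names self-attention as the mechanism layered on top of the CNN and LSTM stages but does not spell out how it operates. The following is a minimal sketch of scaled dot-product self-attention over a sequence of per-frame feature vectors, written in plain Python for illustration only. It is not the authors' implementation: the function names (`softmax`, `self_attention`, `mean_pool`) are hypothetical, and identity projections stand in for the learned query, key, and value weights that a trained model would use.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(frames):
    """Scaled dot-product self-attention with identity projections.

    frames: list of T feature vectors of equal length d, for example
            per-frame outputs of a recurrent layer (an assumption here).
    Returns (contexts, weights): T context vectors of length d and the
    T x T matrix of attention weights.
    """
    d = len(frames[0])
    scale = math.sqrt(d)
    contexts, weights = [], []
    for query in frames:
        # Similarity of this frame to every frame in the sequence.
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in frames]
        row = softmax(scores)
        weights.append(row)
        # Context vector: attention-weighted sum of all frames.
        contexts.append([sum(w * frame[j] for w, frame in zip(row, frames))
                         for j in range(d)])
    return contexts, weights


def mean_pool(vectors):
    """Average a list of equal-length vectors into one utterance-level vector."""
    count = len(vectors)
    return [sum(v[j] for v in vectors) / count for j in range(len(vectors[0]))]


if __name__ == "__main__":
    # Three toy two-dimensional frames standing in for real acoustic features.
    toy_frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    contexts, weights = self_attention(toy_frames)
    print(mean_pool(contexts))
```

In a full pipeline of the kind the abstract describes, the pooled vector would typically feed a classification layer that outputs one score per emotion class; that step is omitted here because the abstract gives no details about it.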

History

Refereed

Yes

Volume

20

Issue number

6

Publication title

International Journal of Environmental Research and Public Health

ISSN

1660-4601

External DOI

https://doi.org/10.3390/ijerph20065140

File version

Published version

Language

eng

Official URL

https://doi.org/10.3390/ijerph20065140

Legacy posted date

2023-03-17

Legacy creation date

2023-03-17

Legacy Faculty/School/Department

Faculty of Science & Engineering

Keywords

speech emotion recognition; self-attention models; convolutional neural networks; long short-term memory; RAVDESS; SAVEE; TESS

Licence

CC BY 4.0
