Automated Detection of Human Emotions from Speech Using a Multi-Layer ANN Framework
Abstract
Speech Emotion Recognition (SER) is an active research area aimed at enabling machines to detect and interpret human emotions, thereby improving human-computer interaction. This study presents an automated, hybrid framework that combines machine learning with Natural Language Processing (NLP) to detect human emotions from speech signals using a multi-layer Artificial Neural Network (ANN). The proposed system recognizes emotional states such as happiness, sadness, anger, fear, and neutrality by analyzing vocal features including pitch, tone, energy, and spectral characteristics. A feature extraction module computes Mel-Frequency Cepstral Coefficients (MFCCs) and other prosodic features, which feed a multi-layer ANN trained on a benchmark emotional speech dataset. The model achieves higher accuracy and greater robustness in real-time emotion recognition tasks than traditional machine learning techniques. This work contributes to emotionally intelligent human-computer interaction, with applications in customer service, healthcare, virtual assistants, and e-learning environments.
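To make the described pipeline concrete, the sketch below pairs MFCC extraction with a small multi-layer ANN classifier. It is a minimal illustration, assuming librosa for feature extraction and Keras for the network; the layer sizes, file handling, and label set are assumptions for demonstration, not details taken from the paper.

```python
# Minimal sketch of the described pipeline: MFCC features feeding a
# multi-layer ANN. Assumes librosa and TensorFlow/Keras; file paths,
# label encoding, and layer sizes are illustrative, not from the paper.
import numpy as np
import librosa
import tensorflow as tf

EMOTIONS = ["happy", "sad", "angry", "fearful", "neutral"]

def extract_features(path, n_mfcc=13):
    """Load a speech clip and summarize its MFCCs as a fixed-size vector."""
    y, sr = librosa.load(path, sr=None)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    # Mean and standard deviation over time frames -> 2 * n_mfcc features.
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

def build_ann(input_dim, n_classes=len(EMOTIONS)):
    """Multi-layer feed-forward ANN with two hidden layers."""
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(input_dim,)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dropout(0.3),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Hypothetical usage: `paths` and `labels` would come from a benchmark
# emotional speech corpus like the one the paper trains on.
# X = np.stack([extract_features(p) for p in paths])
# y = np.array([EMOTIONS.index(lbl) for lbl in labels])
# model = build_ann(X.shape[1])
# model.fit(X, y, epochs=50, batch_size=32, validation_split=0.2)
```

Summarizing each MFCC trajectory by its mean and standard deviation is one common way to obtain a fixed-length input for a feed-forward ANN; the paper's exact feature set and network depth may differ.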