Development of a Fast and High-Precision Audio Noise Reduction System to Enhance the Accuracy of Emotion Estimation in Practical Applications
Abstract
Speech-based emotion estimation has diverse applications, including mental health monitoring, human–computer interaction, and communication enhancement. Accurate estimation of emotions from speech is crucial for detecting psychological stress, a growing concern in today’s high-stress societies. However, environmental noise significantly degrades estimation accuracy, and studies on noise reduction optimized specifically for emotion estimation remain scarce. This study evaluated the impact of noise reduction on emotion estimation by comparing traditional signal processing methods (spectral subtraction, Wiener filtering) with deep learning-based autoencoder (AE) methods (U-Net autoencoder, convolutional autoencoder). The effectiveness of each method was examined under two noise conditions: continuous vehicle-driving noise and transient clapping noise. The results indicate that the traditional techniques effectively suppress continuous noise but struggle with transient noise, whereas the AE-based methods, particularly the U-Net autoencoder, significantly enhance estimation accuracy in complex noise environments. These findings underscore the importance of emotion-aware noise reduction and suggest that deep learning-based denoising can improve emotion estimation in real-world applications. Future research will focus on further optimizing the AE architectures and integrating them into real-time systems.
Keywords: Speech Emotion Estimation, Noise Reduction, U-Net, Spectral Subtraction
DOI: 10.54941/ahfe1006058
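For readers unfamiliar with spectral subtraction, one of the traditional baselines named in the abstract, the following is a minimal textbook-style sketch in Python with NumPy. It is not the authors' implementation. The frame length, hop size, spectral floor, the assumption that the first few frames contain only noise, and the synthetic tone-plus-white-noise test signal are all illustrative choices that do not come from the paper.

```python
import numpy as np


def spectral_subtraction(noisy, noise_frames=10, frame_len=256, hop=128, floor=0.02):
    """Textbook magnitude spectral subtraction (illustrative sketch only).

    Assumption: the first `noise_frames` frames contain noise and no speech.
    Samples past the last full frame are left at zero.
    """
    n = np.arange(frame_len)
    # Square-root periodic Hann window: applied at analysis and synthesis,
    # its square sums to 1 at 50% overlap.
    window = np.sqrt(0.5 - 0.5 * np.cos(2 * np.pi * n / frame_len))
    out = np.zeros(len(noisy))
    if len(noisy) < frame_len:
        return out
    n_frames = 1 + (len(noisy) - frame_len) // hop
    frames = np.stack(
        [noisy[i * hop:i * hop + frame_len] * window for i in range(n_frames)]
    )
    spec = np.fft.rfft(frames, axis=1)
    mag, phase = np.abs(spec), np.angle(spec)
    # Average magnitude of the leading frames serves as the noise estimate.
    noise_mag = mag[:noise_frames].mean(axis=0)
    # Subtract the estimate, keeping a small spectral floor instead of
    # letting magnitudes go negative.
    clean_mag = np.maximum(mag - noise_mag, floor * mag)
    # Resynthesize with the noisy phase and overlap-add.
    clean = np.fft.irfft(clean_mag * np.exp(1j * phase), n=frame_len, axis=1) * window
    for i in range(n_frames):
        out[i * hop:i * hop + frame_len] += clean[i]
    return out


def snr_db(reference, estimate):
    """Signal-to-noise ratio of `estimate` relative to `reference`, in dB."""
    err = estimate - reference
    return 10 * np.log10(np.sum(reference ** 2) / np.sum(err ** 2))


def demo(seed=0):
    """Synthetic check: a 440 Hz tone in stationary white noise (hypothetical data)."""
    rng = np.random.default_rng(seed)
    fs, n_samples, onset = 8000, 8000, 2000
    t = np.arange(n_samples) / fs
    clean = np.sin(2 * np.pi * 440 * t)
    clean[:onset] = 0.0  # leading noise-only segment
    noisy = clean + 0.1 * rng.standard_normal(n_samples)
    denoised = spectral_subtraction(noisy)
    region = slice(onset, 7600)  # interior region, away from frame edges
    return snr_db(clean[region], noisy[region]), snr_db(clean[region], denoised[region])


if __name__ == "__main__":
    before, after = demo()
    print(f"SNR before: {before:.1f} dB, after: {after:.1f} dB")
```

Because the noise estimate is a fixed average taken from the leading frames, a sketch like this can only track stationary noise. That is consistent with the abstract's observation that the traditional techniques handle continuous noise but struggle with transient noise such as clapping.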


AHFE Open Access