Real-Time Speech Enhancement Using Deep Generative Models

Authors

  • Ryo Suzuki, Meiji University, Japan
  • Aya Tanaka, Meiji University, Japan

Abstract

Speech enhancement is crucial in applications such as telecommunications, hearing aids, and automatic speech recognition, where background noise can significantly degrade performance. Traditional methods for speech enhancement often struggle to handle varying noise types and rapidly changing environments. This paper proposes a real-time speech enhancement framework using deep generative models, specifically employing Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs). Our approach leverages the ability of generative models to learn complex data distributions, effectively separating clean speech from noise. Experimental results demonstrate that the proposed method outperforms traditional approaches in both objective and subjective evaluations, offering superior noise reduction while preserving speech quality. The model's low latency makes it suitable for real-time applications, achieving significant improvements in speech intelligibility and quality in various noisy environments.
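The abstract does not disclose the network architecture, so the following is only an illustrative sketch of the kind of GAN-style enhancement setup it describes: a convolutional generator that maps a noisy waveform segment to an enhanced estimate, and a discriminator that scores (noisy, candidate) pairs. All class names, layer sizes, and hyperparameters below are assumptions made for illustration, not the authors' model.

# Minimal sketch of a GAN-style speech enhancer in PyTorch.
# Architecture details are assumptions; they do not reproduce the paper's model.
import torch
import torch.nn as nn

class EnhancementGenerator(nn.Module):
    """Maps a noisy waveform segment to an enhanced (denoised) segment."""
    def __init__(self, channels: int = 32):
        super().__init__()
        # Strided 1-D convolutions compress the waveform into a latent code...
        self.encoder = nn.Sequential(
            nn.Conv1d(1, channels, kernel_size=32, stride=2, padding=15),
            nn.PReLU(),
            nn.Conv1d(channels, channels * 2, kernel_size=32, stride=2, padding=15),
            nn.PReLU(),
        )
        # ...and transposed convolutions reconstruct a clean-speech estimate.
        self.decoder = nn.Sequential(
            nn.ConvTranspose1d(channels * 2, channels, kernel_size=32, stride=2, padding=15),
            nn.PReLU(),
            nn.ConvTranspose1d(channels, 1, kernel_size=32, stride=2, padding=15),
            nn.Tanh(),
        )

    def forward(self, noisy: torch.Tensor) -> torch.Tensor:
        # noisy: (batch, 1, samples); output has the same shape.
        return self.decoder(self.encoder(noisy))


class Discriminator(nn.Module):
    """Scores (noisy, enhanced-or-clean) pairs; the generator learns to fool it."""
    def __init__(self, channels: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(2, channels, kernel_size=32, stride=4, padding=14),
            nn.LeakyReLU(0.2),
            nn.Conv1d(channels, channels * 2, kernel_size=32, stride=4, padding=14),
            nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool1d(1),
            nn.Flatten(),
            nn.Linear(channels * 2, 1),
        )

    def forward(self, noisy: torch.Tensor, candidate: torch.Tensor) -> torch.Tensor:
        # Concatenate the noisy input and the candidate along the channel axis.
        return self.net(torch.cat([noisy, candidate], dim=1))


if __name__ == "__main__":
    # One enhancement pass over a 1-second segment at 16 kHz.
    generator = EnhancementGenerator()
    noisy = torch.randn(1, 1, 16000)
    enhanced = generator(noisy)
    print(enhanced.shape)  # torch.Size([1, 1, 16000])

In such a setup the generator would be trained with an adversarial loss from the discriminator plus a reconstruction term against the clean reference, and real-time use would amount to running the generator alone on short successive segments; how the paper combines this with a VAE component is not stated in the abstract.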

Published

2023-12-20

How to Cite

Suzuki, R., & Tanaka, A. (2023). Real-Time Speech Enhancement Using Deep Generative Models. MZ Computing Journal, 4(2). Retrieved from http://mzresearch.com/index.php/MZCJ/article/view/311