Real-Time Speech Enhancement Using Deep Generative Models

Authors

  • Ryo Suzuki, Meiji University, Japan
  • Aya Tanaka, Meiji University, Japan

Abstract

Speech enhancement is crucial in applications such as telecommunications, hearing aids, and automatic speech recognition, where background noise can significantly degrade performance. Traditional methods for speech enhancement often struggle to handle varying noise types and rapidly changing environments. This paper proposes a real-time speech enhancement framework using deep generative models, specifically employing Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs). Our approach leverages the ability of generative models to learn complex data distributions, effectively separating clean speech from noise. Experimental results demonstrate that the proposed method outperforms traditional approaches in both objective and subjective evaluations, offering superior noise reduction while preserving speech quality. The model's low latency makes it suitable for real-time applications, achieving significant improvements in speech intelligibility and quality in various noisy environments.
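The abstract does not disclose the network architecture, so the following is only an illustrative sketch of the kind of GAN-style enhancement setup it describes: a convolutional generator that maps a noisy waveform segment to an enhanced estimate, and a discriminator that scores (noisy, candidate) pairs. All class names, layer sizes, and hyperparameters below are assumptions made for illustration, not the authors' model.

# Minimal sketch of a GAN-style speech enhancer in PyTorch.
# Architecture details are assumptions; they do not reproduce the paper's model.
import torch
import torch.nn as nn

class EnhancementGenerator(nn.Module):
    """Maps a noisy waveform segment to an enhanced (denoised) segment."""
    def __init__(self, channels: int = 32):
        super().__init__()
        # Strided 1-D convolutions compress the waveform into a latent code...
        self.encoder = nn.Sequential(
            nn.Conv1d(1, channels, kernel_size=32, stride=2, padding=15),
            nn.PReLU(),
            nn.Conv1d(channels, channels * 2, kernel_size=32, stride=2, padding=15),
            nn.PReLU(),
        )
        # ...and transposed convolutions reconstruct a clean-speech estimate.
        self.decoder = nn.Sequential(
            nn.ConvTranspose1d(channels * 2, channels, kernel_size=32, stride=2, padding=15),
            nn.PReLU(),
            nn.ConvTranspose1d(channels, 1, kernel_size=32, stride=2, padding=15),
            nn.Tanh(),
        )

    def forward(self, noisy: torch.Tensor) -> torch.Tensor:
        # noisy: (batch, 1, samples); output has the same shape.
        return self.decoder(self.encoder(noisy))


class Discriminator(nn.Module):
    """Scores (noisy, enhanced-or-clean) pairs; the generator learns to fool it."""
    def __init__(self, channels: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(2, channels, kernel_size=32, stride=4, padding=14),
            nn.LeakyReLU(0.2),
            nn.Conv1d(channels, channels * 2, kernel_size=32, stride=4, padding=14),
            nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool1d(1),
            nn.Flatten(),
            nn.Linear(channels * 2, 1),
        )

    def forward(self, noisy: torch.Tensor, candidate: torch.Tensor) -> torch.Tensor:
        # Concatenate the noisy input and the candidate along the channel axis.
        return self.net(torch.cat([noisy, candidate], dim=1))


if __name__ == "__main__":
    # One enhancement pass over a 1-second segment at 16 kHz.
    generator = EnhancementGenerator()
    noisy = torch.randn(1, 1, 16000)
    enhanced = generator(noisy)
    print(enhanced.shape)  # torch.Size([1, 1, 16000])

In such a setup the generator would be trained with an adversarial loss from the discriminator plus a reconstruction term against the clean reference, and real-time use would amount to running the generator alone on short successive segments; how the paper combines this with a VAE component is not stated in the abstract.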

Published

2023-12-20

How to Cite

Suzuki, R., & Tanaka, A. (2023). Real-Time Speech Enhancement Using Deep Generative Models. MZ Computing Journal, 4(2). Retrieved from http://mzresearch.com/index.php/MZCJ/article/view/311