A Multi-Scale Feature Recalibration Network for End-to-End Single Channel Speech Enhancement

Xian Y.; Sun Y.; Wang W.; Naqvi SM.

© 2007-2012 IEEE. Deep neural network based methods dominate recent developments in single channel speech enhancement. In this paper, we propose a multi-scale feature recalibration convolutional encoder-decoder with a bidirectional gated recurrent unit (BGRU) architecture for end-to-end speech enhancement. More specifically, multi-scale recalibration 2-D convolutional layers are used to extract local and contextual features from the signal. In addition, a gating mechanism is used in the recalibration network to control the information flow among the layers, which enables the scaled features to be weighted in order to retain speech and suppress noise. A fully connected (FC) layer with a small number of neurons is then employed to compress the output of the multi-scale 2-D convolutional layer, thus capturing global information and improving parameter efficiency. The BGRU layers employ forward and backward GRUs, which contain the reset, update, and output gates, to exploit the interdependency among past, current, and future frames and improve predictions. The experimental results confirm that the proposed MCGN method outperforms several state-of-the-art methods.
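
The gating idea in the abstract, where features extracted at several scales are weighted so that speech is retained and noise suppressed, can be illustrated with a small sketch. This is not the authors' implementation: it uses 1-D convolutions instead of the paper's 2-D layers for brevity, and the function names (`conv1d_same`, `gated_multiscale`), the sigmoid gate, and the summation across scales are all assumptions made for illustration.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def conv1d_same(signal, kernel):
    """Zero-padded 1-D correlation that keeps the input length.

    Assumes an odd kernel length so the window is centred on each sample.
    """
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(signal):
                acc += w * signal[j]
        out.append(acc)
    return out


def gated_multiscale(signal, feature_kernels, gate_kernels):
    """Hypothetical gated multi-scale block.

    Each scale has a feature kernel and a gate kernel (different kernel
    lengths give different receptive fields). The gate output passes
    through a sigmoid and multiplies the features element-wise, so values
    near 0 suppress a feature and values near 1 keep it. The gated
    features from all scales are summed.
    """
    fused = [0.0] * len(signal)
    for feat_k, gate_k in zip(feature_kernels, gate_kernels):
        feats = conv1d_same(signal, feat_k)
        gates = [sigmoid(g) for g in conv1d_same(signal, gate_k)]
        for i in range(len(signal)):
            fused[i] += gates[i] * feats[i]
    return fused
```

In a trained network the kernel weights would be learned; here they are passed in by hand, for example a length-1 and a length-3 kernel to represent two scales.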

Original publication

DOI

10.1109/JSTSP.2020.3045846

Type

Journal article

Journal

IEEE Journal on Selected Topics in Signal Processing

Publication Date

01/01/2021

Volume

Pages

143 - 155
