Deep neural networks (DNNs) have recently been used to solve the monaural source separation problem. Depending on the training objective, DNN-based monaural speech separation falls into three categories: masking, mapping, and signal approximation based techniques. However, the performance of the traditional methods is not robust to variations in real-world environments. Moreover, vanilla DNN-based methods cannot fully utilize temporal information. Therefore, in this paper, a long short-term memory (LSTM) neural network is applied to exploit long-term speech contexts. We then propose complex signal approximation (cSA), which operates in the complex domain and uses the phase information of the desired speech signal to improve separation performance. Mixtures with noise and speech interference are generated from the IEEE and TIMIT corpora to evaluate the efficacy of the proposed method. The experimental results demonstrate the advantages of the proposed cSA-based LSTM recurrent neural network method across several objective performance measures.
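The abstract does not give the cSA objective itself, so the following is only a minimal sketch of the general idea of signal approximation in the complex domain, under stated assumptions: a complex-valued mask is multiplied bin by bin with the mixture spectrum, and the loss is the squared error against the clean target spectrum. The function names, the toy spectra, and the exact loss form are hypothetical illustrations, not the paper's formulation.

```python
import cmath

def complex_sa_loss(mask, mixture, target):
    """Mean squared error between (mask * mixture) and target, bin by bin.

    Assumed form of a complex-domain signal approximation loss; all
    arguments are equal-length lists of complex time-frequency bins.
    """
    estimate = [m * y for m, y in zip(mask, mixture)]
    return sum(abs(e - s) ** 2 for e, s in zip(estimate, target)) / len(target)

def magnitude_only_error(mixture, target):
    """Error left when the target magnitude is exact but the mixture phase is reused."""
    estimate = [abs(s) * cmath.exp(1j * cmath.phase(y))
                for s, y in zip(target, mixture)]
    return sum(abs(e - s) ** 2 for e, s in zip(estimate, target)) / len(target)

# Toy spectra (hypothetical values): three time-frequency bins.
target = [1 + 1j, 2 - 1j, -0.5 + 0.5j]
interference = [0.5 - 0.5j, -1 + 2j, 1 + 0j]
mixture = [s + n for s, n in zip(target, interference)]

# A complex mask can, in principle, recover the target exactly.
ideal_mask = [s / y for s, y in zip(target, mixture)]
loss_ideal = complex_sa_loss(ideal_mask, mixture, target)

# A magnitude-only estimate keeps the mixture phase, so an error remains.
loss_magnitude_only = magnitude_only_error(mixture, target)
```

In this toy case the complex mask drives the loss to numerical zero, while the magnitude-only estimate retains a residual caused purely by the phase mismatch, which is the motivation the abstract gives for working in the complex domain.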

Original publication

DOI

10.1109/JSTSP.2019.2908760

Type

Journal article

Journal

IEEE Journal of Selected Topics in Signal Processing

Publication Date

01/05/2019

Volume

13

Pages

359 - 369