Speech Enhancement by CycleGAN Using Feature Map Regularization
Abstract
Highly promising speech enhancement results have recently been obtained using an unsupervised CycleGAN approach, comparable to those of neural network approaches trained on paired datasets. However, very often only a limited amount of noisy speech data is available. To address this, a semi-supervised CycleGAN approach relying on augmented data samples has been proposed. Separately, a feature map regularized CycleGAN approach has been proposed and applied to an image style translation task, yielding significant improvements on several standard databases. In this paper, the feature map regularized CycleGAN approach is combined with the aforementioned semi-supervised learning approach and applied to a speech enhancement task. The proposed algorithm obtains significant improvements in terms of several standard measures in comparison with the baseline algorithm as well as the augmented CycleGAN approach.
I (we), the author(s), hereby declare under full moral, financial and criminal liability that the manuscript submitted for publication to the Journal of Computer and Forensic Sciences:
a) is the result of my (our) own original research and that I (we) hold the right to publish it;
b) does not infringe any copyright or other third-party proprietary rights;
c) complies with the Journal’s research and publishing ethics standards;
d) has not been published elsewhere, under this or any other title;
e) is not under consideration by another publication, under this or any other title.
I (we) also declare under full moral, financial and criminal liability:
f) that all conflicts of interest that may directly or potentially influence or impart bias on the work have been disclosed in the manuscript;
g) that if the article is accepted for publication, I (we) will transfer all copyright ownership of the manuscript to the University of Criminal Investigation and Police Studies in Belgrade.
Signed by the Corresponding Author on behalf of all other authors.