This paper proposes a method for single-channel audio separation. The method leverages an exemplar source that emulates the target speech signal. A multicomponent nonnegative matrix factor 2D deconvolution (NMF2D) is proposed to model the temporal and spectral changes and the number of spectral bases of the audio signals. The paper also introduces an artificial auxiliary channel that imitates a pair of stereo mixture signals, termed "artificial-stereophonic mixtures." The artificial-stereophonic mixtures and the exemplar source are used jointly to guide the NMF2D factorization. The factorization is adapted under a hybrid framework that combines the generalized expectation–maximization algorithm with multiplicative update adaptation. The proposed algorithm converges quickly and stably and ensures that the nonnegativity constraints on the solution are satisfied. Adaptive sparsity is also introduced on each sparse parameter of the multicomponent NMF2D model to handle cases where the exemplar deviates from the target signal. Experimental results show that the proposed algorithm performs competitively against other algorithms.
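To illustrate why multiplicative updates keep a factorization nonnegative, the following is a minimal sketch of plain NMF with the standard Euclidean-cost multiplicative update rules. It is not the multicomponent NMF2D algorithm of the paper: it has no 2D deconvolution, no exemplar guidance, no artificial-stereophonic channel, no expectation–maximization step, and no adaptive sparsity. The function name `nmf_multiplicative` and all of its parameters are assumptions introduced here for illustration only.

```python
import numpy as np


def nmf_multiplicative(V, rank, n_iter=200, eps=1e-12, seed=0):
    """Factor a nonnegative matrix V (e.g. a magnitude spectrogram) as W @ H.

    Illustrative plain NMF only; hypothetical helper, not the paper's method.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape

    # Strictly positive random initialization of both factors.
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_frames)) + eps

    errors = []
    for _ in range(n_iter):
        # Each factor is scaled by a ratio of nonnegative terms, so no
        # entry can change sign: nonnegativity holds by construction.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)

        # Track the squared Frobenius reconstruction error per iteration.
        errors.append(0.5 * np.linalg.norm(V - W @ H) ** 2)

    return W, H, errors


if __name__ == "__main__":
    # Synthetic nonnegative data standing in for a spectrogram.
    rng = np.random.default_rng(1)
    V = rng.random((64, 100))
    W, H, errors = nmf_multiplicative(V, rank=8)
    print("all entries nonnegative:", bool((W >= 0).all() and (H >= 0).all()))
    print("error decreased:", errors[-1] < errors[0])
```

Because every update multiplies the current factor by a nonnegative ratio, no projection or clipping step is needed, which is the property the abstract refers to when it says the nonnegativity constraints are satisfied.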
|Number of pages||23|
|Journal||International Journal of Adaptive Control and Signal Processing|
|Early online date||17 Jul 2018|
|Publication status||Published - 13 Sept 2018|