Sound Events Separation and Recognition using L1-Sparse Complex Nonnegative Matrix Factorization and Multi-Class Mean Supervector Support Vector Machine

Phetcharat Parathai, Naruephorn Tengtrairat, Wai Lok Woo

Research output: Contribution to conference › Paper › peer-review


This paper proposes a novel single-channel sound separation and event recognition method. First, the sound separation step is based on complex nonnegative matrix factorization (CMF) with probabilistically optimal L1 sparsity, which decomposes an information-bearing matrix into a two-dimensional convolution of factor matrices that represent the spectral basis and temporal code of the sources. The L1-sparse CMF method can extract the recurrent patterns of magnitude spectra that underlie the observed complex spectra, together with the phase estimates of the constituent signals, thus enabling the features of the components to be extracted more efficiently. Second, the event recognition step is built on the multi-class mean supervector support vector machine (MS-SVM). The separated signal from the first step is segmented with a sliding window function, and features are then extracted from each block. Three main features, the zero-crossing rate, Mel frequency cepstral coefficients, and short-time energy, are investigated for classifying sound event signals into defined classes. The mean supervector is encoded from the obtained features. The recognition performance of the multi-class MS-SVM method is examined by modeling with various features. The experimental results show the robustness and efficiency of the proposed method.
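The recognition front end described above (sliding-window segmentation, per-block features, and a mean encoding) can be illustrated with a minimal sketch. It assumes NumPy only, so it covers the zero-crossing rate and short-time energy but leaves out Mel frequency cepstral coefficients. The function names, frame length, and hop size are hypothetical choices for illustration, and the "mean supervector" is approximated here by the per-feature mean over frames, which may differ from the construction used in the paper.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames (sliding window)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs in each frame whose signs differ."""
    signs = np.sign(frames)
    signs[signs == 0] = 1  # treat exact zeros as positive
    return np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

def short_time_energy(frames):
    """Mean squared amplitude of each frame."""
    return np.mean(frames ** 2, axis=1)

def mean_feature_vector(x, frame_len=256, hop=128):
    """Assumed simplification: average the per-frame features over time."""
    frames = frame_signal(np.asarray(x, dtype=float), frame_len, hop)
    feats = np.column_stack([zero_crossing_rate(frames),
                             short_time_energy(frames)])
    return feats.mean(axis=0)

# Usage: two synthetic tones sampled at 8 kHz (illustrative data only).
sr = 8000
t = np.arange(sr) / sr
v_low = mean_feature_vector(np.sin(2 * np.pi * 100 * t))
v_high = mean_feature_vector(np.sin(2 * np.pi * 1500 * t))
```

The resulting fixed-length vectors could then be passed to any multi-class SVM implementation; the higher-pitched tone yields a larger zero-crossing rate, while both tones have similar short-time energy.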
Original language: English
Number of pages: 6
Publication status: Published - 2017
Event: The 2nd International Conference on Information Technology - Mahidol University, Nakhon Pathom, Thailand
Duration: 2 Nov 2017 to 3 Nov 2017


Conference: The 2nd International Conference on Information Technology
Abbreviated title: INCIT 2017
City: Nakhon Pathom