This paper proposes a novel method for single-channel sound separation and sound event recognition. First, the sound separation step is based on complex nonnegative matrix factorization (CMF) with probabilistically optimal L1 sparsity, which decomposes an information-bearing matrix into a two-dimensional convolution of factor matrices that represent the spectral bases and temporal codes of the sources. The L1-sparsity CMF method can extract the recurrent patterns of magnitude spectra that underlie the observed complex spectra, together with the phase estimates of the constituent signals, thus enabling the features of the components to be extracted more efficiently. Second, the event recognition step is built on a multi-class mean supervector support vector machine (MS-SVM). The separated signal from the first step is segmented with a sliding window function, and features are then extracted from each block. Three main features, namely zero-crossing rate, Mel-frequency cepstral coefficients, and short-time energy, are investigated for classifying sound event signals into predefined classes. The mean supervector is encoded from the obtained features. The recognition performance of the multi-class MS-SVM method is examined by modeling with various features. The experimental results show the robustness and efficiency of the proposed method.
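The block-wise feature extraction described in the recognition step can be illustrated with a minimal sketch. It assumes NumPy only, covers just two of the three named features (zero-crossing rate and short-time energy; Mel-frequency cepstral coefficients are omitted because they would need an audio library), and uses hypothetical function names and window settings (`frame_len=400`, `hop=200`) that are not taken from the paper. It stops at the per-block feature matrix and does not attempt the mean supervector encoding or the SVM itself.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping blocks with a sliding window."""
    n_frames = 1 + (len(x) - frame_len) // hop
    # Row i holds the sample indices of block i.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs in each block whose sign differs."""
    signs = np.signbit(frames)
    return np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

def short_time_energy(frames):
    """Mean squared amplitude of each block."""
    return np.mean(frames ** 2, axis=1)

def block_features(x, frame_len=400, hop=200):
    """Return one row per block: [zero-crossing rate, short-time energy].

    frame_len and hop are illustrative defaults, not values from the paper.
    """
    frames = frame_signal(np.asarray(x, dtype=float), frame_len, hop)
    return np.column_stack([zero_crossing_rate(frames),
                            short_time_energy(frames)])
```

Each row of the returned matrix describes one block of the separated signal; in a pipeline like the one outlined above, such per-block vectors would be the input to whatever encoding produces the supervector passed to the classifier.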
|Number of pages||6|
|Publication status||Published - 2017|
|Event||The 2nd International Conference on Information Technology - Mahidol University, Nakhon Pathom, Thailand|
Duration: 2 Nov 2017 → 3 Nov 2017
|Conference||The 2nd International Conference on Information Technology|
|Abbreviated title||INCIT 2017|
|Period||2/11/17 → 3/11/17|