Study of fusion strategies and exploiting the combination of MFCC and PNCC features for robust biometric speaker identification

M. T.S. Al-Kaltakchi, W. L. Woo, S. S. Dlay, J. A. Chambers

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

28 Citations (Scopus)

Abstract

In this paper, a new combination of features and normalization methods is investigated for robust biometric speaker identification. Mel Frequency Cepstral Coefficients (MFCC) are efficient for speaker identification in clean speech, while Power Normalized Cepstral Coefficients (PNCC) features are robust in noisy environments. Therefore, combining the two features is better than using either one individually. In addition, Cepstral Mean and Variance Normalization (CMVN) and Feature Warping (FW) are used to mitigate possible channel effects and handset mismatch in voice measurements. Speaker modelling is based on a Gaussian Mixture Model (GMM) with a Universal Background Model (UBM). Coupled parameter learning between the speaker models and the UBM is utilized to improve performance. Finally, maximum, mean and weighted sum fusions of model scores are used to enhance the Speaker Identification Accuracy (SIA). Evaluations conducted on the TIMIT database with and without noise confirm the performance improvement.
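The three score-level fusion rules named in the abstract (maximum, mean and weighted sum) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes two scoring streams (for example, one model trained on MFCC features and one on PNCC features) that each produce one score per enrolled speaker on a comparable scale, such as a log-likelihood. The function names `fuse_scores` and `identify`, the score values and the weight of 0.75 are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def fuse_scores(
    scores_a: Sequence[float],
    scores_b: Sequence[float],
    method: str = "mean",
    weight: float = 0.5,
) -> List[float]:
    """Fuse two per-speaker score lists (hypothetical helper).

    scores_a[i] and scores_b[i] are assumed to be scores for the same
    speaker i, on comparable scales. `weight` applies to scores_a and
    is only used by the weighted-sum rule.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("score lists must have the same length")
    pairs = list(zip(scores_a, scores_b))
    if method == "max":
        return [max(a, b) for a, b in pairs]
    if method == "mean":
        return [(a + b) / 2.0 for a, b in pairs]
    if method == "weighted":
        if not 0.0 <= weight <= 1.0:
            raise ValueError("weight must lie in [0, 1]")
        return [weight * a + (1.0 - weight) * b for a, b in pairs]
    raise ValueError(f"unknown fusion method: {method}")


def identify(fused: Sequence[float]) -> int:
    """Return the index of the speaker with the highest fused score."""
    return max(range(len(fused)), key=lambda i: fused[i])


# Made-up scores for three enrolled speakers (illustration only).
mfcc_scores = [-42.0, -39.5, -41.0]
pncc_scores = [-40.0, -41.5, -38.0]

for rule in ("max", "mean", "weighted"):
    fused = fuse_scores(mfcc_scores, pncc_scores, method=rule, weight=0.75)
    print(rule, fused, "-> speaker index", identify(fused))
```

With these made-up scores, the maximum and mean rules select speaker index 2, while the weighted sum with a weight of 0.75 on the first stream selects index 1, which shows how the choice of rule and weight can change the identification decision. In practice the weight would be tuned on held-out data, and the scores may need normalization before fusion.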

Original language: English
Title of host publication: Proceedings - 2016 4th International Workshop on Biometrics and Forensics, IWBF 2016
Publisher: IEEE
ISBN (Electronic): 9781467394482
Publication status: Published - 11 Apr 2016
Externally published: Yes
Event: 4th International Workshop on Biometrics and Forensics, IWBF 2016 - Limassol, Cyprus
Duration: 3 Mar 2016 to 4 Mar 2016

Conference

Conference: 4th International Workshop on Biometrics and Forensics, IWBF 2016
Country/Territory: Cyprus
City: Limassol
Period: 3/03/16 to 4/03/16

Keywords

  • Gaussian mixture model
  • Maximum a posteriori probability adaptation
  • Robust biometric speaker identification and robust recognition
  • Score fusion
  • Universal background model
