We have compared the performance of multi-layer perceptron (MLP) networks and radial basis function (RBF) networks in the task of speaker identification. The experiments are carried out on 400 utterances (10 digits, in English) from 10 speakers. LPC-derived cepstrum coefficients are used as the speaker-specific features. The results show that the MLP networks are superior in memory usage and classification time. Nevertheless, they suffer from long training times, and their classification performance is poorer than that of the RBF networks. The function centres of the RBF networks are either selected randomly from the training data or located by a K-means algorithm. We find that K-means clustering is an effective method for locating the function centres. We also find that guaranteeing that every speaker has a similar number of function centres improves the recognition performance further.
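The centre-placement idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes plain Lloyd's K-means run separately on each speaker's feature vectors, so that every speaker contributes the same number of centres, and a Gaussian basis function with a single shared width. The function names (`kmeans`, `centres_per_speaker`, `rbf_activations`), the `width` parameter, and the 2-D toy vectors standing in for cepstrum features are all hypothetical.

```python
import math
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's K-means on a list of equal-length float tuples."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)  # initialise centres from the data
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centres[j]))
            clusters[nearest].append(p)
        for j, members in enumerate(clusters):
            if members:  # keep the old centre if a cluster is empty
                centres[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centres


def centres_per_speaker(features_by_speaker, k_per_speaker):
    """Run K-means per speaker so each speaker gets k_per_speaker centres."""
    centres = []
    for speaker in sorted(features_by_speaker):
        centres.extend(kmeans(features_by_speaker[speaker], k_per_speaker))
    return centres


def rbf_activations(x, centres, width=1.0):
    """Gaussian basis-function outputs for one feature vector x (assumed form)."""
    return [math.exp(-math.dist(x, c) ** 2 / (2 * width ** 2)) for c in centres]


# Toy 2-D vectors standing in for LPC-derived cepstrum features (assumption).
toy = {
    "spk_a": [(0.0, 0.1), (0.2, 0.0), (1.0, 1.1), (0.9, 1.0)],
    "spk_b": [(5.0, 5.1), (5.2, 4.9), (6.0, 6.1), (6.1, 5.9)],
}
centres = centres_per_speaker(toy, k_per_speaker=2)
hidden = rbf_activations(toy["spk_a"][0], centres)
print(len(centres), len(hidden))
```

In a complete RBF classifier these hidden-layer activations would feed a linear output layer with one output per speaker; that training step is omitted here. Running K-means per speaker rather than on the pooled data is one simple way to keep the number of centres balanced across speakers.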
Journal: Journal of Microcomputer Applications
Publication status: Published - Apr 1993