New Feature Vectors using GFCC for Speaker Identification

A. Nagesh

Abstract


The feature vectors of a speaker identification (SID) system play a crucial role in its overall performance. Many new feature extraction methods are based on Mel Frequency Cepstral Coefficients (MFCC), but the ultimate goal is to maximize the performance of the SID system. The objective of this paper is to derive a new set of feature vectors based on Gammatone Frequency Cepstral Coefficients (GFCC), using a Gaussian mixture model (GMM), for speaker identification. MFCC are the default feature vectors for speaker recognition, but they are not very robust in the presence of additive noise. In recent studies, GFCC features have shown very good robustness against noise and acoustic change. The main idea is that GFCC-based feature extraction with a GMM should improve overall speaker identification performance in low signal-to-noise ratio (SNR) conditions.
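As a rough illustration of the pipeline the abstract describes, the sketch below shows the two stages that distinguish a GFCC front end from an MFCC one, followed by GMM scoring. It is not the paper's implementation. It assumes ERB-spaced centre frequencies from the Glasberg and Moore ERB-rate formula, cube-root compression before the DCT (some GFCC variants use a logarithm instead), gammatone channel energies that have already been computed for each frame, and diagonal-covariance GMMs that have already been trained. All function names are hypothetical.

```python
import math

def hz_to_erb_rate(f_hz):
    """Glasberg-Moore ERB-rate scale (assumed here; other warpings exist)."""
    return 21.4 * math.log10(1.0 + 0.00437 * f_hz)

def erb_rate_to_hz(erb):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (erb / 21.4) - 1.0) / 0.00437

def erb_center_freqs(f_low, f_high, n_channels):
    """Gammatone centre frequencies equally spaced on the ERB-rate scale."""
    lo, hi = hz_to_erb_rate(f_low), hz_to_erb_rate(f_high)
    step = (hi - lo) / (n_channels - 1)
    return [erb_rate_to_hz(lo + i * step) for i in range(n_channels)]

def gfcc_from_energies(energies, n_coeffs):
    """Cube-root compression of one frame's channel energies, then a DCT-II."""
    n = len(energies)
    compressed = [e ** (1.0 / 3.0) for e in energies]
    scale = math.sqrt(2.0 / n)
    return [
        scale * sum(c * math.cos(math.pi * k * (2 * j + 1) / (2.0 * n))
                    for j, c in enumerate(compressed))
        for k in range(n_coeffs)
    ]

def diag_gmm_loglik(x, weights, means, variances):
    """Log-likelihood of one feature vector under a diagonal-covariance GMM."""
    log_terms = []
    for w, mu, var in zip(weights, means, variances):
        ll = math.log(w)
        for xi, m, v in zip(x, mu, var):
            ll -= 0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        log_terms.append(ll)
    peak = max(log_terms)  # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in log_terms))

def identify(frames, speaker_models):
    """Return the speaker whose GMM gives the highest total log-likelihood."""
    scores = {
        name: sum(diag_gmm_loglik(x, *gmm) for x in frames)
        for name, gmm in speaker_models.items()
    }
    return max(scores, key=scores.get)

# Toy usage: two hypothetical one-component, one-dimensional speaker models,
# each given as (weights, means, variances).
models = {
    "A": ([1.0], [[0.0]], [[1.0]]),
    "B": ([1.0], [[5.0]], [[1.0]]),
}
best = identify([[4.8], [5.1], [5.3]], models)
```

In a real system the gammatone filtering, framing, and GMM training (typically by expectation-maximization) would precede these steps; they are left out to keep the sketch short.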

DOI: https://doi.org/10.23956/ijermt.v6i8.146