Speaker Recognition System Using the Wavelet-MFCC Method and a Hidden Markov Models (HMM) Classifier

Authors

Syahroni Hidayat, Andi Sofyan Anas, Siti Agrippina Alodia Yusuf, Muhammad Tajuddin

Abstract

Research in digital signal processing focused on speaker recognition began decades ago and has produced many speaker recognition methods. Among the feature extraction algorithms developed so far, two can provide high accuracy when applied in a recognition system: Mel Frequency Cepstral Coefficient (MFCC) and Wavelet. This study aims to examine and choose the best channel from the wavelet-MFCC process for use as a new feature coefficient in a speaker recognition system (SRS), called the Wavelet-MFCC feature coefficient. The coefficient is built by converting the wavelet decomposition channels, namely the approximation channel (cA), the detail channel (cD), and their combination (cAcD), into MFCC coefficients. Dyadic wavelet decomposition is applied at level 1 and level 2. Each feature coefficient serves as input to the Hidden Markov Models (HMM) classifier, and the accuracy of the HMM output is then calculated and analyzed. The results show that the detail channel (cD) achieves the same accuracy as the combined channel (cAcD), and higher accuracy than the approximation channel (cA), with an accuracy of 95%. This indicates that the detail channel at level 1 decomposition contains the voice features of each speaker and is therefore sufficient as a feature coefficient. Using level 1 decomposition and the detail channel cD as the Wavelet-MFCC feature in the SRS can thus lighten and speed up the computing process.
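The channel construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the mother wavelet, so the Haar wavelet is assumed here, and the function names `haar_dwt` and `wavelet_channels` are hypothetical. Forming cAcD by concatenating cA and cD is also an assumption. The subsequent MFCC extraction and HMM classification steps are not shown.

```python
import math

def haar_dwt(signal):
    """One level of dyadic decomposition with the Haar wavelet (assumed).

    Returns the approximation (cA) and detail (cD) coefficients.
    """
    signal = list(signal)
    if len(signal) % 2:              # pad odd-length input by repeating the last sample
        signal.append(signal[-1])
    s = 1 / math.sqrt(2)
    cA = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    cD = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return cA, cD

def wavelet_channels(signal, level=1):
    """Return the cA, cD and cAcD channels at the requested level."""
    cA, cD = haar_dwt(signal)
    for _ in range(level - 1):       # dyadic scheme: only cA is decomposed again
        cA, cD = haar_dwt(cA)
    # cAcD as a plain concatenation is an assumption of this sketch
    return {"cA": cA, "cD": cD, "cAcD": cA + cD}

# Toy frame standing in for a speech segment
frame = [4.0, 2.0, 5.0, 5.0, 1.0, 3.0, 0.0, 2.0]
level1 = wavelet_channels(frame, level=1)
level2 = wavelet_channels(frame, level=2)
```

In a pipeline of the kind the abstract outlines, each of the three channels would then be passed separately to MFCC extraction. Because level 1 halves the number of samples per channel, using cD alone processes about half the data of cAcD, which is consistent with the reduced computation the abstract reports.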







DOI: http://dx.doi.org/10.25126/jtiik.0813284