Determining a Wavelet Filterbank Using the Mean Best Basis Algorithm for Feature Extraction from Noisy Speech Signals

Authors

Abdurahim Abdurahim, Syahroni Hidayat

Abstract

Wavelet-based filterbanks have recently been widely developed as feature extractors intended to replace Mel Frequency Cepstral Coefficient (MFCC) features in automatic speech recognition systems. One such wavelet feature filterbank is the Wavelet-Packet Cepstral Coefficient (WPCC). So far, however, its development has focused only on clean speech signals. This study therefore aims to design a WPCC filterbank for noisy speech signals. The Mean Best Basis (MBB) algorithm and the Daubechies wavelet functions db44 and db45 are used to obtain the WPCC filterbank design. The speech data are noisy recordings of the Indonesian vowels a, i, u, e, é, o, and ó. The results show that two WPCC filterbank designs were formed, one from the db44 function and one from the db45 function. Noise had no effect on the formation of either WPCC filterbank. Both filterbank designs meet the standard form of the MFCC filter, particularly in their frequency range and frequency scale. The frequency range is 125 Hz to 1000 Hz, with a linear scale for frequencies below 1000 Hz. It can therefore be concluded that both WPCC filterbank designs can be considered for use as feature extractors for noisy speech signals.
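To make the basis-selection step in the abstract more concrete, the following is a minimal sketch of one common reading of the Mean Best Basis idea: build a full wavelet-packet tree for every signal, average an entropy cost per node over all signals, then apply a bottom-up best-basis pruning to the averaged costs. It is an illustration under stated assumptions, not the authors' implementation. In particular, Haar filters stand in for db44 and db45 so that the example needs only the standard library, and the names `haar_split`, `packet_tree`, `entropy_cost`, and `mean_best_basis` are hypothetical.

```python
import math
import random

SQRT2 = math.sqrt(2.0)


def haar_split(x):
    """One wavelet-packet step with Haar filters (a stand-in for db44/db45)."""
    low = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x) - 1, 2)]
    high = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x) - 1, 2)]
    return low, high


def packet_tree(x, depth):
    """Full wavelet-packet tree as {path: coefficients}; '' is the root,
    'a' appends a low-pass branch and 'd' a high-pass branch."""
    tree = {"": list(x)}
    frontier = [""]
    for _ in range(depth):
        next_frontier = []
        for path in frontier:
            low, high = haar_split(tree[path])
            tree[path + "a"] = low
            tree[path + "d"] = high
            next_frontier += [path + "a", path + "d"]
        frontier = next_frontier
    return tree


def entropy_cost(coeffs, total_energy):
    """Additive entropy cost: -sum(p * log(p)) with p = c^2 / signal energy."""
    cost = 0.0
    for c in coeffs:
        p = c * c / total_energy
        if p > 0.0:
            cost -= p * math.log(p)
    return cost


def mean_best_basis(signals, depth):
    """Average node costs over all signals, then prune the tree bottom-up.

    Assumes every signal is non-zero and has the same length, divisible
    by 2**depth. Returns the leaf paths of the selected basis.
    """
    trees = [packet_tree(s, depth) for s in signals]
    energies = [sum(v * v for v in s) for s in signals]
    mean_cost = {
        path: sum(entropy_cost(t[path], e) for t, e in zip(trees, energies))
        / len(trees)
        for path in trees[0]
    }
    best = {}  # path -> (cost of best subtree, leaf paths of that subtree)
    for path in sorted(mean_cost, key=len, reverse=True):
        if len(path) == depth:
            best[path] = (mean_cost[path], [path])
            continue
        cost_a, leaves_a = best[path + "a"]
        cost_d, leaves_d = best[path + "d"]
        if cost_a + cost_d < mean_cost[path]:
            best[path] = (cost_a + cost_d, leaves_a + leaves_d)  # keep split
        else:
            best[path] = (mean_cost[path], [path])  # merge children
    return best[""][1]


# Synthetic noisy tones, used only to exercise the functions above.
random.seed(0)
n = 64
demo_signals = [
    [math.sin(2 * math.pi * 5 * i / n) + 0.3 * random.gauss(0.0, 1.0)
     for i in range(n)]
    for _ in range(4)
]
print(mean_best_basis(demo_signals, 3))
```

Each returned path identifies one subband of the selected decomposition, so the list as a whole describes a non-uniform filterbank shared by all input signals. With a wavelet library, the Haar step could be swapped for longer Daubechies filters without changing the selection logic.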


