Text Classification of Indonesian-Translated Bukhari Hadith Using a Recurrent Convolutional Neural Network (CRNN)

Authors

Muhammad Yuslan Abu Bakar, Adiwijaya Adiwijaya

Abstract

Hadith is the second source of law and guidance for Muslims after the Qur'an, and a great many hadith have been narrated by hadith scholars. This research builds a system that classifies Indonesian translations of the hadith of Bukhari. The topic was chosen to meet Muslims' need to know which recommendations and prohibitions a given hadith contains. Text classification has its own challenge: the number of features is very large (very high dimensionality), which increases computation time and makes optimal results hard to obtain. This research uses a hybrid deep learning method that combines a Convolutional Neural Network and a Recurrent Neural Network, namely the Convolutional Recurrent Neural Network (CRNN). The Convolutional Neural Network was chosen as the data selection and reduction method because it can capture spatial information that is interrelated and correlated. The Recurrent Neural Network serves as the classification method because of its main strength: relying on its 'memory', it can capture very long contextual information, particularly in sequential data such as text. The study reports classification results for several deep learning methods, of which the CRNN gives the best accuracy, 80.79%.
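The division of labour described in the abstract, convolution and pooling to reduce the input and a recurrent layer to classify it, can be illustrated with a minimal forward-pass sketch in plain Python. This is not the authors' implementation: the layer sizes, the random untrained weights, the three label names, and every function name below are assumptions made for illustration only.

```python
import math
import random

def conv1d_relu(seq, kernels, bias):
    """Valid 1D convolution over time followed by ReLU.
    seq: T vectors of size d; kernels: F filters, each k rows of size d."""
    k = len(kernels[0])
    out = []
    for t in range(len(seq) - k + 1):
        window = seq[t:t + k]
        feats = []
        for f, kern in enumerate(kernels):
            s = bias[f] + sum(w * x for row, vec in zip(kern, window)
                              for w, x in zip(row, vec))
            feats.append(max(0.0, s))
        out.append(feats)
    return out

def max_pool(seq, size):
    """Non-overlapping max pooling over time; this shortens the sequence."""
    return [[max(col) for col in zip(*seq[s:s + size])]
            for s in range(0, len(seq) - size + 1, size)]

def rnn_last_state(seq, w_in, w_rec, bias):
    """Simple (Elman) RNN with tanh; the hidden state acts as the 'memory'."""
    h = [0.0] * len(bias)
    for x in seq:
        h = [math.tanh(bias[j]
                       + sum(w * v for w, v in zip(w_in[j], x))
                       + sum(w * v for w, v in zip(w_rec[j], h)))
             for j in range(len(h))]
    return h

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]

def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def init_params(rng, vocab=50, emb=8, filters=6, kernel=3, pool=2,
                hidden=5, classes=3):
    """Random, untrained weights; all sizes are hypothetical."""
    return {
        "embedding": rand_matrix(rng, vocab, emb),
        "kernels": [rand_matrix(rng, kernel, emb) for _ in range(filters)],
        "conv_bias": [0.0] * filters,
        "pool": pool,
        "w_in": rand_matrix(rng, hidden, filters),
        "w_rec": rand_matrix(rng, hidden, hidden),
        "rnn_bias": [0.0] * hidden,
        "w_out": rand_matrix(rng, classes, hidden),
        "out_bias": [0.0] * classes,
    }

def crnn_forward(token_ids, params):
    """CNN stage reduces the sequence, RNN stage summarises it, softmax scores it."""
    embedded = [params["embedding"][i] for i in token_ids]
    conv = conv1d_relu(embedded, params["kernels"], params["conv_bias"])
    pooled = max_pool(conv, params["pool"])
    h = rnn_last_state(pooled, params["w_in"], params["w_rec"], params["rnn_bias"])
    logits = [b + sum(w * v for w, v in zip(row, h))
              for row, b in zip(params["w_out"], params["out_bias"])]
    return softmax(logits), len(embedded), len(pooled)

# Hypothetical label set and token ids, for illustration only.
LABELS = ["recommendation", "prohibition", "information"]
rng = random.Random(0)
params = init_params(rng)
token_ids = [rng.randrange(50) for _ in range(12)]
probs, n_tokens, n_steps = crnn_forward(token_ids, params)
print(f"{n_tokens} tokens reduced to {n_steps} recurrent steps")
print(dict(zip(LABELS, probs)))
```

With a kernel width of 3 and a pooling size of 2, the 12 input tokens shrink to 5 time steps before they reach the recurrent layer, which is the reduction role the abstract assigns to the convolutional stage. Because the weights are random, the printed class scores carry no meaning; a real system would learn them from labelled hadith.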



DOI: http://dx.doi.org/10.25126/jtiik.2021853750