Implementation of the Recurrent Neural Network Method for Text Summarization with an Abstractive Technique

Authors

Kasyfi Ivanedra, Metty Mustikasari

Abstract (translated from the Indonesian)

Text summarization is an application of Artificial Intelligence (AI) in which a computer condenses the text of a sentence or article into a simpler form, so that readers can draw conclusions from a long article without reading it in full. Automatic summarization with an abstractive technique can summarize text more naturally, the way a human would, than an extractive technique, which only arranges sentences based on word frequency. Building an abstractive summarization system requires a Recurrent Neural Network (RNN), which computes its weights recurrently. RNNs are part of Deep Learning and can yield better accuracy than a simple artificial neural network, because the computed weights approximate the similarity between words more closely. The type of RNN used is LSTM (Long Short-Term Memory), which addresses the plain RNN's inability to retain memory selectively, combined with an Attention mechanism so that each word can focus more on its context. This study evaluates the system with Precision, Recall, and F-Measure by comparing the summaries produced by the system with summaries written by humans. The dataset consists of 4515 news articles. Testing was split into data with stemming and data without stemming. For the non-stemmed news articles, the average recall is 41%, precision 81%, and F-measure 54.27%. For the stemmed news articles, the average recall is 44%, precision 88%, and F-measure 58.20%.
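
As an illustration of the attention idea mentioned above, the following is a minimal sketch, not the authors' model. It assumes dot-product scoring between one decoder state and a list of encoder states; the abstract does not specify the scoring function, and the names `softmax` and `attend` are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(decoder_state, encoder_states):
    """Return (weights, context) for one decoding step.

    Assumption: dot-product scoring, chosen here only for brevity.
    """
    # Score each encoder state by its similarity to the decoder state.
    scores = [
        sum(d * h for d, h in zip(decoder_state, state))
        for state in encoder_states
    ]
    weights = softmax(scores)
    # The context vector is the attention-weighted sum of encoder states.
    dim = len(decoder_state)
    context = [
        sum(w * state[i] for w, state in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


weights, context = attend([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

In this toy call the first encoder state matches the decoder state more closely, so it receives the larger weight and dominates the context vector.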

Abstract

Text summarization is an application of Artificial Intelligence (AI) in which a computer summarizes the text of an article so that humans can draw conclusions from long articles without having to read them in full. Abstractive techniques can summarize text more naturally, as humans do. Summaries produced by abstractive techniques stay closer to the context than those from extractive techniques, which only arrange sentences based on word frequency. Building a text summarization system with an abstractive technique requires Deep Learning using a Recurrent Neural Network (RNN) rather than a simple Artificial Neural Network (ANN); the RNN computes its weights recurrently, which improves accuracy. The type of RNN used is LSTM (Long Short-Term Memory), which covers the plain RNN's inability to retain memory selectively, with an added Attention mechanism so that each word can focus more on its context. This study measures Precision, Recall, and F-Measure by comparing the summaries produced by the system with summaries written by humans. The dataset consists of 4515 news articles. Testing was divided into data with stemming and data without stemming. For non-stemmed news articles, the average recall is 41%, precision is 81%, and F-measure is 54.27%. For stemmed news articles, the average recall is 44%, precision is 88%, and F-measure is 58.20%.
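
The evaluation described above can be sketched as follows. This is a minimal illustration that assumes clipped unigram overlap between a system summary and a human summary; the abstract does not state how overlap was counted, and the function names and example sentences are hypothetical.

```python
from collections import Counter


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def overlap_scores(system_summary, reference_summary):
    """Return (precision, recall, f_measure) from unigram overlap.

    Assumption: whitespace tokenization and lowercase matching.
    """
    sys_counts = Counter(system_summary.lower().split())
    ref_counts = Counter(reference_summary.lower().split())
    # Clipped overlap: a word counts only as often as it occurs in both.
    overlap = sum((sys_counts & ref_counts).values())
    sys_total = sum(sys_counts.values())
    ref_total = sum(ref_counts.values())
    precision = overlap / sys_total if sys_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    return precision, recall, f_measure(precision, recall)


p, r, f = overlap_scores(
    "police arrest suspect",
    "police arrest the suspect in jakarta today",
)
```

In this toy case every system word appears in the reference, so precision is high, while the short system summary covers only 3 of the 7 reference words, so recall is lower. The reported results show the same pattern (high precision, lower recall). Applying `f_measure` to the reported averages gives roughly 54.4% and 58.7%, close to but not identical to the reported 54.27% and 58.20%; one possible explanation is that the F-measure was averaged per article, though the abstract does not say.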


Keywords


Text Summarization; RNN; LSTM; Abstractive; Deep Learning






DOI: http://dx.doi.org/10.25126/jtiik.2019641067