Indonesian Document Summarization Based on Non-Negative Matrix Factorization (NMF)
DOI: https://doi.org/10.25126/jtiik.201411104
Abstract
The growth of information technology has driven a massive increase in digital text documents, including documents in Indonesian. Automatically extracting information from these documents in the form of summaries is therefore much needed. In this study, an automatic summarizer using Non-negative Matrix Factorization (NMF) was developed. The system was evaluated by comparing its summaries with summaries written by three experts for 100 Indonesian documents. The evaluation shows that the system summaries have an average precision and recall of 0.19724 and 0.34085, respectively. By comparison, evaluating the experts' summaries against one another gives an average precision and recall of 0.68667 and 0.70642, respectively.
Keywords: text summarization, NMF
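The precision and recall figures reported in the abstract can be illustrated with a minimal sketch. It assumes that a summary is represented as a set of selected sentence indices and that the system summary is scored against each expert summary separately before averaging. The abstract does not state the unit of comparison, so the function names, the sentence-level matching, and the sample data below are all hypothetical.

```python
def precision_recall(system_sentences, reference_sentences):
    """Return (precision, recall) of a system summary against one reference.

    Both arguments are iterables of sentence indices (an assumption made
    for this sketch; the paper may compare summaries differently).
    """
    system = set(system_sentences)
    reference = set(reference_sentences)
    overlap = len(system & reference)
    # Precision: share of system sentences that the reference also chose.
    precision = overlap / len(system) if system else 0.0
    # Recall: share of reference sentences that the system recovered.
    recall = overlap / len(reference) if reference else 0.0
    return precision, recall


def average_against_experts(system_sentences, expert_summaries):
    """Average precision and recall over several expert reference summaries."""
    scores = [precision_recall(system_sentences, ref) for ref in expert_summaries]
    n = len(scores)
    avg_precision = sum(p for p, _ in scores) / n
    avg_recall = sum(r for _, r in scores) / n
    return avg_precision, avg_recall


# Hypothetical data: sentence indices selected for a single document.
system = [0, 2, 5, 7]
experts = [[0, 1, 2], [2, 3], [0, 2, 5]]
print(average_against_experts(system, experts))
```

Averaging these per-document scores over the whole test collection would yield corpus-level figures comparable in form to those quoted above, under the same assumptions.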
License
This article is licensed under a Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.