Traffic Event Detection in Twitter Text Using a Deep Learning-Based Multi-Label Classification Approach

Authors

  • Luthfi Atikah, Institut Teknologi Sepuluh Nopember, Surabaya
  • Diana Purwitasari, Institut Teknologi Sepuluh Nopember, Surabaya
  • Nanik Suciati, Institut Teknologi Sepuluh Nopember, Surabaya

DOI:

https://doi.org/10.25126/jtiik.2022915206

Abstract (translated from Indonesian)

Congestion is one of the events that frequently occurs in large cities. It harms road users, so traffic events need to be detected. Twitter is currently used as an information source for detecting events. However, Twitter users tend to share several pieces of information at once, so a single tweet can carry more than one label. This study performs multi-label classification on 18,000 records from verified Twitter accounts in Surabaya. The multi-label classification identifies multiple traffic situations, such as weather conditions, traffic accidents, traffic jams, heavy traffic, and smooth traffic. Classification uses deep learning approaches (CNN and LSTM) and word embeddings (word2vec and fastText), with and without data augmentation. Experiments follow three scenarios to observe the effect of different test data on the same training data. Further experiments examine the effect of the number of labels on multi-label classification using the same test data. The highest accuracy is 0.75 without data augmentation and 0.95 with data augmentation. Across all trials, the highest accuracy is obtained with the combination of LSTM and fastText.

 

Abstract

Congestion is one of the events that often occurs in big cities. It can be detrimental to road users, so traffic events need to be detected accurately and efficiently. Currently, Twitter is used as a source of information for detecting events. However, Twitter users tend to share several pieces of information at once, so a single tweet can carry more than one label. Therefore, multi-label classification is necessary. This study uses 18,000 records from verified Twitter accounts in Surabaya. Multi-label classification is carried out to identify multiple traffic situations, such as weather conditions, traffic accidents, traffic jams, heavy traffic, and smooth traffic. Classification is performed using deep learning approaches (CNN and LSTM) and word embeddings (word2vec and fastText), with augmented and non-augmented data. Experiments are carried out in three scenarios to see the effect of different test data on the same training data. Further experiments examine the effect of the number of labels on multi-label classification using the same test data. The highest accuracy is 0.75 on non-augmented data and 0.95 on augmented data. Across all experiments, the highest accuracy is obtained with the combination of LSTM and fastText.
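To make the multi-label setting concrete, the following is a minimal sketch, not the authors' implementation. It assumes English identifiers for the five categories named in the abstract, hypothetical model outputs (logits), and a 0.5 decision threshold. It shows how one tweet can carry several labels through a multi-hot vector, how independent per-label sigmoid scores are thresholded, and two common ways to score the result. The abstract does not say which accuracy definition was used, so exact-match (subset) accuracy and Hamming loss are shown only as illustrations.

```python
import math

# Identifiers paraphrase the five categories in the abstract (assumed names).
LABELS = ["weather", "accident", "traffic_jam", "heavy_traffic", "smooth_traffic"]


def multi_hot(tags):
    """Encode a set of label names as a 0/1 vector aligned with LABELS."""
    unknown = set(tags) - set(LABELS)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if name in tags else 0 for name in LABELS]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict(logits, threshold=0.5):
    """Threshold each label's sigmoid score independently, so several labels can fire."""
    return [1 if sigmoid(z) >= threshold else 0 for z in logits]


def subset_accuracy(y_true, y_pred):
    """Fraction of samples whose entire label vector is predicted exactly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def hamming_loss(y_true, y_pred):
    """Fraction of individual label slots that are predicted incorrectly."""
    wrong = sum(a != b for t, p in zip(y_true, y_pred) for a, b in zip(t, p))
    return wrong / (len(y_true) * len(LABELS))


# Two example tweets: one with two labels, one with a single label.
y_true = [multi_hot({"weather", "traffic_jam"}), multi_hot({"smooth_traffic"})]

# Hypothetical raw model outputs (logits), one row per tweet.
logits = [[2.1, -3.0, 1.4, -0.2, -2.5], [-1.0, -2.0, -1.5, 0.7, 1.9]]
y_pred = [predict(row) for row in logits]

print(y_pred)
print(subset_accuracy(y_true, y_pred), hamming_loss(y_true, y_pred))
```

In this toy case the second tweet receives one spurious label, so subset accuracy counts the whole tweet as wrong while Hamming loss penalizes only the single incorrect slot. This is why the choice of metric matters when comparing multi-label results.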


Author Biography

  • Luthfi Atikah, Institut Teknologi Sepuluh Nopember, Surabaya

    Department of Informatics, Faculty of Intelligent Electrical and Informatics Technology, Institut Teknologi Sepuluh Nopember, Surabaya

    Associate Professor (Lektor Kepala)

Published

7 February 2022

Issue

Section

Computer Science

How to Cite

Deteksi Kejadian Lalu Lintas pada Teks Twitter dengan Pendekatan Klasifikasi Multi-Label Berbasis Deep Learning [Traffic Event Detection in Twitter Text Using a Deep Learning-Based Multi-Label Classification Approach]. (2022). Jurnal Teknologi Informasi Dan Ilmu Komputer, 9(1), 87–96. https://doi.org/10.25126/jtiik.2022915206