Comparative Sentiment Analysis of Election News Articles with Smote using Classification Algorithm

Authors

  • Fathir Fathir, Universitas Muhammadiyah Bima, Bima, Indonesia
  • Afsa Rizki, Universitas Muhammadiyah Bima, Bima, Indonesia
  • Yuliyanti Yuliyanti, Universitas Muhammadiyah Bima, Bima, Indonesia
  • Siti Mutmainah, Universitas Muhammadiyah Bima, Bima, Indonesia

DOI:

https://doi.org/10.53863/kst.v6i02.1253

Keywords:

sentiment analysis, SMOTE, KNN, decision tree

Abstract

This study analyzes the sentiment of news articles about the general election, specifically the presidential and vice-presidential race. It compares two classification algorithms, Decision Tree and K-Nearest Neighbors (KNN), and evaluates how effectively SMOTE (Synthetic Minority Over-sampling Technique) addresses class imbalance: in this dataset, articles with positive sentiment outnumber those with negative sentiment. The main objectives are to determine which algorithm is superior for sentiment classification and to assess how much SMOTE improves model performance. The dataset was collected by web scraping and then subjected to text normalization, stop-word removal, and feature extraction. SMOTE was applied to balance the classes, and Decision Tree and KNN were then used as classifiers. Decision Tree consistently outperformed KNN, reaching 85% accuracy, 44% precision, 47% recall, and a 45% F1 score. SMOTE improved the performance of both algorithms, with a more pronounced effect on Decision Tree. The study concludes that Decision Tree combined with SMOTE is a more effective and reliable approach than KNN for sentiment analysis of election articles. These results contribute to sentiment analysis methods that can be applied to understand the dynamics of public opinion in a political context.
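
The oversampling step described in the abstract can be illustrated with a minimal sketch of the SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. The abstract does not state which implementation, neighbour count, or feature representation was used, so the function name `smote`, the parameter `k`, and the toy two-dimensional vectors below are all assumptions made for illustration. In practice a library implementation, such as the `SMOTE` class in imbalanced-learn, would normally be used instead.

```python
import random
from math import dist


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point (the base itself is excluded)
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy feature vectors (for example, two term weights per article).
positive = [[0.9, 0.1], [0.8, 0.2], [0.85, 0.15], [0.7, 0.1], [0.95, 0.05], [0.75, 0.2]]
negative = [[0.1, 0.9], [0.2, 0.8]]

# Oversample the negative (minority) class until both classes have the same size.
new_negative = smote(negative, n_new=len(positive) - len(negative), k=1)
balanced_negative = negative + new_negative
print(len(positive), len(balanced_negative))  # prints "6 6"
```

Because synthetic points are derived from existing samples, oversampling is usually applied to the training split only, so that no interpolated copy of a test article leaks into training. Whether the study followed this practice is not stated in the abstract.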

Published

2024-07-30

How to Cite

Fathir, F., Rizki, A., Yuliyanti, Y., & Mutmainah, S. (2024). Comparative Sentiment Analysis of Election News Articles with Smote using Classification Algorithm. JURNAL KRIDATAMA SAINS DAN TEKNOLOGI, 6(02), 441–452. https://doi.org/10.53863/kst.v6i02.1253
