Analysis of the Determinants of Student Character Using Explainable Machine Learning (SHAP) and School Profile Clustering: A Case Study of the Rapor Pendidikan (Education Report Card) for Bali Province

Authors

  • Md. Wira Putra Dananjaya, Universitas Pendidikan Nasional, Denpasar, Indonesia
  • Ngakan Nyoman Kutha Krisnawijaya, Universitas Pendidikan Nasional, Denpasar, Indonesia
  • Gede Humaswara Prathama, Universitas Pendidikan Nasional, Denpasar, Indonesia
  • I Gusti Ngurah Darma Paramartha, Universitas Pendidikan Nasional, Denpasar, Indonesia
  • Adie Wahyudi Oktavia Gama, Universitas Pendidikan Nasional, Denpasar, Indonesia

DOI:

https://doi.org/10.53863/kst.v7i02.1988

Keywords:

Educational Data Mining, Random Forest, SHAP, Student Character, Rapor Pendidikan (Education Report Card)

Abstract

Strengthening student character is a key performance indicator in the Merdeka Belajar curriculum, yet the identification of the school-environment determinants that most influence character outcomes often still rests on assumption. This study aims to quantitatively deconstruct the relationship between school environment climate and the quality of student character in Bali Province. Using the Rapor Pendidikan Indonesia (Indonesian Education Report Card) dataset released by the Ministry of Primary and Secondary Education (Kementerian Pendidikan Dasar dan Menengah, Kemendikdasmen) for the 2023-2025 period, comprising 727 data entries, the study applies an Educational Data Mining methodology with the Random Forest algorithm, reinforced by the Synthetic Minority Over-sampling Technique (SMOTE) to handle data imbalance. The novelty of the study lies in its use of SHapley Additive exPlanations (SHAP) for model transparency and K-Means Clustering for zoning maps. Experimental results show that the model predicts character outcomes with an accuracy of 77.03%. The SHAP analysis reveals that the Diversity Climate (Iklim Kebinekaan; influence score 0.45) and the Gender Equality Climate (Iklim Kesetaraan Gender; 0.22) are the strongest predictors, far exceeding the influence of the Safety Climate (Iklim Keamanan; 0.13). This finding contradicts the common assumption that physical safety is the single most important factor. Furthermore, the clustering analysis identifies three school typologies in Bali, including one "Vulnerable" (Rawan) cluster with critical scores on gender equality and diversity despite adequate safety scores. The study recommends shifting the focus of education policy in Bali from a physical-safety approach toward strengthening tolerance and gender-equality programs, which are shown to have a statistically more significant impact.
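To make the attribution idea behind SHAP concrete, the following is a minimal sketch that computes exact Shapley values by enumerating feature coalitions. It is not the authors' pipeline: the scoring function, its coefficients, the feature names, and the baseline are all hypothetical stand-ins for a trained Random Forest, and the numbers are unrelated to the influence scores reported in the abstract. The SHAP library estimates the same quantity efficiently for real models; brute-force enumeration is only feasible here because there are three features.

```python
from itertools import combinations
from math import factorial

# Hypothetical climate features; names are illustrative, not dataset columns.
FEATURES = ["diversity", "gender_equality", "safety"]


def model(x):
    """Toy scoring function standing in for a trained classifier (assumption)."""
    # The interaction term makes the attribution non-trivial.
    return (
        0.5 * x["diversity"]
        + 0.2 * x["gender_equality"]
        + 0.1 * x["safety"]
        + 0.2 * x["diversity"] * x["gender_equality"]
    )


def value(coalition, x, baseline):
    """Model output when only features in `coalition` take their values from x."""
    mixed = {f: (x[f] if f in coalition else baseline[f]) for f in FEATURES}
    return model(mixed)


def shapley_values(x, baseline):
    """Exact Shapley values: weighted average of each feature's marginal contribution."""
    n = len(FEATURES)
    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {f}, x, baseline) - value(s, x, baseline))
        phi[f] = total
    return phi


# Hypothetical school profile compared against an all-zero reference profile.
baseline = {"diversity": 0.0, "gender_equality": 0.0, "safety": 0.0}
school = {"diversity": 1.0, "gender_equality": 1.0, "safety": 1.0}

phi = shapley_values(school, baseline)
for name, contribution in sorted(phi.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {contribution:.3f}")
```

The contributions sum to the difference between the model output for the school and for the baseline, which is the additivity property that lets per-feature scores be compared directly. In this toy function the interaction term is split equally between the two features involved.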

Published

2025-12-17

How to Cite

Dananjaya, M. W. P., Krisnawijaya, N. N. K., Prathama, G. H., Paramartha, I. G. N. D., & Gama, A. W. O. (2025). Analisis Determinan Karakter Siswa Menggunakan Explainable Machine Learning (SHAP) dan Klasterisasi Profil Sekolah Studi Kasus Rapor Pendidikan Provinsi Bali. Jurnal Kridatama Sains Dan Teknologi, 7(02), 936–948. https://doi.org/10.53863/kst.v7i02.1988
