Analysis of Student Character Determinants Using Explainable Machine Learning (SHAP) and School Profile Clustering: A Case Study of the Rapor Pendidikan of Bali Province
DOI:
https://doi.org/10.53863/kst.v7i02.1988
Keywords:
Educational Data Mining, Random Forest, SHAP, Student Character, Rapor Pendidikan
Abstract
Strengthening student character is a key performance indicator in the Merdeka Belajar curriculum, yet the identification of the school-environment determinants that most influence character outcomes often still rests on assumption. This study aims to quantitatively deconstruct the relationship between school environmental climate and the quality of student character in Bali Province. Using the Rapor Pendidikan Indonesia dataset released by the Kementerian Pendidikan Dasar dan Menengah (Kemendikdasmen, the Ministry of Primary and Secondary Education) for the 2023-2025 period, with a total of 727 data entries, the study applies an Educational Data Mining methodology based on the Random Forest algorithm, reinforced with the Synthetic Minority Over-sampling Technique (SMOTE) to handle data imbalance. The novelty of this study lies in its use of SHapley Additive exPlanations (SHAP) for model transparency and K-Means Clustering for zoning-based mapping. Experimental results show that the model predicts character outcomes with an accuracy of 77.03%. The SHAP analysis reveals that Diversity Climate (Iklim Kebinekaan, influence score 0.45) and Gender Equality Climate (Iklim Kesetaraan Gender, 0.22) are the strongest predictors, far exceeding the influence of Safety Climate (Iklim Keamanan, 0.13). This finding contradicts the common assumption that physical safety is the single most important factor. Furthermore, the clustering analysis identifies three school typologies in Bali, including one "Vulnerable" ("Rawan") cluster that has critical scores on gender equality and diversity despite adequate safety scores. The study recommends shifting the focus of education policy in Bali from a physical-safety approach toward strengthening tolerance and gender-equality programs, which the analysis shows to have a greater statistical influence.
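To make the idea behind a SHAP "influence score" more concrete, the following is a minimal sketch of exact Shapley-value attribution on a toy scoring function. It is not the authors' pipeline: the feature names, the weights in `toy_model`, and the input and baseline values are all hypothetical, chosen only for illustration, and the sketch uses the Python standard library rather than the `shap` package.

```python
from itertools import combinations
from math import factorial

# Hypothetical feature names, loosely mirroring the climate indicators in the abstract.
FEATURES = ["diversity_climate", "gender_equality_climate", "safety_climate"]


def toy_model(x):
    """Hypothetical linear scoring function standing in for a trained model."""
    return (0.5 * x["diversity_climate"]
            + 0.3 * x["gender_equality_climate"]
            + 0.2 * x["safety_climate"])


def coalition_value(model, x, baseline, coalition):
    """Model output when features in `coalition` come from x and the rest from baseline."""
    mixed = {f: (x[f] if f in coalition else baseline[f]) for f in FEATURES}
    return model(mixed)


def shapley_values(model, x, baseline):
    """Exact Shapley values by enumerating every coalition (feasible only for few features)."""
    n = len(FEATURES)
    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                with_f = coalition_value(model, x, baseline, set(subset) | {f})
                without_f = coalition_value(model, x, baseline, set(subset))
                total += weight * (with_f - without_f)
        phi[f] = total
    return phi


# Hypothetical school scores and a hypothetical reference (baseline) school.
school = {"diversity_climate": 80, "gender_equality_climate": 70, "safety_climate": 90}
baseline = {"diversity_climate": 60, "gender_equality_climate": 60, "safety_climate": 60}

phi = shapley_values(toy_model, school, baseline)
for name, value in phi.items():
    print(name, round(value, 3))
```

The attributions sum to the difference between the model output for the school and for the baseline, which is the additivity property that SHAP relies on. A global importance ranking of the kind reported in the abstract is commonly obtained by averaging the absolute attributions over many observations, though the abstract does not state exactly how its scores were aggregated.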
License
Copyright (c) 2025 Md. Wira Putra Dananjaya, Ngakan Nyoman Kutha Krisnawijaya, Gede Humaswara Prathama, I Gusti Ngurah Darma Paramartha, Adie Wahyudi Oktavia Gama

This article is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution-ShareAlike 4.0 International License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.