Student Grouping Based on Grades and Attendance Using K-Means

Authors

  • Theresya Simanjuntak, Universitas Katolik Santo Thomas Medan
  • Jelita Astrid Gulo, Universitas Katolik Santo Thomas Medan
  • Sardo Pardingotan Sipayung, Universitas Katolik Santo Thomas Medan

DOI:

https://doi.org/10.55123/jomlai.v5i1.7283

Keywords:

K-Means, Clustering, Data Mining, Students, Visualization

Abstract

Grouping students by academic performance supports decision-making in more targeted academic guidance programs. This study applied the K-Means clustering algorithm to group students by academic score and attendance rate. The dataset comprised 50 student records, each with a score and an attendance percentage in the range 0-100. The optimal number of clusters was determined with the Elbow Method and the Silhouette Score for K values from 2 to 6. The experiments showed that K=3 gave the best separation, with the highest Silhouette Score (0.72) and a within-cluster sum of squares (WCSS) of 8,230. The three clusters corresponded to high-achieving students (30%), average-performing students (40%), and students requiring special attention (30%). The algorithm converged in an average of 8-12 iterations, with 90% consistency across multiple runs. Correlation analysis showed a very strong relationship between scores and attendance (r=0.89). An interactive visualization system was built with React.js and Recharts to facilitate interpretation of the results. The study contributes a practical clustering framework for early identification of at-risk students and for recommending academic intervention programs.
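
The abstract does not reproduce the authors' implementation or dataset, so the following is only a minimal sketch of the kind of workflow it describes: K-Means (Lloyd's algorithm) on two features, WCSS for the Elbow Method, the mean Silhouette Score for K from 2 to 6, and a Pearson correlation between scores and attendance. The data are synthetic (50 points drawn around three hypothetical centers), the function names are illustrative, and the random initialization with 10 restarts is an assumption; the numbers printed will not match those reported in the paper.

```python
import math
import random

def kmeans(points, k, rng, max_iter=100):
    """Lloyd's algorithm; returns (centroids, labels, assignment passes)."""
    centroids = rng.sample(points, k)  # assumption: random points as initial centroids
    labels = None
    for iteration in range(1, max_iter + 1):
        new_labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:  # assignments unchanged: converged
            return centroids, labels, iteration
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return centroids, labels, max_iter

def wcss(points, centroids, labels):
    """Within-cluster sum of squared distances (the Elbow Method criterion)."""
    return sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))

def silhouette(points, labels, k):
    """Mean silhouette coefficient; singleton clusters score 0 by convention."""
    scores = []
    for i, p in enumerate(points):
        mean_dist = {}
        for c in range(k):
            others = [q for j, q in enumerate(points) if labels[j] == c and j != i]
            if others:
                mean_dist[c] = sum(math.dist(p, q) for q in others) / len(others)
        own = labels[i]
        if own not in mean_dist or len(mean_dist) < 2:
            scores.append(0.0)
            continue
        a = mean_dist[own]                                       # cohesion
        b = min(d for c, d in mean_dist.items() if c != own)     # separation
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

def pearson(xs, ys):
    """Pearson correlation coefficient r."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / math.sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))

# Synthetic stand-in data, NOT the study's dataset: (score, attendance) pairs.
rng = random.Random(42)
groups = [((85, 90), 15), ((65, 75), 20), ((40, 50), 15)]  # hypothetical centers and sizes
points = [(min(100.0, max(0.0, rng.gauss(cx, 5))), min(100.0, max(0.0, rng.gauss(cy, 5))))
          for (cx, cy), n in groups for _ in range(n)]

results = {}
for k in range(2, 7):
    # Keep the restart with the lowest WCSS to reduce sensitivity to initialization.
    centroids, labels, iterations = min(
        (kmeans(points, k, rng) for _ in range(10)),
        key=lambda run: wcss(points, run[0], run[1]))
    results[k] = (wcss(points, centroids, labels), silhouette(points, labels, k))
    print(f"K={k}  WCSS={results[k][0]:.1f}  silhouette={results[k][1]:.3f}  passes={iterations}")

r = pearson([p[0] for p in points], [p[1] for p in points])
print(f"Pearson r between score and attendance: {r:.2f}")
```

In this sketch, the K to choose would be the one where the WCSS curve bends (the "elbow") and the silhouette value peaks; whether the two criteria agree depends on the data.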

Published

2026-03-15

How to Cite

Theresya Simanjuntak, Jelita Astrid Gulo, & Sardo Pardingotan Sipayung. (2026). Student Grouping Based on Grades and Attendance Using K-Means. JOMLAI: Journal of Machine Learning and Artificial Intelligence, 5(1), 1–8. https://doi.org/10.55123/jomlai.v5i1.7283

Issue

Section

Articles