Hybrid Deep Learning and USE Algorithm for Essay Scoring: Accuracy and Performance Analysis

Authors

  • Agus Sriyanto, Universitas Amikom Yogyakarta
  • Kusrini, Universitas Amikom Yogyakarta, Indonesia

DOI:

10.33395/sinkron.v9i2.14784

Keywords:

Automated Essay Assessment; Cosine Similarity; Universal Sentence Encoder; Online Learning.

Abstract

The main challenge in the automatic assessment of essay answers in online learning systems lies in the complexity of natural language understanding and the semantic evaluation required to match the precision of human judgment. This study develops and analyzes the performance of a hybrid model that combines deep learning with a semantic similarity-based approach to automated essay grading. The method comprises collecting essay answers from various disciplines, processing the text to extract semantic representations, and calculating the degree of similarity between each participant's answer and the answer key using a cosine similarity measure. The model was evaluated by comparing its automatic assessments with manual assessments by teachers. The model achieved its highest accuracy, 90.22%, at a threshold of 0.8, with a precision of 84.63%, a recall of 100%, and an F1 score of 91.68%. To strengthen the reliability of the findings, statistical validation was carried out using error evaluation metrics: the RMSE value is 0.32 and the RMAE value is 0.19. These findings indicate that the model can mimic human judgment reliably and consistently, and can distinguish the linguistic variations that arise in different types of essay questions. The system offers an effective solution for automating assessment in an online learning environment while maintaining the integrity and objectivity of the evaluation.
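
The scoring and evaluation steps summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the short vectors stand in for real sentence embeddings (the Universal Sentence Encoder is not loaded here), and treating a similarity at or above the threshold as "correct" is an assumption. The last line only checks that the reported precision and recall are consistent with the reported F1 score.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen for this sketch when a vector is all zeros
    return dot / (norm_u * norm_v)


def is_correct(similarity, threshold=0.8):
    """Assumed decision rule: accept an answer at or above the threshold."""
    return similarity >= threshold


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for boolean labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    tn = sum(1 for t, p in pairs if not t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1_score(precision, recall),
    }


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length score lists."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


# Toy stand-ins for the answer-key embedding and one student-answer embedding.
key_vec = [1.0, 0.0, 1.0]
answer_vec = [1.0, 0.1, 0.9]
print(is_correct(cosine_similarity(key_vec, answer_vec)))

# Consistency check on the figures reported in the abstract.
print(round(f1_score(0.8463, 1.0), 4))  # 0.9168
```

With a recall of 100%, every answer the teachers accepted was also accepted by the model, so all of its errors at this threshold are false positives, which is why precision is the lower of the two figures.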

How to Cite

Sriyanto, A., & Kusrini, K. (2025). Hybrid Deep Learning and USE Algorithm for Essay Scoring: Accuracy and Performance Analysis. Sinkron: Jurnal Dan Penelitian Teknik Informatika, 9(2), 914–924. https://doi.org/10.33395/sinkron.v9i2.14784