Plagiarism Detection in Students' Theses Using The Cosine Similarity Method

Authors

  • Oppi Anda Resta STIKI Malang
  • Addin Aditya STIKI Malang
  • Febry Eka Purwiantono STIKI Malang

DOI:

10.33395/sinkron.v5i2.10909

Keywords:

Plagiarism, Text Mining, Cosine Similarity, TF-IDF, Student Theses

Abstract

Writing a final scientific paper is the main graduation requirement for students. One factor that determines the quality of such a paper is its originality and innovation. This research applies data mining methods to detect similarities in the titles, abstracts, or topics of students' final scientific papers so that plagiarism can be prevented. Cosine similarity is combined with text preprocessing and TF-IDF weighting to calculate the level of similarity between the title and abstract of a student's final scientific paper and those in the existing final project repository. The result is then compared against a threshold value to decide whether the work is accepted or rejected. Applying the TF-IDF method to the training and test data gave a similarity of 8% between the training document and the test document, which indicates that the student thesis is still classified as unique and does not contain plagiarized content. The findings of this study can help the university administer student theses so that plagiarism does not occur. Further study is needed on additional methods to improve the accuracy and performance of the system so that it runs faster and more optimally.
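The pipeline described in the abstract (preprocessing, TF-IDF weighting, cosine similarity, threshold decision) can be illustrated with a minimal sketch. It is not the authors' implementation. The stop-word list, the smoothed IDF formula, the 0.5 threshold, and the names `preprocess`, `tfidf_vectors`, `cosine_similarity`, and `check_submission` are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Assumed values for illustration only; the paper's stop-word list and
# threshold are not stated in this abstract.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "for", "and", "to", "using", "with"}
THRESHOLD = 0.5


def preprocess(text):
    """Lowercase the text, split it into alphanumeric tokens, drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document."""
    tokenized = [preprocess(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1
        vectors.append({
            # Relative term frequency times a smoothed IDF (one common variant).
            term: (count / total) * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def check_submission(submission, repository, threshold=THRESHOLD):
    """Score a submission against every repository entry and apply the threshold."""
    vectors = tfidf_vectors(repository + [submission])
    candidate = vectors[-1]
    scores = [cosine_similarity(candidate, v) for v in vectors[:-1]]
    best = max(scores, default=0.0)
    return best, ("rejected" if best >= threshold else "accepted")
```

Calling `check_submission(new_title_and_abstract, repository_texts)` returns the highest similarity score found and the resulting decision. Under this sketch's assumed threshold, a score of 0.08, like the 8% reported in the abstract, would be accepted.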

Author Biographies

Oppi Anda Resta, STIKI Malang

Department of Informatics Engineering

Febry Eka Purwiantono, STIKI Malang

Department of Information Management

How to Cite

Oppi Anda Resta, Aditya, A., & Febry Eka Purwiantono. (2021). Plagiarism Detection in Students’ Theses Using The Cosine Similarity Method. Sinkron : Jurnal Dan Penelitian Teknik Informatika, 5(2), 305-313. https://doi.org/10.33395/sinkron.v5i2.10909