Predicting Employee Attrition Using Logistic Regression With Feature Selection

Authors

  • Fitri Herinda Wardhani, Telkom University, Indonesia
  • Kemas Muslim Lhaksmana, Telkom University, Indonesia

DOI:

10.33395/sinkron.v7i4.11783

Keywords:

employee, attrition, logistic regression, classification

Abstract

Employee attrition is a gradual reduction in a company's workforce. It can damage a company's organization, including its projects and employee structure. This study aims to predict employee attrition in a company using logistic regression. Machine learning suits this task because its predictions are not biased by human interference. In addition, a company's human resources department needs to know which factors most influence employee attrition. We therefore propose feature selection methods to identify those influential factors and to simplify model training. Our approach predicts employee attrition with three feature selection methods: information gain, select k-best, and recursive feature elimination (RFE). We evaluate the models with 10-fold cross-validation. Logistic regression without feature selection achieves an accuracy of 0.865 and an AUC score of 0.932. Among the three feature selection methods, RFE gives the highest evaluation results, outperforming information gain and select k-best, with an accuracy of 0.853 and an AUC score of 0.925.
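The comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation, and it rests on several assumptions: it uses scikit-learn, a synthetic imbalanced dataset from `make_classification` in place of the real HR data, a hypothetical feature count `K = 10`, mutual information as a stand-in for information gain, and the ANOVA F-score (`f_classif`) as the scoring function for select k-best. The abstract specifies none of these details.

```python
from functools import partial

from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE, SelectKBest, f_classif, mutual_info_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption: the study used a real HR dataset).
X, y = make_classification(
    n_samples=500, n_features=20, n_informative=6,
    weights=[0.8, 0.2], random_state=0,
)

K = 10  # hypothetical number of features to keep


def make_model():
    """Return a fresh logistic regression classifier."""
    return LogisticRegression(max_iter=1000)


# Mutual information approximates information gain (assumption).
info_gain = partial(mutual_info_classif, random_state=0)

# Selection sits inside each pipeline so it is refit on every training fold,
# which avoids leaking information from the held-out fold.
pipelines = {
    "no selection": make_pipeline(StandardScaler(), make_model()),
    "information gain": make_pipeline(
        StandardScaler(), SelectKBest(info_gain, k=K), make_model()
    ),
    "select k-best": make_pipeline(
        StandardScaler(), SelectKBest(f_classif, k=K), make_model()
    ),
    "RFE": make_pipeline(
        StandardScaler(), RFE(make_model(), n_features_to_select=K), make_model()
    ),
}

# 10-fold cross-validation, reporting mean accuracy and mean AUC per method.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
results = {}
for name, pipe in pipelines.items():
    scores = cross_validate(pipe, X, y, cv=cv, scoring=("accuracy", "roc_auc"))
    results[name] = (scores["test_accuracy"].mean(), scores["test_roc_auc"].mean())
    print(f"{name:>16}: accuracy={results[name][0]:.3f}  AUC={results[name][1]:.3f}")
```

Scores from this sketch will not match the figures reported in the abstract, because the data and settings differ. It only shows how the four configurations can be evaluated under the same cross-validation splits.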




How to Cite

Wardhani, F. H., & Lhaksmana, K. M. (2022). Predicting Employee Attrition Using Logistic Regression With Feature Selection. Sinkron : Jurnal Dan Penelitian Teknik Informatika, 6(4), 2214–2222. https://doi.org/10.33395/sinkron.v7i4.11783