Class Balancing Methods Comparison for Software Requirements Classification on Support Vector Machines


  • Fachrul Pralienka Bani Muhamad, Politeknik Negeri Indramayu, Indonesia
  • Esti Mulyani, Politeknik Negeri Indramayu, Indonesia
  • Munengsih Sari Bunga, Politeknik Negeri Indramayu, Indonesia
  • Achmad Farhan Mushafa, Politeknik Negeri Indramayu, Indonesia




Class balancing, classification, random over sampling, software requirements, support vector machine


Errors in analyzing functional and non-functional software requirements can increase development cost, time, and effort. To reduce these errors, previous research has classified software requirements, especially non-functional requirements, on the PROMISE dataset using Bag of Words (BoW) feature extraction and the Support Vector Machine (SVM) classification algorithm. However, an unbalanced distribution of class labels tends to lower evaluation results. Because most software requirements are functional requirements, classifier models tend to label test data as functional. Previous research has applied class balancing to unbalanced datasets and achieved better classification evaluation results. Building on that work, this study combines class balancing methods with the SVM algorithm. K-fold cross-validation is used to make the training and test splits more consistent when developing the SVM model; the tested values of K are 5, 10, and 15. Results are measured by accuracy, F1-score, precision, and recall on the Public Requirements (PURE) dataset. The results show that SVM with class balancing classifies software requirements more accurately than SVM without class balancing. Random Over Sampling is the class balancing method with the highest evaluation scores for classifying software requirements with SVM. Overall, the average accuracy, F1-score, precision, and recall of SVM improved by 22.07%, 19.67%, 17.73%, and 19.67%, respectively.
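As a rough illustration of the balancing step described in the abstract, the sketch below implements random over sampling and a plain k-fold split using only the Python standard library. It is not the authors' implementation: the function names `random_over_sample` and `k_fold_indices`, the toy sentences, and the labels `F`/`NF` are all hypothetical, and the SVM and Bag of Words steps are omitted. The abstract does not say whether balancing was applied before or inside the folds; this sketch assumes it is applied to the training portion of each fold only.

```python
import random
from collections import Counter


def random_over_sample(texts, labels, seed=0):
    """Duplicate randomly chosen samples of the smaller classes until
    every class has as many samples as the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for text, label in zip(texts, labels):
        by_class.setdefault(label, []).append(text)
    target = max(len(items) for items in by_class.values())
    out_texts, out_labels = [], []
    for label, items in by_class.items():
        # Draw with replacement to fill the gap up to the majority size.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k shuffled, near-equal folds."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


# Toy, made-up requirement sentences (not taken from the PURE dataset).
# "F" = functional, "NF" = non-functional; functional is the majority class.
texts = [
    "The system shall let users log in.",
    "The system shall export reports as PDF.",
    "The system shall send a confirmation email.",
    "The system shall allow password reset.",
    "The system shall list all open orders.",
    "The system shall store customer addresses.",
    "The system shall support search by name.",
    "The system shall print invoices.",
    "The system shall respond within two seconds.",
    "The system shall be available 99 percent of the time.",
]
labels = ["F"] * 8 + ["NF"] * 2

# K = 5 here because the toy set has only 10 samples.
for fold, (train_idx, test_idx) in enumerate(k_fold_indices(len(texts), k=5)):
    train_texts = [texts[i] for i in train_idx]
    train_labels = [labels[i] for i in train_idx]
    # Balance the training portion only; the test portion stays untouched.
    bal_texts, bal_labels = random_over_sample(train_texts, train_labels)
    print(fold, dict(Counter(train_labels)), "->", dict(Counter(bal_labels)))
```

Over sampling only the training portion keeps duplicated sentences out of the test fold, which would otherwise inflate the scores. A real experiment would more likely use a stratified split so that every fold contains both classes.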






How to Cite

Muhamad, F. P. B., Mulyani, E., Bunga, M. S., & Mushafa, A. F. (2023). Class Balancing Methods Comparison for Software Requirements Classification on Support Vector Machines. Sinkron: Jurnal Dan Penelitian Teknik Informatika, 7(2), 1196–1208.