Efficient CNN-Based Classification of SARS-CoV-2 Spike Gene Sequences Using Alignment-Free Encoding
DOI: 10.33395/sinkron.v10i1.15691
Keywords: Convolutional Neural Network, Deep Learning, Genome, Variant Classification, SARS-CoV-2
Abstract
The COVID-19 pandemic caused by SARS-CoV-2 continues to challenge global health systems through the emergence of variants whose genetic characteristics affect transmissibility and vaccine effectiveness. Conventional identification methods such as Whole-Genome Sequencing (WGS) are highly accurate but constrained by significant cost and time. Most current classification studies still rely on complex hybrid architectures such as CNN-LSTM, or on image-based representations, both of which increase computational load. This study aims to develop an efficient, lightweight, pure Convolutional Neural Network (CNN) model based on alignment-free encoding to classify five SARS-CoV-2 Variants of Concern (VOCs) (Alpha, Beta, Delta, Gamma, and Omicron), focusing exclusively on the Spike gene sequence. The dataset consists of 5,000 Spike gene sequences represented with integer encoding and standardized to a common length with zero-padding. The proposed lightweight CNN architecture consists of four 1D convolution layers with approximately 1.6 million parameters in total. Test results show an overall accuracy of 98.93%. Precision, recall, and F1-score averaged 0.99, and ROC curve analysis showed AUC values above 0.99 for all variants. The approach is efficient and effective, offering a fast, scalable, and resource-efficient solution to support real-time genomic surveillance in future pandemic mitigation.
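The alignment-free preprocessing named in the abstract (integer encoding followed by zero-padding) can be illustrated with a minimal sketch. The abstract does not state which integer each nucleotide receives, how ambiguous bases are handled, or what target length is used, so the mapping `BASE_TO_INT`, the `UNKNOWN` code, and the function names below are assumptions chosen for illustration only, not the authors' implementation.

```python
# Minimal sketch of integer encoding plus zero-padding for nucleotide sequences.
# ASSUMPTION: the base-to-integer mapping is hypothetical; the source does not give it.
BASE_TO_INT = {"A": 1, "C": 2, "G": 3, "T": 4}
UNKNOWN = 5  # ASSUMPTION: any symbol outside A/C/G/T (e.g. N) shares one code
PAD = 0      # zero is reserved for padding so it never collides with a base


def encode_sequence(seq):
    """Map each nucleotide character to an integer, case-insensitively."""
    return [BASE_TO_INT.get(base, UNKNOWN) for base in seq.upper()]


def pad_sequences(encoded, max_len=None):
    """Right-pad with zeros (or truncate) so every sequence has length max_len."""
    if not encoded:
        return []
    if max_len is None:
        # Default to the longest sequence in the batch.
        max_len = max(len(e) for e in encoded)
    return [e[:max_len] + [PAD] * (max_len - len(e)) for e in encoded]


def encode_batch(seqs, max_len=None):
    """Encode raw sequences and standardize them to a common length."""
    return pad_sequences([encode_sequence(s) for s in seqs], max_len)


# Two toy fragments of different lengths; real Spike gene sequences are far longer.
batch = encode_batch(["ATGTTTGTT", "ATGNCG"])
```

Because no multiple sequence alignment is needed, each sequence is processed independently and in time linear in its length; the resulting fixed-length integer vectors are the kind of input a 1D convolution layer can consume.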
License
Copyright (c) 2026 Rengga Anggarah, Ernawati, Widhia KZ Oktoeberza

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

