SVM-Based Pediatric Disease Classification Model from the Balinese Lontar Usada Rare Manuscript

Authors

  • I Gusti Made Ngurah Ari Bhawanaputra, Fakultas Teknologi dan Informatika, Program Studi Teknik Informatika, Institut Bisnis dan Teknologi Indonesia, Bali, Indonesia
  • I Gede Iwan Sudipa, Fakultas Teknologi dan Informatika, Program Studi Teknik Informatika, Institut Bisnis dan Teknologi Indonesia, Bali, Indonesia
  • Ni Putu Suci Meinarni, Fakultas Teknologi dan Informatika, Program Studi Teknik Informatika, Institut Bisnis dan Teknologi Indonesia, Bali, Indonesia
  • I Gusti Ayu Agung Mas Aristamy, Fakultas Teknologi dan Informatika, Program Studi Teknik Informatika, Institut Bisnis dan Teknologi Indonesia, Bali, Indonesia
  • Indra Pratistha, Fakultas Teknologi dan Informatika, Program Studi Teknik Informatika, Institut Bisnis dan Teknologi Indonesia, Bali, Indonesia

DOI:

10.33395/sinkron.v10i1.15508

Keywords:

Classification, Lontar Usada Rare, Balinese Traditional Medicine, Support Vector Machine, TF-IDF

Abstract

Lontar Usada Rare is a traditional Balinese manuscript containing pediatric medical knowledge based on local wisdom, yet its narrative format limits accessibility and utilization in modern contexts, while its physical fragility threatens long-term preservation. This study aims to develop a pediatric disease classification model using a Support Vector Machine (SVM) combined with Term Frequency–Inverse Document Frequency (TF-IDF) weighting to support the digitalization of Balinese traditional medicine. A total of 422 data samples were collected through expert interviews and manuscript analysis, covering symptoms, disease types, herbal ingredients, and treatment procedures. The research stages included text preprocessing (cleansing, tokenizing, stopword removal, stemming), manual labeling into 35 disease classes, and model evaluation using five train–test split ratios (80:20 to 60:40) with variations of the complexity parameter C (0.5, 1, 10, 100, 1000). The best performance was achieved using C=10 with an 80:20 ratio, resulting in 87.06% accuracy, 91.55% precision, 87.06% recall, and an F1-score of 87.96%. Confusion matrix analysis showed strong classification performance for most classes, although minority classes with overlapping symptoms exhibited misclassification. Overall, the TF-IDF and linear SVM combination effectively classifies pediatric disease symptoms from Lontar Usada Rare and contributes to the preservation and digital transformation of Balinese traditional medical knowledge for potential modern healthcare applications.
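
The evaluation protocol described in the abstract (TF-IDF features, a linear SVM, several train-test ratios, and a sweep over C) might look roughly like the following sketch. It assumes scikit-learn, which the abstract does not name, and it runs on a tiny invented English corpus rather than the study's 422 samples and 35 classes. The helper names `build_toy_corpus` and `evaluate` are hypothetical, preprocessing (cleansing, stopword removal, stemming) is omitted, and only the two endpoint ratios stated in the abstract (80:20 and 60:40) are used. Weighted averaging of precision, recall, and F1 is also an assumption; it is at least consistent with the reported recall matching the reported accuracy (both 87.06%), since weighted recall always equals accuracy.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Hypothetical keyword groups standing in for symptom descriptions.
# These are NOT taken from the manuscript or the study's dataset.
KEYWORDS = {
    "fever_like": ["fever", "chills", "sweating", "hot"],
    "cough_like": ["cough", "phlegm", "wheeze", "hoarse"],
    "skin_like": ["rash", "itch", "red", "bumps"],
}

C_VALUES = (0.5, 1, 10, 100, 1000)  # values listed in the abstract
TEST_SIZES = (0.20, 0.40)           # endpoint ratios 80:20 and 60:40


def build_toy_corpus():
    """Build 6 short texts per class from pairs of class keywords."""
    texts, labels = [], []
    for label, words in KEYWORDS.items():
        for i, first in enumerate(words):
            for second in words[i + 1:]:
                texts.append(f"child has {first} and {second}")
                labels.append(label)
    return texts, labels


def evaluate(texts, labels, test_size, c_value, seed=42):
    """Train TF-IDF + linear SVM on one split and return its metrics."""
    x_train, x_test, y_train, y_test = train_test_split(
        texts, labels, test_size=test_size, stratify=labels, random_state=seed
    )
    model = make_pipeline(TfidfVectorizer(), SVC(kernel="linear", C=c_value))
    model.fit(x_train, y_train)
    predicted = model.predict(x_test)
    # Weighted averaging is an assumption, not confirmed by the source.
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_test, predicted, average="weighted", zero_division=0
    )
    return {
        "test_size": test_size,
        "C": c_value,
        "accuracy": accuracy_score(y_test, predicted),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    corpus_texts, corpus_labels = build_toy_corpus()
    for size in TEST_SIZES:
        for c in C_VALUES:
            print(evaluate(corpus_texts, corpus_labels, size, c))
```

On real data, the same loop would be run over the preprocessed symptom texts and their expert-assigned labels, and the configuration with the best scores on the held-out split would be reported.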

 




How to Cite

Bhawanaputra, I. G. M. N. A., Sudipa, I. G. I., Meinarni, N. P. S., Aristamy, I. G. A. A. M., & Pratistha, I. (2026). SVM-Based Pediatric Disease Classification Model from the Balinese Lontar Usada Rare Manuscript. Sinkron : Jurnal Dan Penelitian Teknik Informatika, 10(1), 481-494. https://doi.org/10.33395/sinkron.v10i1.15508
