Development of Machine Learning Model for Breast Cancer Prediction from Ultrasound Images





Data Analysis, Early Detection, Breast Cancer, Machine Learning, Support Vector Machine


Over the past decade, advances in information and computing technology have transformed breast cancer detection and treatment, and Machine Learning offers significant potential for health data analysis. However, data heterogeneity and complexity make it difficult to develop accurate and reliable predictive models. This research develops and validates Machine Learning classification models that combine Support Vector Machine (SVM) with Principal Component Analysis (PCA) to address these issues, with the aim of improving accuracy in the early detection of breast cancer. The study uses a breast cancer dataset from Kaggle and analyzes the data inductively to identify relevant patterns. The combination of SVM and PCA achieved 89% accuracy in medical image classification, which supports its efficacy in breast cancer diagnostics and its use as a more reliable model for early detection. These findings have theoretical and practical implications for Machine Learning and breast cancer research, and extend the understanding of how advanced data processing techniques can be applied. Although the study is limited by the variability of patient characteristics in the dataset, the results provide a basis for further development of diagnostic technology, and the authors recommend integrating Deep Learning and Big Data analysis in future research.
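The abstract names the combination of Principal Component Analysis and Support Vector Machine but does not give implementation details. The following is a minimal sketch of how such a pipeline is commonly assembled with scikit-learn, not the authors' code. The synthetic data, the 64-feature vectors standing in for flattened images, the 20 retained components, the RBF kernel, and the 75/25 split are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for flattened image features (assumption: the real
# study used a Kaggle breast cancer dataset; shapes and labels here are
# invented purely so the example runs end to end).
X, y = make_classification(
    n_samples=300, n_features=64, n_informative=10, random_state=0
)

# Hold out 25% of the samples, keeping the class balance in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Standardize features, project onto 20 principal components (assumed
# value), then classify with an RBF-kernel SVM (assumed kernel).
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=20, random_state=0),
    SVC(kernel="rbf", C=1.0, gamma="scale"),
)
model.fit(X_train, y_train)

# Accuracy on the held-out split; the value depends on the synthetic data
# and says nothing about the 89% reported in the abstract.
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"held-out accuracy: {accuracy:.2f}")
```

Wrapping the scaler and PCA in the same pipeline as the classifier means both are fitted on the training split only, which avoids leaking information from the held-out samples into the projection.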






How to Cite

Hindarto, D., & Hendrata, F. (2024). Development of Machine Learning Model for Breast Cancer Prediction from Ultrasound Images. Sinkron : Jurnal Dan Penelitian Teknik Informatika, 8(2), 1019–1028.
