A Systematic Literature Review of Machine Learning for Endurance Running Performance Prediction
DOI:
10.33395/sinkron.v10i1.15743
Keywords:
Machine learning, prediction of running time, running performance, sports analytics, systematic literature review
Abstract
This study systematically reviews the application of machine learning methods for predicting running performance, with particular emphasis on short-to-middle-distance events such as the 5 km. Although machine-learning-based performance prediction has been widely explored in endurance sports, comprehensive reviews that synthesize models, predictors, and pipelines across running distances remain scarce. The review followed the PRISMA 2020 framework. Articles published between 2020 and 2025 were retrieved from ScienceDirect, Google Scholar, and PubMed using predefined keyword combinations related to machine learning and running performance. Studies were included if they focused on running (excluding cycling, triathlon, or other sports), applied predictive modeling, and reported model evaluation metrics. A total of 26 studies met the inclusion criteria and were assessed using quality appraisal criteria inspired by TRIPOD and QUADAS-2. The analysis identified four main research themes: (1) application of machine learning models for running performance prediction, (2) physiological and anthropometric predictors, (3) non-physiological and contextual factors, and (4) personalized athlete training and monitoring. Ensemble learning models (Random Forest, XGBoost, LightGBM) consistently outperformed traditional linear regression by capturing non-linear interactions, while deep learning approaches (LSTM, GRU) demonstrated strong capability in modeling temporal training dynamics. A generalized machine learning pipeline for running performance prediction was also synthesized. This review contributes a structured framework that integrates modeling approaches, predictor categories, and evaluation strategies, and highlights research opportunities for explainable and personalized prediction systems, particularly for 5 km running performance.
References
Alvero-Cruz, J. R., Carnero, E. A., García, M. A. G., Cárceles, F. A., Correas-Gómez, L., Rosemann, T., Nikolaidis, P. T., & Knechtle, B. (2020). Predictive performance models in long-distance runners: A narrative review. In International Journal of Environmental Research and Public Health (Vol. 17, Issue 21, pp. 1–22). MDPI. https://doi.org/10.3390/ijerph17218289
Azzahra, F. S. P., & Prianto, C. (2025). Tinjauan Literatur Sistematis: Analisis Implementasi Kecerdasan Buatan untuk Verifikasi Dokumen. Jurnal Ilmiah Sistem Informasi, 4(3), 417–430. https://doi.org/10.51903/kjjwk708
Casado, A., Hanley, B., Santos-Concejero, J., & Ruiz-Pérez, L. M. (2019). World-Class Long-Distance Running Performances Are Best Predicted by Volume of Easy Runs and Deliberate Practice of Short-Interval and Tempo Runs. www.nsca.com
Collins, G. S., Reitsma, J. B., Altman, D. G., & Moons, K. G. M. (2015). Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): The TRIPOD Statement. BMC Medicine, 13(1). https://doi.org/10.1186/s12916-014-0241-z
Coquart, J. B. (2023). Prediction of performance in a 100-km run from a simple equation. PLoS ONE, 18(3). https://doi.org/10.1371/journal.pone.0279662
Dash, S. (2024). Win Your Race Goal: A Generalized Approach to Prediction of Running Performance. Sports Medicine International Open, 08(CP). https://doi.org/10.1055/a-2401-6234
Feely, C., Caulfield, B., Lawlor, A., & Smyth, B. (2023). Modelling the Training Practices of Recreational Marathon Runners to Make Personalised Training Recommendations. UMAP 2023 - Proceedings of the 31st ACM Conference on User Modeling, Adaptation and Personalization, 183–193. https://doi.org/10.1145/3565472.3592952
Figueiredo, D. H., Figueiredo, D. H., Manoel, F. de A., & Machado, F. A. (2021). Peak Running Velocity or Critical Speed Under Field Conditions: Which Best Predicts 5-km Running Performance in Recreational Runners? Frontiers in Physiology, 12. https://doi.org/10.3389/fphys.2021.680790
Fokkema, T., van Damme, A. A. D. N., Fornerod, M. W. J., de Vos, R. J., Bierma-Zeinstra, S. M. A., & van Middelkoop, M. (2020). Training for a (half-)marathon: Training volume and longest endurance run related to performance and running injuries. Scandinavian Journal of Medicine and Science in Sports, 30(9), 1692–1704. https://doi.org/10.1111/sms.13725
Garmin Indonesia. (2024). Garmin Ungkap Tren Kebugaran di Tahun 2024: Lari, Tenis dan Golf Jadi Olahraga Paling Digemari di Indonesia. Https://Www.Garmin.Co.Id/News/Press-Release/News-2024-Dec-Garmin-Connect-Data-Report/.
Indrisari, E., Febiansyah, H., & Adiwinoto, B. (2025). A Systematic Literature Review on the Application of Machine Learning for Predicting Stunting Prevalence in Indonesia (2020–2024). Jurnal Sisfokom (Sistem Informasi Dan Komputer), 14(3), 277–283. https://doi.org/10.32736/sisfokom.v14i3.2366
Johansson, M., Atterfors, J., & Lamm, J. (2023). Pacing Patterns of Half-Marathon Runners: An analysis of ten years of results from Gothenburg Half Marathon. International Journal of Computer Science in Sport, 22(1), 124–138. https://doi.org/10.2478/ijcss-2023-0014
Keogh, A., O’Connor Sheridan, O., McCaffrey, O., Dunne, S., Lally, A., & Doherty, C. (2020). The Determinants of Marathon Performance: An Observational Analysis of Anthropometric, Pre-race and In-race Variables. http://www.intjexersci.com
Knechtle, B., Weiss, K., Valero, D., Villiger, E., Nikolaidis, P. T., Andrade, M. S., Scheer, V., Cuk, I., Gajda, R., & Thuany, M. (2024). Using machine learning to determine the nationalities of the fastest 100-mile ultra-marathoners and identify top racing events. PLoS ONE, 19(8). https://doi.org/10.1371/journal.pone.0303960
Knopp, M., Guppy, F., Joyner, M., Muniz-Pardos, B., Wackerhage, H., Schönfelder, M., Pitsiladis, Y., & Ruiz, D. (2025). Validation of marathon performance model based on physiological factors in world-class East African runners: a case series. Translational Exercise Biomedicine, 2(1), 1–8. https://doi.org/10.1515/teb-2024-0016
Lerebourg, L., Saboul, D., Clémençon, M., & Coquart, J. B. (2022). Prediction of Marathon Performance using Artificial Intelligence. International Journal of Sports Medicine, 44(5), 352–360. https://doi.org/10.1055/a-1993-2371
Mantzios, K., Ioannou, L. G., Panagiotaki, Z. O. E., Ziaka, S., Périard, J. D., Racinais, S., Nybo, L., & Flouris, A. D. (2022). Effects of Weather Parameters on Endurance Running Performance: Discipline-specific Analysis of 1258 Races. Medicine and Science in Sports and Exercise, 54(1), 153–161. https://doi.org/10.1249/MSS.0000000000002769
Muñoz-Pérez, I., Castañeda-Babarro, A., Santisteban, A., & Varela-Sanz, A. (2024). Predictive performance models in marathon based on half-marathon, age group and pacing behavior. Sport Sciences for Health, 20(3), 797–810. https://doi.org/10.1007/s11332-023-01159-4
Nikolaidis, P. T., & Knechtle, B. (2023). Predictors of Half-Marathon Performance in Male Recreational Athletes. EXCLI Journal, 22, 559–566. https://doi.org/10.17179/excli2023-6198
Nikolaidis, P. T., Rosemann, T., & Knechtle, B. (2021). Development and Validation of Prediction Equation of “Athens Authentic Marathon” Men’s Race Speed. Frontiers in Physiology, 12. https://doi.org/10.3389/fphys.2021.682359
Olaya-Cuartero, J., Pueo, B., Villalon-Gasch, L., & Jiménez-Olmedo, J. M. (2023). Prediction of Half-Marathon Power Target using the 9/3-Minute Running Critical Power Test. Journal of Sports Science and Medicine, 22(3), 525–530. https://doi.org/10.52082/jssm.2023.525
Pirscoveanu, C. I., & Oliveira, A. S. (2024). Prediction of instantaneous perceived effort during outdoor running using accelerometry and machine learning. European Journal of Applied Physiology, 124(3), 963–973. https://doi.org/10.1007/s00421-023-05322-0
Quittmann, O. J., Foitschik, T., Vafa, R., Freitag, F. J., Sparmann, N., Nolte, S., & Abel, T. (2021). Is Maximal Lactate Accumulation Rate Promising for Improving 5000-m Prediction in Running? International Journal of Sports Medicine, 44(4), 268–279. https://doi.org/10.1055/a-1958-3876
Rothschild, J. A., Stewart, T., Kilding, A. E., & Plews, D. J. (2024). Predicting daily recovery during long-term endurance training using machine learning analysis. European Journal of Applied Physiology, 124(11), 3279–3290. https://doi.org/10.1007/s00421-024-05530-2
Thuany, M., Weiss, K., Valero, D., Villiger, E., Andrade, M. S., Nikolaidis, P. T., Scheer, V., de Lira, C. A. B., Vancini, R. L., Cuk, I., Braschler, L., Rosemann, T., & Knechtle, B. (2025). An analysis of the 6-h ultra-marathon race using a machine learning approach. Frontiers in Sports and Active Living, 7. https://doi.org/10.3389/fspor.2025.1577470
Tomaszewski, M., Lukanova-Jakubowska, A., Majorczyk, E., & Dzierżanowski, Ł. (2024). From data to decision: Machine learning determination of aerobic and anaerobic thresholds in athletes. PLoS ONE, 19(8). https://doi.org/10.1371/journal.pone.0309427
Vijay, S. A., Sivakumar, C., Kumar, P. V., Muralidharan, C. K., Rajkumar, K. V., Kannan, K. R., Pradeepa, M., Sivasankar, P., Mariam, A. A., & Anand, U. K. A. (2024). Lactate Threshold Training to Improve Long-Distance Running Performance: A Narrative Review. Montenegrin Journal of Sports Science and Medicine, 20(1), 19–29. https://doi.org/10.26773/mjssm.240303
Vos, L., Vergeer, R., Goulding, R., Weide, G., de Koning, J., Jaspers, R., & van der Zwaard, S. (2024). Predicting physical performance after training: insights from machine learning using small samples. https://doi.org/10.21203/rs.3.rs-4707433/v1
Weiss, K., Valero, D., Villiger, E., Scheer, V., Thuany, M., Aidar, F. J., de Souza, R. F., Cuk, I., Nikolaidis, P. T., Rosemann, T., & Knechtle, B. (2024). Associations between environmental factors and running performance: An observational study of the Berlin Marathon. PLoS ONE, 19(10). https://doi.org/10.1371/journal.pone.0312097
Westwood, M., Deeks, J. J., Whiting, P. F., Weswood, M. E., Rutjes, A. W., Reitsma, J. B., Bossuyt, P. N., & Kleijnen, J. (2011). QUADAS-2: A Revised Tool for the Quality Assessment of Diagnostic Accuracy Studies. Annals of Internal Medicine, 155(8). https://doi.org/10.1059/0003-4819-155-8-201110180-00009
Wiecha, S., Kasiak, P. S., Cieśliński, I., Maciejczyk, M., Mamcarz, A., & Śliż, D. (2022). Modeling Physiological Predictors of Running Velocity for Endurance Athletes. Journal of Clinical Medicine, 11(22). https://doi.org/10.3390/jcm11226688
Winandy Soesilo, F., & Suprapti Hendharta, C. (2025). Factors Affecting Runners’ Continuance Intention on Strava Running Apps Usage In Jakarta. In Jurnal Bisnis & Komunikasi (Vol. 12, Issue 3). https://ojs.kalbis.ac.id/index.php/kalbisocio/en/article/download/4574/1161/14635
Zhou, S. (2025). DeepRace: GRU-Based Sequence Modeling Framework for Marathon Performance Prediction. Informatica (Slovenia), 49(28), 153–160. https://doi.org/10.31449/inf.v49i28.8818
Copyright (c) 2026 Efraim William Solang, Linawati, Ida Bagus Gede Manuaba, I Nyoman Setiawan

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.