A Systematic Literature Review of Machine Learning for Endurance Running Performance Prediction

Authors

  • Efraim William Solang, Universitas Udayana, Indonesia
  • Linawati, Universitas Udayana, Indonesia
  • Ida Bagus Gede Manuaba, Universitas Udayana, Indonesia
  • I Nyoman Setiawan, Universitas Udayana, Indonesia

DOI:

10.33395/sinkron.v10i1.15743

Keywords:

Machine learning, prediction of running time, running performance, sports analytics, systematic literature review

Abstract

This study systematically reviews machine learning methods for predicting running performance, with particular emphasis on short-to-middle-distance events such as the 5 km. Although machine-learning-based performance prediction has been widely explored in endurance sports, few reviews comprehensively synthesize models, predictors, and pipelines across running distances. The review followed the PRISMA 2020 framework. Articles published between 2020 and 2025 were retrieved from ScienceDirect, Google Scholar, and PubMed using predefined keyword combinations related to machine learning and running performance. Studies were included if they focused on running (studies of cycling, triathlon, or other sports were excluded), applied predictive modeling, and reported model evaluation metrics. A total of 26 studies met the inclusion criteria and were assessed using quality appraisal criteria inspired by TRIPOD and QUADAS-2. The analysis identified four main research themes: (1) application of machine learning models for running performance prediction, (2) physiological and anthropometric predictors, (3) non-physiological and contextual factors, and (4) personalized athlete training and monitoring. Ensemble learning models (Random Forest, XGBoost, LightGBM) consistently outperformed traditional linear regression by capturing non-linear interactions, while deep learning approaches (LSTM, GRU) were effective at modeling temporal training dynamics. The review also synthesizes a generalized machine learning pipeline for running performance prediction. It contributes a structured framework that integrates modeling approaches, predictor categories, and evaluation strategies, and it highlights research opportunities for explainable and personalized prediction systems, particularly for 5 km running performance.
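
The inclusion criteria above require that each study report model evaluation metrics. As a minimal sketch of what such metrics can look like for finish-time regression, the snippet below computes MAE, RMSE, and R². The choice of these three metrics is an assumption (the abstract does not list which metrics the reviewed studies used), and the 5 km times and the function name `regression_metrics` are hypothetical, for illustration only.

```python
import math


def regression_metrics(actual, predicted):
    """Return MAE, RMSE and R^2 for paired actual/predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    # Mean absolute error: average size of the miss, in the input units.
    mae = sum(abs(e) for e in errors) / n
    # Root mean squared error: penalizes large misses more heavily.
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # R^2: share of variance in the actual values explained by the model.
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    ss_res = sum(e * e for e in errors)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {"mae": mae, "rmse": rmse, "r2": r2}


# Hypothetical 5 km finish times in seconds (not data from the review).
actual_s = [1500.0, 1620.0, 1380.0, 1750.0]
predicted_s = [1520.0, 1600.0, 1400.0, 1730.0]
metrics = regression_metrics(actual_s, predicted_s)
print(metrics)
```

Reporting MAE and RMSE in seconds keeps the error directly interpretable as a finish-time difference, while R² allows a rough comparison across datasets with different time ranges.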

How to Cite

Solang, E. W., Linawati, L., Manuaba, I. B. G., & Setiawan, I. N. (2026). A Systematic Literature Review of Machine Learning for Endurance Running Performance Prediction. Sinkron: Jurnal Dan Penelitian Teknik Informatika, 10(1), 512–524. https://doi.org/10.33395/sinkron.v10i1.15743
