Workload-Aware Performance Evaluation of Sequential and Parallel DAG-Based Machine Learning Orchestration on Single-Node Systems

Authors

  • Krisna Shaadiq Nugroho Information Engineering, Faculty of Computer Science, Universitas Dian Nuswantoro, Semarang, Indonesia
  • Rama Aria Megantara Information Engineering, Faculty of Computer Science, Universitas Dian Nuswantoro, Semarang, Indonesia
  • Farrikh Alzami Faculty of Computer Science, Universitas Dian Nuswantoro
  • Firman Wahyudi Information Engineering, Faculty of Computer Science, Universitas Dian Nuswantoro, Semarang, Indonesia
  • Ricardus Anggi Pramunendar Information Engineering, Faculty of Computer Science, Universitas Dian Nuswantoro, Semarang, Indonesia
  • Novita Kurnia Ningrum Information Engineering, Faculty of Computer Science, Universitas Dian Nuswantoro, Semarang, Indonesia
  • Chaerul Umam Information Engineering, Faculty of Computer Science, Universitas Dian Nuswantoro, Semarang, Indonesia
  • L. Budi Handoko Information Engineering, Faculty of Computer Science, Universitas Dian Nuswantoro, Semarang, Indonesia

DOI:

10.33395/sinkron.v10i1.15788

Keywords:

DAG scheduling, Machine learning pipelines, Orchestration overhead, Parallel execution, Single-node environment, Workflow orchestration

Abstract

The increasing adoption of machine learning in production systems has intensified the need for structured, automated, and reproducible pipelines, commonly modeled as directed acyclic graphs (DAGs). Most existing workflow scheduling studies focus on distributed or multi-node environments; this work instead addresses the lack of controlled, workload-aware analysis of orchestration strategies on single-node systems. An identical machine learning pipeline is executed under two orchestration modes: sequential execution using a local orchestrator and parallel execution using a workflow orchestration engine. Two scenarios are evaluated by explicitly controlling task execution duration to represent light and heavy computational workloads. Under light workloads, parallel execution increases the average makespan, yielding a speedup of only 0.71, which indicates performance degradation because orchestration overhead dominates. Under heavy workloads, parallel execution reduces the average makespan from 1013 seconds to 532 seconds, a speedup of 1.90. System-level monitoring shows higher central processing unit (CPU) utilization during parallel execution, while the evaluation metrics, root mean square error (RMSE) and coefficient of determination (R²), remain stable across all experimental runs. This study contributes empirical evidence of a workload threshold beyond which orchestration overhead becomes negligible and pipeline parallelism becomes beneficial. The findings demonstrate that the performance benefits of parallelism are strictly workload-dependent, highlighting the importance of selecting an orchestration strategy based on computational workload characteristics.
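The reported speedups can be checked with a few lines of arithmetic. The sketch below assumes the conventional definition of speedup as sequential makespan divided by parallel makespan; the abstract does not state the formula, but the heavy-workload figures (1013 s and 532 s) are consistent with it. The function name `speedup` and the variable names are illustrative, not taken from the paper. Because the light-workload makespans are not given in the abstract, only the implied slowdown ratio is derived for that scenario.

```python
def speedup(t_sequential: float, t_parallel: float) -> float:
    """Conventional speedup S = T_seq / T_par (assumed definition).

    S > 1 means the parallel run finished faster; S < 1 means it was slower.
    """
    if t_sequential <= 0 or t_parallel <= 0:
        raise ValueError("makespans must be positive")
    return t_sequential / t_parallel


# Heavy workload: average makespans as reported in the abstract (seconds).
heavy_seq, heavy_par = 1013.0, 532.0
heavy_speedup = speedup(heavy_seq, heavy_par)

# Fraction of the sequential makespan saved by running in parallel.
heavy_reduction = (heavy_seq - heavy_par) / heavy_seq

# Light workload: only the speedup (0.71) is reported, so derive how much
# longer the parallel run took relative to the sequential one.
light_speedup = 0.71
light_slowdown = 1.0 / light_speedup

print(f"heavy speedup:   {heavy_speedup:.2f}")
print(f"heavy reduction: {heavy_reduction:.1%}")
print(f"light slowdown:  {light_slowdown:.2f}x")
```

Under that definition, a speedup of 0.71 means the parallel run took roughly 1.4 times as long as the sequential run, while the heavy-workload result corresponds to cutting the makespan by a little under half.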

How to Cite

Nugroho, K. S., Megantara, R. A., Alzami, F., Wahyudi, F., Pramunendar, R. A., Ningrum, N. K., Umam, C., & Handoko, L. (2026). Workload-Aware Performance Evaluation of Sequential and Parallel DAG-Based Machine Learning Orchestration on Single-Node Systems. Sinkron : Jurnal Dan Penelitian Teknik Informatika, 10(1), 725-740. https://doi.org/10.33395/sinkron.v10i1.15788
