skip to main content
10.1145/2641483.2641530acmotherconferencesArticle/Chapter ViewAbstractPublication PagesuccsConference Proceedingsconference-collections
research-article

On the Dynamic Scheduling of Task Farm Patterns on a Heterogeneous CPU-GPGPU Environment

Published: 03 August 2014 Publication History

Abstract

Heterogeneous clusters and multi-core environments are gradually surpassing the homogeneous systems due to their high performance and flexibility. Task scheduling in these systems is an extensively studied subject. However, in a heterogeneous architecture consisting of multi-core CPUs and many-core GPGPUs (General Purpose Graphics Processor Units), task mapping becomes much more complex due to differences in architectures and programming models among the processors. Consequently, designing a scheduler which facilities a balanced distribution of loads by taking full advantage of the processing power of a CPU-GPGPU system is nontrivial. In this paper we discuss a multi-round scheduling algorithm and a scheduling framework for farm-pattern based applications on such a system. This is an important step towards designing a full-scale pattern-based scheduler to automatically and efficiently map the parallel tasks to the heterogeneous processors. As a proof of concept, we have designed a scheduling framework for the task-farm pattern based applications. The framework provides the necessary "separation of concerns" and hides the underlying complex scheduling details from the programmer. The experimental results demonstrate that our dynamic scheduling algorithm achieves better to similar performances as compared to some of the well-known scheduling algorithms for CPU-GPGPU systems.

References

[1]
Aiken, A. and Nicolau, A. Optimal Loop Parallelization. In Proceedings of the ACM SIGPLAN conference on Programming Language design and Implementation. New York, USA, 308--317, 1988.
[2]
Augonnet, C., Thibault, S., Namyst, R. and Wacrenier, P. StarPU: a unified platform for task scheduling on heterogeneous multicore architectures. Concurrency and Computation: Practice and Experience, vol. 23, pp. 187--198, 2011.
[3]
Bharadwaj, V., Ghose, D. and Robertazzi, T. G. Divisible load theory: A new paradigm for load scheduling in distributed systems. Cluster Computing, vol. 6, pp. 7--17, 2003.
[4]
Braun, T.D., Siegel, H.J., Beck, N., Bölöni, L. L., Maheswaran, M., Reuther, A, I., Robertson, J. P., Theys, M. D., Yao, B. and Hensgen, D. A comparison of eleven static heuristics for mapping a class of independent tasks onto heterogeneous distributed computing systems. Journal of Parallel and Distributed Computing, vol. 61, pp. 810--837, 2001.
[5]
Danelutto, M. Adaptive task farm implementation strategies. In proceeding of the 12th Euromicro Conference on Parallel, Distributed and Network-based Processing. A coruna. Spain, pp. 416--423, 2004.
[6]
Dean, J. and Ghemawat, S. Mapreduce: Simplified data processing on large clusters. In Proceeding of the 6th Symposium on Operating Systems Design & Implementation, San Francisco, USA, pp. 137--150, 2004.
[7]
González-Vélez, H. and Cole, M. Adaptive structured parallelism for distributed heterogeneous architectures: A methodological approach with pipelines and farms. Concurrency and Computation: Practice and Experience, vol. 22, pp. 2073--2094, 2010.
[8]
Grewe, D. and O'Boyle, M. F. A static task partitioning approach for heterogeneous systems using OpenCL. In Proceedings of the 20th international conference on Compiler construction, Saarbrucken, Germany, pp. 286--305, 2011.
[9]
Ibarra, O. H. and Kim, C. E. Heuristic algorithms for scheduling independent tasks on nonidentical processors. Journal of the ACM (JACM), vol. 24, pp. 280--289, 1977.
[10]
Jiménez, V. J., Vilanova, L., Gelado, I., Gil, M., Fursin, G. and Navarro, N. Predictive runtime code scheduling for heterogeneous architectures. In proceeding of the 4th International Conference on High Performance Embedded Architectures and Compilers, Crete, Greece, pp. 19--33, 2009.
[11]
Khronos. OpenCL -- The Open Standard for Parallel Programming of Heterogeneous Systems. Retrieved July 6, 2014 from "http://www.khronos.org/opencl".
[12]
Li, T., Narayana, V. K. and El-Ghazawi, T. A static task scheduling framework for independent tasks accelerated using a shared graphics processing unit. In proceeding of the 17th IEEE International Conference on Parallel and Distributed Systems (ICPADS), Tainan, Taiwan, pp. 88--95, 2011.
[13]
Luk, C., Hong, S. and Kim, H. Qilin: Exploiting parallelism on heterogeneous multiprocessors with adaptive mapping. In Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture, New York, USA, pp. 45--55, 2009.
[14]
Maheswaran, M., Ali, S., Siegel, H. J., Hensgen, D. and Freund, R. F. Dynamic mapping of a class of independent tasks onto heterogeneous computing systems. Journal of Parallel and Distributed Computing, vol. 59, pp. 107--131, 1999.
[15]
Mattson, T.G., Sanders, B.A. and Massingill, B.L. Patterns for Parallel Programming. Addison-Wesley Longman, Boston, USA, 2004.
[16]
Nvidia. CUDA. Retrieved July 6, 2014 from "http://www.nvidia.com/cuda".
[17]
Plauger, P., Lee, M., Musser, D.R. and Stepanov, A.A. The C++ Standard Template Library. Prentice Hall, 2000.
[18]
Ravi, V.T., Ma, W., Chiu, D. and Agrawal, G. Compiler and runtime support for enabling generalized reduction computations on heterogeneous parallel configurations, in Proceedings of the 24th ACM International Conference on Supercomputing, Tsukuba, Japan, pp. 137--146, 2010.
[19]
Reinder, J. Intel Thread Building Blocks. Retrieved July 6, 2014 from "http://threadingbuildingblocks.org".
[20]
Scogland, T.R., Rountree, B., Feng, W. and de Supinski, B.R. Heterogeneous task scheduling for accelerated openmp. In proceeding of the 26th IEEE International Parallel & Distributed Processing Symposium (IPDPS), Alaska, USA, pp. 144--155, 2012.
[21]
Yang, Y., Van Der Raadt, K. and Casanova, H. Multiround algorithms for scheduling divisible loads. IEEE Transactions On Parallel and Distributed Systems, vol. 16, pp. 1092--1102, 2005.
[22]
Yu, D. and Robertazzi, T.G. Divisible load scheduling for grid computing. In Proceeding of the 15th International Conference on Parallel and Distributed Computing and Systems (IASTED), Marina del Ray, USA, pp. 1--6, 2003.

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Other conferences
C3S2E '14: Proceedings of the 2014 International C* Conference on Computer Science & Software Engineering
August 2014
201 pages
ISBN:9781450327121
DOI:10.1145/2641483
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 03 August 2014

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. Scheduling on a heterogeneous environment
  2. divisible load
  3. farm pattern
  4. load balancing
  5. many-core GPGPU
  6. multicore CPU
  7. scheduling framework

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

C3S2E '14

Acceptance Rates

Overall Acceptance Rate 12 of 42 submissions, 29%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • 0
    Total Citations
  • 100
    Total Downloads
  • Downloads (Last 12 months)1
  • Downloads (Last 6 weeks)0
Reflects downloads up to 05 Mar 2025

Other Metrics

Citations

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media