research-article

On the Dynamic Scheduling of Task Farm Patterns on a Heterogeneous CPU-GPGPU Environment

Authors:

Dhrubajyoti Goswami,

Bahareh GoodarziAuthors Info & Claims

C3S2E '14: Proceedings of the 2014 International C* Conference on Computer Science & Software Engineering

Article No.: 7, Pages 1 - 7

https://doi.org/10.1145/2641483.2641530

Published: 03 August 2014 Publication History

Abstract

Heterogeneous clusters and multi-core environments are gradually surpassing the homogeneous systems due to their high performance and flexibility. Task scheduling in these systems is an extensively studied subject. However, in a heterogeneous architecture consisting of multi-core CPUs and many-core GPGPUs (General Purpose Graphics Processor Units), task mapping becomes much more complex due to differences in architectures and programming models among the processors. Consequently, designing a scheduler which facilities a balanced distribution of loads by taking full advantage of the processing power of a CPU-GPGPU system is nontrivial. In this paper we discuss a multi-round scheduling algorithm and a scheduling framework for farm-pattern based applications on such a system. This is an important step towards designing a full-scale pattern-based scheduler to automatically and efficiently map the parallel tasks to the heterogeneous processors. As a proof of concept, we have designed a scheduling framework for the task-farm pattern based applications. The framework provides the necessary "separation of concerns" and hides the underlying complex scheduling details from the programmer. The experimental results demonstrate that our dynamic scheduling algorithm achieves better to similar performances as compared to some of the well-known scheduling algorithms for CPU-GPGPU systems.

References

[1]

Aiken, A. and Nicolau, A. Optimal Loop Parallelization. In Proceedings of the ACM SIGPLAN conference on Programming Language design and Implementation. New York, USA, 308--317, 1988.

Digital Library

[2]

Augonnet, C., Thibault, S., Namyst, R. and Wacrenier, P. StarPU: a unified platform for task scheduling on heterogeneous multicore architectures. Concurrency and Computation: Practice and Experience, vol. 23, pp. 187--198, 2011.

Digital Library

[3]

Bharadwaj, V., Ghose, D. and Robertazzi, T. G. Divisible load theory: A new paradigm for load scheduling in distributed systems. Cluster Computing, vol. 6, pp. 7--17, 2003.

Digital Library

[4]

Braun, T.D., Siegel, H.J., Beck, N., Bölöni, L. L., Maheswaran, M., Reuther, A, I., Robertson, J. P., Theys, M. D., Yao, B. and Hensgen, D. A comparison of eleven static heuristics for mapping a class of independent tasks onto heterogeneous distributed computing systems. Journal of Parallel and Distributed Computing, vol. 61, pp. 810--837, 2001.

Digital Library

[5]

Danelutto, M. Adaptive task farm implementation strategies. In proceeding of the 12th Euromicro Conference on Parallel, Distributed and Network-based Processing. A coruna. Spain, pp. 416--423, 2004.

[6]

Dean, J. and Ghemawat, S. Mapreduce: Simplified data processing on large clusters. In Proceeding of the 6th Symposium on Operating Systems Design & Implementation, San Francisco, USA, pp. 137--150, 2004.

Digital Library

[7]

González-Vélez, H. and Cole, M. Adaptive structured parallelism for distributed heterogeneous architectures: A methodological approach with pipelines and farms. Concurrency and Computation: Practice and Experience, vol. 22, pp. 2073--2094, 2010.

Digital Library

[8]

Grewe, D. and O'Boyle, M. F. A static task partitioning approach for heterogeneous systems using OpenCL. In Proceedings of the 20th international conference on Compiler construction, Saarbrucken, Germany, pp. 286--305, 2011.

Digital Library

[9]

Ibarra, O. H. and Kim, C. E. Heuristic algorithms for scheduling independent tasks on nonidentical processors. Journal of the ACM (JACM), vol. 24, pp. 280--289, 1977.

Digital Library

[10]

Jiménez, V. J., Vilanova, L., Gelado, I., Gil, M., Fursin, G. and Navarro, N. Predictive runtime code scheduling for heterogeneous architectures. In proceeding of the 4th International Conference on High Performance Embedded Architectures and Compilers, Crete, Greece, pp. 19--33, 2009.

Digital Library

[11]

Khronos. OpenCL -- The Open Standard for Parallel Programming of Heterogeneous Systems. Retrieved July 6, 2014 from "http://www.khronos.org/opencl".

[12]

Li, T., Narayana, V. K. and El-Ghazawi, T. A static task scheduling framework for independent tasks accelerated using a shared graphics processing unit. In proceeding of the 17th IEEE International Conference on Parallel and Distributed Systems (ICPADS), Tainan, Taiwan, pp. 88--95, 2011.

Digital Library

[13]

Luk, C., Hong, S. and Kim, H. Qilin: Exploiting parallelism on heterogeneous multiprocessors with adaptive mapping. In Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture, New York, USA, pp. 45--55, 2009.

Digital Library

[14]

Maheswaran, M., Ali, S., Siegel, H. J., Hensgen, D. and Freund, R. F. Dynamic mapping of a class of independent tasks onto heterogeneous computing systems. Journal of Parallel and Distributed Computing, vol. 59, pp. 107--131, 1999.

Digital Library

[15]

Mattson, T.G., Sanders, B.A. and Massingill, B.L. Patterns for Parallel Programming. Addison-Wesley Longman, Boston, USA, 2004.

Digital Library

[16]

Nvidia. CUDA. Retrieved July 6, 2014 from "http://www.nvidia.com/cuda".

[17]

Plauger, P., Lee, M., Musser, D.R. and Stepanov, A.A. The C++ Standard Template Library. Prentice Hall, 2000.

Digital Library

[18]

Ravi, V.T., Ma, W., Chiu, D. and Agrawal, G. Compiler and runtime support for enabling generalized reduction computations on heterogeneous parallel configurations, in Proceedings of the 24th ACM International Conference on Supercomputing, Tsukuba, Japan, pp. 137--146, 2010.

Digital Library

[19]

Reinder, J. Intel Thread Building Blocks. Retrieved July 6, 2014 from "http://threadingbuildingblocks.org".

[20]

Scogland, T.R., Rountree, B., Feng, W. and de Supinski, B.R. Heterogeneous task scheduling for accelerated openmp. In proceeding of the 26th IEEE International Parallel & Distributed Processing Symposium (IPDPS), Alaska, USA, pp. 144--155, 2012.

Digital Library

[21]

Yang, Y., Van Der Raadt, K. and Casanova, H. Multiround algorithms for scheduling divisible loads. IEEE Transactions On Parallel and Distributed Systems, vol. 16, pp. 1092--1102, 2005.

Digital Library

[22]

Yu, D. and Robertazzi, T.G. Divisible load scheduling for grid computing. In Proceeding of the 15th International Conference on Parallel and Distributed Computing and Systems (IASTED), Marina del Ray, USA, pp. 1--6, 2003.

Index Terms

On the Dynamic Scheduling of Task Farm Patterns on a Heterogeneous CPU-GPGPU Environment
1. Computing methodologies
  1. Concurrent computing methodologies
    1. Concurrent programming languages
2. Software and its engineering
  1. Software notations and tools
    1. General programming languages
      1. Language types
        Concurrent programming languages

Recommendations

Adaptive heterogeneous scheduling for integrated GPUs
PACT '14: Proceedings of the 23rd international conference on Parallel architectures and compilation

Many processors today integrate a CPU and GPU on the same die, which allows them to share resources like physical memory and lowers the cost of CPU-GPU communication. As a consequence, programmers can effectively utilize both the CPU and GPU to execute ...
Exploiting task and data parallelism for advanced video coding on hybrid CPU + GPU platforms

Considering the prevalent usage of multimedia applications on commodity computers equipped with both CPU and GPU devices, the possibility of simultaneously exploiting all parallelization capabilities of such hybrid platforms for high performance video ...
CPU-assisted GPGPU on fused CPU-GPU architectures
HPCA '12: Proceedings of the 2012 IEEE 18th International Symposium on High-Performance Computer Architecture

This paper presents a novel approach to utilize the CPU resource to facilitate the execution of GPGPU programs on fused CPU-GPU architectures. In our model of fused architectures, the GPU and the CPU are integrated on the same die and share the on-chip ...

Comments

Information & Contributors

Information

Published In

cover image ACM Other conferences

C3S2E '14: Proceedings of the 2014 International C* Conference on Computer Science & Software Engineering

August 2014

201 pages

ISBN:9781450327121

DOI:10.1145/2641483

General Chair:
Bipin C. Desai
Concordia University, Canada

Copyright © 2014 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 03 August 2014

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Author Tags

Qualifiers

Research-article
Research
Refereed limited

Conference

C3S2E '14

C3S2E '14: International C* Conference on Computer Science & Software Engineering

August 3 - 5, 2014

QC, Montreal, Canada

Acceptance Rates

Overall Acceptance Rate 12 of 42 submissions, 29%

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

0
Total Citations
100
Total Downloads

Downloads (Last 12 months)1
Downloads (Last 6 weeks)0

Reflects downloads up to 05 Mar 2025

Other Metrics

View Author Metrics

Citations

View Options

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Figures

Tables

Media

View Table of Conten