Abstract
In multi-server systems, selecting to which server to dispatch an arriving job is a key factor influencing system response time. One of the most widely studied policies is Join-the-Shortest-Queue (JSQ), which is known to minimize mean response time in certain settings [7]. Many variants on JSQ have been proposed, including JSQ-d, under which a job is dispatched to the shortest queue among d servers selected uniformly at random [3, 5]; Join-Idle-Queue (JIQ), under which the dispatcher knows which servers are idle but not the queue lengths of non-idle servers [2]; and others.
The vast majority of work analyzing JSQ and related policies makes a key assumption: that the system is homogeneous, meaning that all servers have the same speed. This assumption is inaccurate in most modern computer systems. Server heterogeneity can arise, e.g., when a server farm consists of several generations of hardware, or when many virtual machines contend for resources on the same physical machine. Unfortunately, the wealth of results about how best to dispatch in homogeneous systems does not translate well to heterogeneous systems. Policies like JSQ-d and JIQ, which can achieve near-optimal performance in homogeneous systems, can lead to unacceptably high response times and even instability in heterogeneous systems [4, 8].
- S. Banawan and N. Zeidat. A comparative study of load sharing in heterogeneous multicomputer systems. In Proceedings. 25th Annual Simulation Symposium, pages 22--31. IEEE, 1992.Google ScholarDigital Library
- Y. Lu, Q. Xie, G. Kliot, A. Geller, J. Larus, and A. Greenberg. Join-idle-queue: A novel load balancing algorithm for dynamically scalable web services. Performance Evaluation, 68(11):1056--1071, 2011.Google ScholarDigital Library
- M. Mitzenmacher. The power of two choices in randomized load balancing. IEEE Transactions on Parallel and Distributed Systems, 12(10):1094--1104, 2001.Google ScholarDigital Library
- A. Stolyar. Pull-based load distribution in large-scale heterogeneous service systems. Queueing Systems, 80(4):341--361, 2015.Google ScholarDigital Library
- N. Vvedenskaya, R. Dobrushin, and F. Karpelevich. Queueing system with selection of the shortest of two queues: An asymptotic approach. Problemy Peredachi Informatsii, 32(1):20--34, 1996.Google Scholar
- W. Whitt. Deciding which queue to join: Some counterexamples. Operations research, 34(1):55--62, 1986.Google ScholarDigital Library
- W. Winston. Optimality of the shortest line discipline. Journal of Applied Probability, 14(1):181--189, 1977.Google ScholarCross Ref
- X. Zhou, F. Wu, J. Tan, Y. Sun, and N. Shro". Designing low-complexity heavy-traffic delay-optimal load balancing schemes: Theory to algorithms. Proceedings of the ACM on Measurement and Analysis of Computing Systems, 1(2):39, 2017.Google ScholarDigital Library
Index Terms
- Smart Dispatching in Heterogeneous Systems
Recommendations
Minimizing slowdown in heterogeneous size-aware dispatching systems
Performance evaluation reviewWe consider a system of parallel queues where tasks are assigned (dispatched) to one of the available servers upon arrival. The dispatching decision is based on the full state information, i.e., on the sizes of the new and existing jobs. We are ...
Minimizing slowdown in heterogeneous size-aware dispatching systems
SIGMETRICS '12: Proceedings of the 12th ACM SIGMETRICS/PERFORMANCE joint international conference on Measurement and Modeling of Computer SystemsWe consider a system of parallel queues where tasks are assigned (dispatched) to one of the available servers upon arrival. The dispatching decision is based on the full state information, i.e., on the sizes of the new and existing jobs. We are ...
Optimal state-free, size-aware dispatching for heterogeneous M/G/-type systems
Performance 2005We consider a cluster of heterogeneous servers, modeled as M/G/1 first-come first-serve queues with different processing speeds. A dispatcher that assigns jobs to the servers takes as input only the size of the arriving job and the overall job-size ...
Comments