skip to main content
10.1145/3078597.3078613acmconferencesArticle/Chapter ViewAbstractPublication PageshpdcConference Proceedingsconference-collections
research-article

Parallel Stream Processing Against Workload Skewness and Variance

Authors Info & Claims
Published:26 June 2017Publication History

ABSTRACT

Key-based workload partitioning is a common strategy used in parallel stream processing engines, enabling effective key-value tuple distribution over worker threads in a logical operator. It is likely to generate poor balancing performance when workload variance occurs on the incoming data stream. This paper presents a new key-based workload partitioning framework, with practical algorithms to support dynamic workload assignment for stateful operators. The framework combines hash-based and explicit key-based routing strategies for workload distribution, which specifies the destination worker threads for a handful of keys and assigns the other keys with the hash function. When short-term distribution fluctuations occur to the incoming data stream, the system adaptively updates the routing table containing the chosen keys, in order to rebalance the workload with minimal migration overhead within the stateful operator. We formulate the rebalance operation as an optimization problem, with multiple objectives on minimizing state migration costs, controlling the size of the routing table and breaking workload imbalance among worker threads. Despite of the NP-hardness nature behind the optimization formulation, we carefully investigate and justify the heuristics behind key (re)routing and state migration, to facilitate fast response to workload variance with ignorable cost to the normal processing in the distributed system. Empirical studies on synthetic data and real-world stream applications validate the usefulness of our proposals.

References

  1. Apache Storm. In http://storm.apache.org/.Google ScholarGoogle Scholar
  2. D. Abadi, Y. Ahmad, M. Balazinska, and et al. 2005. The Design of the Borealis Stream Processing Engine. In CIDR. pages 277--289.Google ScholarGoogle Scholar
  3. Y. Ahmad and U. Cetintemel. 2004. Network-aware Query Processing for Stream-based Applications. In VLDB. pages 456--467. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. L. Aniello, R. Baldoni, and L. Querzoni. 2013. Adaptive Online Scheduling in Storm. In DEBS. pages 207--218. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. T. P. Chen, H. Haussecker, A. Bovyrin, and et al. 2005. Computer Vision Workload Analysis: Case Study of Video Surveillance Systems. Intel Technology Journal 9, 2 (2005).Google ScholarGoogle Scholar
  6. D. Dewitt and J. Gray. 1992. Parallel Database Systems: The Future of High Performance Database Systems. Commun. ACM 35, 6 (1992), pages 85--98. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Jianbing Ding, Tom Z. J. Fu, Richard T. B. Ma, Marianne Winslett, Yin Yang, Zhenjie Zhang, and Hongyang Chao. 2015. Optimal Operator State Migration for Elastic Data Stream Processing. CoRR abs/1501.03619 (2015).Google ScholarGoogle Scholar
  8. M. Elseidy, A. Elguindy, A. Vitorovic, and C. Koch. 2014. Scalable and Adaptive Online Joins. VLDB 7, 6 (2014), pages 441--452. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Tom Z. J. Fu, Jianbing Ding, Richard T. B. Ma, Marianne Winslett, Yin Yang, and Zhenjie Zhang. 2015a. DRS: dynamic resource scheduling for real-time analytics over fast streams. In Proceedings of the IEEE 35th International Conference on Distributed Computing Systems (ICDCS). pages 411--420.Google ScholarGoogle ScholarCross RefCross Ref
  10. Tom Z. J. Fu, Jianbing Ding, Richard T. B. Ma, Marianne Winslett, Yin Yang, Zhenjie Zhang, Yong Pei, and Bingbing Ni. 2015b. LiveTraj: Real-Time Trajectory Tracking over Live Video Streams. In Proc. of ACM Multimedia, Demo. pages 777--780. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. B. Gedik. 2014. Partitioning Functions for Stateful Data Parallelism in Stream Processing. VLDBJ 23, 4 (2014), pages 517--539. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. B. Gedik, S. Schneider, M. Hirzel, and K. Wu. 2014. Elastic Scaling for Data Stream Processing. IEEE Trans. Parallel Distrib. Syst. 25, 6 (2014), pages 1447--1463. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. David Karger, Eric Lehman, Tom Leighton, Rina Panigrahy, Matthew Levine, and Daniel Lewin. 1997. Consistent hashing and random trees: Distributed caching protocols for relieving hot spots on the World Wide Web. In STOC. pages 654--663. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Narendra Karmarkar and Richard M Karp. 1982. An efficient approximation scheme for the one-dimensional bin-packing problem. In Foundations of Computer Science. pages 312--320. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. R. Khandekar, K. Hildrum, S. Parekh, D. Rajan, J. Wolf, K. Wu, H. Andrade, and B. Gedik. 2009. COLA: Optimizing Stream Processing Applications via Graph Partitioning. In Middleware. pages 308--327. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. S. Kulkarni, N. Bhagat, M. Fu, and et al. 2015. Twitter Heron: Stream Processing at Scale. In SIGMOD. pages 239--250. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Mahendra Kutare, Greg Eisenhauer, Chengwei Wang, Karsten Schwan, Vanish Talwar, and Matthew. Wolf. 2010. Monalytics: Online Monitoring and Analytics for Managing Large Scale Data Centers. In ICAC. pages 141--150. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Y. Kwon, M. Balazinska, B. Howe, and J. Rolia. 2012. Skewtune: Mitigating Skew in Mapreduce Applications. In SIGMOD. pages 25--36. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Q. Lin, B. C. Ooi, Z. Wang, and C. Yu. 2015. Scalable Distributed Stream Join Processing. In SIGMOD. pages 811--825. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. M. Nasir, G. Morales, D. Garciasoriano, N. Kourtellis, and M. Serafini. 2015. The Power of Both Choices: Practical Load Balancing for Distributed Stream Processing Engines. In ICDE. pages 137--148.Google ScholarGoogle Scholar
  21. Muhammad Anis Uddin Nasir, Gianmarco De Francisci Morales, Nicolas Kourtellis, and Marco Serafini. 2016. When two choices are not enough: Balancing at scale in distributed stream processing. In ICDE. pages 589--600.Google ScholarGoogle Scholar
  22. M. Shah, J. Hellerstein, S. Chandrasekaran, and M. Franklin. 2003. Flux: An Adaptive Partitioning Operator for Continuous Query Systems. In ICDE. pages 25--36.Google ScholarGoogle Scholar
  23. A. Toshniwal, S. Taneja, A. Shukla, K. Ramasamy, J.M. Patel, S. Kulkarni, J. Jackson, K. Gade, M. Fu, J. Donham, and et al. 2014. Storm@ twitter. In SIGMOD. pages 147--156. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. B. Ufler, N. Augsten, A. Reiser, and A. Kemper. 2012. Load Balancing in MapReduce Based on Scalable Cardinality Estimates. In ICDE. pages 522--533. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. C. Walton, A. Dale, and R. Jenevein. 1991. A Taxonomy and Performance Model of Data Skew Effects in Parallel Joins. In VLDB. pages 537--548. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. J. Wolf, N. Bansal, K. Hildrum, S. Parekh, D. Rajan, R. Wagle, K. Wu, and L. Fleischer. 2008. SODA: An Optimizing Scheduler for Large-scale Stream-based Distributed Computer Systems. In Middleware. pages 306--325. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. Y. Wu and K. Tan. 2015. ChronoStream: Elastic Stateful Stream Computation in the Cloud. In ICDE. pages 723--734.Google ScholarGoogle Scholar
  28. Y. Xing, J. Hwang, U. Cetintemel, and S. Zdonik. 2006. Providing Resiliency to Load Variations in Distributed Stream Processing. In VLDB. pages 775--786. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. Y. Xing, S. Zdonik, and J. Hwang. 2005. Dynamic Load Distribution in the Borealis Stream Processor. In ICDE. pages 791--802. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. Y. Xu, P. Kostamaa, X. Zhou, and L. Chen. 2008. Handling Data Skew in Parallel Joins in Shared-nothing Systems. In SIGMOD. pages 1043--1052. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. M. Zaharia, T. Das, H. Li, T. Hunter, S. Shenker, and I. Stoica. 2013. Discretized Streams: Fault-tolerant Streaming Computation at Scale. In SOSP. pages 423--438. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. Y. Zhou, B. Ooi, and K. Tan. 2005. Dynamic Load Management for Distributed Continuous Query Systems. In ICDE. pages 322--323. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Parallel Stream Processing Against Workload Skewness and Variance

        Recommendations

        Comments

        Login options

        Check if you have access through your login credentials or your institution to get full access on this article.

        Sign in
        • Published in

          cover image ACM Conferences
          HPDC '17: Proceedings of the 26th International Symposium on High-Performance Parallel and Distributed Computing
          June 2017
          254 pages
          ISBN:9781450346993
          DOI:10.1145/3078597

          Copyright © 2017 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 26 June 2017

          Permissions

          Request permissions about this article.

          Request Permissions

          Check for updates

          Qualifiers

          • research-article

          Acceptance Rates

          HPDC '17 Paper Acceptance Rate19of100submissions,19%Overall Acceptance Rate166of966submissions,17%

          Upcoming Conference

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader