skip to main content
research-article

Parallel Cost Analysis

Published:20 November 2018Publication History
Skip Abstract Section

Abstract

This article presents parallel cost analysis, a static cost analysis targeting to over-approximate the cost of parallel execution in distributed systems. In contrast to the standard notion of serial cost, parallel cost captures the cost of synchronized tasks executing in parallel by exploiting the true concurrency available in the execution model of distributed processing. True concurrency is challenging for static cost analysis, because the parallelism between tasks needs to be soundly inferred, and the waiting and idle processor times at the different locations need to be accounted for. Parallel cost analysis works in three phases: (1) it performs a block-level analysis to estimate the serial costs of the blocks between synchronization points in the program; (2) it then constructs a distributed flow graph (DFG) to capture the parallelism, the waiting, and idle times at the locations of the distributed system; and (3) the parallel cost can finally be obtained as the path of maximal cost in the DFG. We prove the correctness of the proposed parallel cost analysis, and provide a prototype implementation to perform an experimental evaluation of the accuracy and feasibility of the proposed analysis.

References

  1. WCET tools. 2012. Retrieved on 1 May, 2018 from http://www.rapitasystems.com/WCET-Tools.Google ScholarGoogle Scholar
  2. S. Agarwal, R. Barik, V. Sarkar, and R. K. Shyamasundar. 2007. May-happen-in-parallel analysis of X10 programs. In Proceedings of the 12th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP’07). ACM, New York, 183--193. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. G. Agha. 1986. Actors: A Model of Concurrent Computation in Distributed Systems. MIT Press, Cambridge, MA. Google ScholarGoogle ScholarCross RefCross Ref
  4. A. V. Aho, M. S. Lam, R. Sethi, and J. D. Ullman. 2006. Compilers: Principles, Techniques, and Tools (2nd ed.). Addison-Wesley Longman Publishing Co., Inc., Boston. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. E. Albert, P. Arenas, J. Correas, S. Genaim, M. Gómez-Zamalloa, G. Puebla, and G. Román-Díez. 2015. Object-sensitive cost analysis for concurrent objects. Software Testing, Verification and Reliability 25, 3 (2015), 218--271. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. E. Albert, P. Arenas, A. Flores-Montoya, S. Genaim, M. Gómez-Zamalloa, E. Martin-Martin, G. Puebla, and G. Román-Díez. 2014. SACO: Static analyzer for concurrent objects. In Proceedings of the 20th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS’14), Lecture Notes in Computer Science, Vol. 8413, Springer, 562--567.Google ScholarGoogle ScholarCross RefCross Ref
  7. E. Albert, P. Arenas, S. Genaim, and G. Puebla. 2011. Closed-form upper bounds in static cost analysis. Journal of Automated Reasoning 46, 2 (2011), 161--203. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. E. Albert, P. Arenas, S. Genaim, G. Puebla, and D. Zanardini. 2012. Cost analysis of object-oriented bytecode programs. Theoretical Computer Science (Special Issue on Quantitative Aspects of Programming Languages) 413, 1 (2012), 142--159. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. E. Albert, P. Arenas, M. Gómez-Zamalloa, and P. Y. Wong. 2013. aPET: A test case generation tool for concurrent objects. In Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering, (ESEC/FSE’13), Saint Petersburg, Russian Federation, August 18-26, 2013, B. Meyer, L. Baresi, and M. Mezini (Eds.). ACM, 595--598. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. E. Albert, J. Correas, E. B. Johnsen, and G. Román-Díez. 2015. Parallel cost analysis of distributed systems. In Proceedings of 22nd International Symposium Static Analysis (SAS’15). Lecture Notes in Computer Science, Vol. 9291, Springer, 275--292.Google ScholarGoogle ScholarCross RefCross Ref
  11. E. Albert, J. Correas, and G. Román-Díez. 2014. Peak cost analysis of distributed systems. In Proceedings 21st International Symposium of Static Analysis (SAS’14). Lecture Notes in Computer Science, Vol. 8723, 18--33.Google ScholarGoogle ScholarCross RefCross Ref
  12. E. Albert, J. Correas, and G. Román-Díez. 2015. Non-cumulative resource analysis. In Proceedings of 21st International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS’15). Lecture Notes in Computer Science, Vol. 9035, Springer, 85--100. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. E. Albert, A. Flores-Montoya, and S. Genaim. 2012. Analysis of may-happen-in-parallel in concurrent objects. In Proceedings of Formal Techniques for Distributed Systems Joint 14th IFIP WG 6.1 International Conference, FMOODS 2012 and 32nd IFIP WG 6.1 International Conference (FORTE’12). Lecture Notes in Computer Science, Vol. 7273, Springer, 35--51. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. E. Albert, A. Flores-Montoya, S. Genaim, and E. Martin-Martin. 2016. May-happen-in-parallel analysis for actor-based concurrency. ACM Transactions on Computational Logic, 17, 2 (Mar. 2016), 11:1--11:39. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. J. Armstrong. 2007. Programming Erlang: Software for a Concurrent World. Pragmatic Bookshelf. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. H. G. Baker Jr. and C. Hewitt. 1977. The incremental garbage collection of processes. In Proceedings of the Symposium on Artificial Intelligence and Programming Languages. ACM Press, New York, 55--59. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. H. E. Bal, J. G. Steiner, and A. S. Tanenbaum. 1989. Programming languages for distributed computing systems. ACM Computing Surveys 21, 3 (1989), 261--322. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. R. Barik. 2005. Efficient computation of may-happen-in-parallel information for concurrent java programs. In Proceedings of the Languages and Compilers for Parallel Computing, 18th International Workshop (LCPC’05), Hawthorne, NY, October 20-22, 2005, Revised Selected Papers, E. Ayguadé, G. Baumgartner, J. Ramanujam, and P. Sadayappan (Eds.). Lecture Notes in Computer Science, Vol. 4339. Springer, 152--169. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. S. Brandauer, E. Castegren, D. Clarke, K. Fernandez-Reyes, E. B. Johnsen, K. I. Pun, S. L. Tapia Tarifa, T. Wrigstad, and A. M. Yang. 2015. Parallel objects for multicores: A glimpse at the parallel language encore. In Formal Methods for Multicore Programming, Lecture Notes in Computer Science, Vol. 9104, Springer, 1--56.Google ScholarGoogle Scholar
  20. D. Caromel and L. Henrio. 2005. A Theory of Distributed Object. Springer. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. D. Caromel and Y. Roudier. 1996. Reactive programming in Eiffel//. In Proceedings of the Conference on Object-Based Parallel and Distributed Computation (OBPDC’95), J. Briot, J. Geib, and A. Yonezawa (Eds.). Lecture Notes in Computer Science, Vol. 1107. Springer, 125--147. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. G. Cugola and C. Ghezzi. 1997. CJava: Introducing concurrent objects in Java. In Proceedings of the 4th International Conference on Object Oriented Information Systems (OOIS’97), M. E. Orlowska and R. Zicari (Eds.). Springer, 504--514.Google ScholarGoogle Scholar
  23. C. Curtsinger and E. D. Berger. 2015. Coz: Finding code that counts with causal profiling. In Proceedings of the 25th Symposium on Operating Systems Principles (SOSP’15). ACM, 184--197. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. F. S. de Boer, V. Serbanescu, R. Hähnle, L. Henrio, J. Rochas, C. C. Din, E. B. Johnsen, M. Sirjani, E. Khamespanah, K. Fernandez-Reyes, and A. M. Yang. 2017. A survey of active object languages. ACM Computing Surveys 50, 5 (2017), 76:1--76:39. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. P. Di, Y. Sui, D. Ye, and J. Xue. 2015. Region-based may-happen-in-parallel analysis for C programs. In Proceedings of the 44th International Conference on Parallel Processing (ICPP’15). IEEE Computer Society, 889--898. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. A. Farzan, Z. Kincaid, and A. Podelski. 2013. Inductive data flow graphs. In Proceedings of POPL 2013: 40th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. ACM, 129--142. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. S. L. Graham, P. B. Kessler, and M. K. McKusick. 1982. gprof: A call graph execution profiler (with retrospective). In 20 Years of the ACM SIGPLAN Conference on Programming Language Design and Implementation 1979--1999, A Selection, K. S. McKinley (Ed.). ACM, 49--57.Google ScholarGoogle Scholar
  28. S. Gulwani, K. K. Mehra, and T. M. Chilimbi. 2009. Speed: Precise and efficient static estimation of program computational complexity. In Proceedings of the 36th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL’09). ACM, 127--139. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. P. Haller and M. Odersky. 2009. Scala actors: Unifying thread-based and event-based programming. Theoretical Computer Science 410, 2--3 (2009), 202--220. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. R. H. Halstead, Jr. 1985. MULTILISP: A language for concurrent symbolic computation. ACM Transactions on Programming Languages and Systems (TOPLAS) 7, 4 (Oct. 1985), 501--538. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. J. Hoffmann, K. Aehlig, and M. Hofmann. 2011. Multivariate amortized resource analysis. In Proceedings of the 38th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL’11). ACM, 357--370. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. J. Hoffmann and Z. Shao. 2015. Automatic static cost analysis for parallel programs. In Programming Languages and Systems —24th European Symposium on Programming, ESOP 2015, Lecture Notes in Computer Science, Vol. 9032. Springer, 132--157. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. J. K. Hollingsworth and B. P. Miller. 1994. Slack: A New Performance Metric for Parallel Programs. Technical report, University of Maryland and University of Wisconsin-Madison, 1994.Google ScholarGoogle Scholar
  34. E. B. Johnsen, R. Hähnle, J. Schäfer, R. Schlatte, and M. Steffen. 2011. ABS: A core language for abstract behavioral specification. In Proceedings of 9th International Symposium on Formal Methods for Components and Objects (FMCO’10). Lecture Notes in Computer Science, Vol. 6957. Springer, 142--164. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. E. B. Johnsen and O. Owe. 2007. An asynchronous communication model for distributed concurrent objects. Software and System Modeling 6, 1 (Mar. 2007), 35--58.Google ScholarGoogle ScholarCross RefCross Ref
  36. E. B. Johnsen, O. Owe, and M. Arnestad. 2003. Combining active and reactive behavior in concurrent objects. In Proceedings of the Norwegian Informatics Conference (NIK’03), D. Langmyhr (Ed.). Tapir Academic Publisher, 193--204.Google ScholarGoogle Scholar
  37. J. E. Kelley, Jr. 1961. Critical-path planning and scheduling: Mathematical basis. Operations Research 9 (1961), 296--320. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. J. K. Lee, J. Palsberg, and R. Majumdar. 2010. Complexity results for may-happen-in-parallel analysis. Retrieved on 1 May, 2018 from web.cs.ucla.edu/∼palsberg/draft/lpm10.pdf.Google ScholarGoogle Scholar
  39. B. Liskov and L. Shrira. 1988. Promises: Linguistic support for efficient asynchronous procedure calls in distributed systems. In Proceedings of the Conference on Programming Language Design and Implementation (PLDI’88), R. L. Wexelblat (Ed.). ACM Press, New York, 260--267. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. M. McCool, J. Reinders, and A. Robison. 2012. Structured Parallel Programming: Patterns for Efficient Computation. Morgan Kaufmann Publishers Inc. Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. A. Milanova, A. Rountev, and B. G. Ryder. 2005. Parameterized object sensitivity for points-to analysis for Java. ACM Transactions on Software Engineering and Methodology 14 (2005), 1--41. Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. B. P. Miller, M. Clark, J. K. Hollingsworth, S. Kierstead, S. Lim, and T. Torzewski. 1990. IPS-2: The second generation of a parallel program measurement system. IEEE Transactions on Parallel and Distributed Systems 1, 2 (1990), 206--217. Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. B. P. Miller and C. Yang. 1987. IPS: An interactive and automatic performance measurement tool for parallel and distributed programs. In Proceedings of the 7th International Conference on Distributed Computing Systems. IEEE Computer Society, 482--489.Google ScholarGoogle Scholar
  44. M. Shapiro and S. Horwitz. 1997. Fast and accurate flow-insensitive points-to analysis. In Proceedings of the 24th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL’97), Paris, France, January 1997. ACM, 1--14. Google ScholarGoogle ScholarDigital LibraryDigital Library
  45. Y. Smaragdakis, M. Bravenboer, and O. Lhoták. 2011. Pick your contexts well: Understanding object-sensitivity. In Proceedings of the 38th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL’11). ACM, 17--30. Google ScholarGoogle ScholarDigital LibraryDigital Library
  46. M. Sridharan and R. Bodík. 2006. Refinement-based context-sensitive points-to analysis for Java. In Proceedings of the Conference on Programming Language Design and Implementation (PLDI’06). 387--400. Google ScholarGoogle ScholarDigital LibraryDigital Library
  47. H. Sutter and J. R. Larus. 2005. Software and the concurrency revolution. ACM Queue 3, 7 (2005), 54--62. Google ScholarGoogle ScholarDigital LibraryDigital Library
  48. R. E. Tarjan. 1981. Fast algorithms for solving path problems. Journal of the ACM (JACM) 28, 3 (July 1981), 594--614. Google ScholarGoogle ScholarDigital LibraryDigital Library
  49. B. Wegbreit. 1975. Mechanical program analysis. Communications ACM 18, 9 (1975), 528--539. Google ScholarGoogle ScholarDigital LibraryDigital Library
  50. J. Whaley and M. S. Lam. 2004. Cloning-based context-sensitive pointer alias analysis using binary decision diagrams. In Proceedings of the ACM SIGPLAN 2004 Conference on Programming Language Design and Implementation (PLDI’04). ACM, New York, NY, 131--144. Google ScholarGoogle ScholarDigital LibraryDigital Library
  51. D. Wyatt. 2013. Akka Concurrency. Artima. Google ScholarGoogle ScholarDigital LibraryDigital Library
  52. J. Yi, C. Sadowski, S. N. Freund, and C. Flanagan. 2012. Cooperative concurrency for a multicore world (extended abstract). In Proceedings of the International Conference on Runtime Verification (RV’11), Lecture Notes in Computer Science, Vol. 7186. Springer, 342--344. Google ScholarGoogle ScholarDigital LibraryDigital Library
  53. A. Yoga and S. Nagarakatte. 2017. A fast causal profiler for task parallel programs. In Proceedings of the 11th Joint Meeting on Foundations of Software Engineering (ESEC/FSE’17). ACM, 15--26. Google ScholarGoogle ScholarDigital LibraryDigital Library
  54. Y. Yokote and M. Tokoro. 1987. Concurrent programming in concurrentsmalltalk. In Object-Oriented Concurrent Programming, A. Yonezawa and M. Tokoro (Eds.). MIT Press, 129--158. Google ScholarGoogle ScholarDigital LibraryDigital Library
  55. A. Yonezawa, J. Briot, and E. Shibayama. 1986. Object-oriented concurrent programming in ABCL/1. In ACM International Conference on Object-Oriented Programming, Systems, Languages and Applications (OOPSLA’86), N. K. Meyrowitz (Ed.). ACM Press, Nov. 1986, 258--268. Google ScholarGoogle ScholarDigital LibraryDigital Library
  56. F. Zuleger, S. Gulwani, M. Sinn, and H. Veith. 2011. Bound analysis of imperative programs with the size-change abstraction. In Proceedings of the 18th International Conference on Static Analysis (SAS’11), Lecture Notes in Computer Science, Vol. 6887, Springer, 280--297. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Parallel Cost Analysis

        Recommendations

        Reviews

        Srini Ramaswamy

        In this paper, the authors present "static cost analysis for distributed systems that exploits the parallelism among distributed locations to infer a more precise estimation of the parallel cost." Such a parallel cost analysis can be of use when applications can exploit the parallelism available at distributed locations. Here, the authors present a modification of a cooperative concurrency model, a model that is being increasingly adopted for distributed and multi-core systems as it allows for some restrictions, for example, limiting interleaving analysis to synchronization points explicit in the program. This contrasts with the more widely used multi-threaded concurrency models that consider "an exponential explosion in the number of traces [to study]." To enable cooperative concurrency analysis, the authors compute "with different levels of precision: (a) the naive approach, using the set of nodes contained in the path expressions; (b) unfolding all disjunctions of the path expression; and (c) filtering infeasible paths." The paper include a small practical example: "a program-consisting of ~1,400 lines of code-which models a supermarket cash desk line." Future work will incorporate analysis information about scheduling policies, which could be different at different locations. Experimental results indicate that the parallel cost analysis presented in this paper is feasible and accurate, and could potentially provide "significant [cost] improvements with respect to the serial cost analysis," reaching up to ~50 percent when combined with the filtering of infeasible paths. In another scenario, they "show that the naive approach and the unfolding disjunctions approach [perform at] 75 percent and 73.4 percent of the sequential upper bound, respectively," while the filtering paths approach takes much longer. Thus, the application of this method requires "tradeoff[s] between the precision that can be achieved and the computational time to obtain the upper bound." To summarize, distributed and multi-core computing are becoming foundational to modern-day computing, with software systems expected to proactively support massively parallel execution. While traditional analysis of parallelism has focused on exploiting parallelism among independent tasks, in this work the authors explore "a model of computation that separates the asynchronous spawning of new tasks to different locations for task processing from the synchronization between these tasks." While this concept is intuitive, it is nice to see it applied in a systematic way.

        Access critical reviews of Computing literature here

        Become a reviewer for Computing Reviews.

        Comments

        Login options

        Check if you have access through your login credentials or your institution to get full access on this article.

        Sign in

        Full Access

        • Published in

          cover image ACM Transactions on Computational Logic
          ACM Transactions on Computational Logic  Volume 19, Issue 4
          October 2018
          277 pages
          ISSN:1529-3785
          EISSN:1557-945X
          DOI:10.1145/3293495
          • Editor:
          • Orna Kupferman
          Issue’s Table of Contents

          Copyright © 2018 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 20 November 2018
          • Accepted: 1 August 2018
          • Revised: 1 June 2018
          • Received: 1 October 2017
          Published in tocl Volume 19, Issue 4

          Permissions

          Request permissions about this article.

          Request Permissions

          Check for updates

          Qualifiers

          • research-article
          • Research
          • Refereed

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader

        HTML Format

        View this article in HTML Format .

        View HTML Format