skip to main content
10.1145/2384616.2384625acmconferencesArticle/Chapter ViewAbstractPublication PagessplashConference Proceedingsconference-collections
research-article

GPUVerify: a verifier for GPU kernels

Published:19 October 2012Publication History

ABSTRACT

We present a technique for verifying race- and divergence-freedom of GPU kernels that are written in mainstream kernel programming languages such as OpenCL and CUDA. Our approach is founded on a novel formal operational semantics for GPU programming termed synchronous, delayed visibility (SDV) semantics. The SDV semantics provides a precise definition of barrier divergence in GPU kernels and allows kernel verification to be reduced to analysis of a sequential program, thereby completely avoiding the need to reason about thread interleavings, and allowing existing modular techniques for program verification to be leveraged. We describe an efficient encoding for data race detection and propose a method for automatically inferring loop invariants required for verification. We have implemented these techniques as a practical verification tool, GPUVerify, which can be applied directly to OpenCL and CUDA source code. We evaluate GPUVerify with respect to a set of 163 kernels drawn from public and commercial sources. Our evaluation demonstrates that GPUVerify is capable of efficient, automatic verification of a large number of real-world kernels.

References

  1. AMD. AMD Accelerated Parallel Processing (APP) SDK. developer.amd.com/sdks/amdappsdk/pages/default.aspxGoogle ScholarGoogle Scholar
  2. M. Barnett, B. Chang, R. DeLine, B. Jacobs, and K. R. M. Leino. Boogie: A modular reusable verifier for object-oriented programs. In FMCO, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. D. Beyer, T. A. Henzinger, R. Majumdar, and A. Rybalchenko. Path invariants. In PLDI, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. M. Boyer, K. Skadron, and W. Weimer. Automated dynamic analysis of CUDA programs. In STMCS, 2008.Google ScholarGoogle Scholar
  5. C. Cadar, D. Dunbar, and D. Engler. KLEE: Unassisted and automatic generation of high-coverage tests for complex systems programs. In OSDI, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. P. Collingbourne, C. Cadar, and P. H. J. Kelly. Symbolic testing of OpenCL code. In HVC, 2011.Google ScholarGoogle Scholar
  7. P. Cousot and R. Cousot. Abstract interpretation: a unified lattice model for static analysis of programs by construction or approximation of fixpoints. In POPL, 1977. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. L. de Moura and N. Bjørner. Z3: An efficient SMT solver. In TACAS, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. A. Donaldson, D. Kroening and P. Rummer. Automatic analysis of scratch-pad memory code for heterogeneous multicore processors. In TACAS, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. A. Donaldson, D. Kroening and P. Rummer. Automatic analysis of DMA races using model checking and k-induction. Formal Methods in System Design 39(1):83--113. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. M. D. Ernst, J. H. Perkins, P. J. Guo, S. McCamant, C. Pacheco, M. S. Tschantz, and C. Xiao The Daikon system for dynamic detection of likely invariants. Sci. Comput. Program. 69(1-3):35--45. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. C. Flanagan and K. R. M. Leino. Houdini, an annotation assistant for ESC/Java. In FME, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. S. Graf and H. Sa idi. Construction of abstract state graphs with PVS. In CAV, 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. A. Habermaier and A. Knapp. On the correctness of the SIMT execution model of GPUs. In ESOP, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. T. Kahsai, Y. Ge, and C. Tinelli. Instantiation-based invariant discovery. In NFM, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. R. K. Karmani, P. Madhusudan, and B. Moore. Thread contracts for safe parallelism. In PPOPP, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Khronos OpenCL Working Group. The OpenCL specifica-tion, version 1.1, 2011. Document Revision: 44.Google ScholarGoogle Scholar
  18. S. K. Lahiri and S. Qadeer. Complexity and algorithms for monomial and clausal predicate. In CADE, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. K. R. M. Leino, G. Nelson, and J. B. Saxe. ESC/Java user's manual. Technical Note 2000-002, Compaq Systems Research Center, October 2000.Google ScholarGoogle Scholar
  20. A. Leung, M. Gupta, Y. Agarwal, R. Gupta, R. Jhala, and S. Lerner. Verifying GPU kernels by test amplification. In PLDI, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. G. Li and G. Gopalakrishnan. Scalable SMT-based verifica-tion of GPU kernel functions. In FSE, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. G. Li, P. Li, G. Gopalakrishnan. Parametric flows: automated behaviour equivalencing for symbolic analysis of races in CUDA programs. In SC, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. G. Li, P. Li, G. Sawaya, G. Gopalakrishnan, I. Ghosh, and S. P. Rajan. GKLEE: concolic verification and test generation for GPUs. In PPOPP, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. llvm.org. clang: a C language family frontend for LLVM. clang.llvm.orgGoogle ScholarGoogle Scholar
  25. A. Lokhmotov. Mobile and embedded computing on Mali GPUs. In UK GPU Computing Conference, 2011.Google ScholarGoogle Scholar
  26. K. L. McMillan. Lazy abstraction with interpolants. In CAV, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. Microsoft Corporation. C++ AMP sample projects for download (MSDN blog). blogs.msdn.com/b/nativeconcurrency/archive/2012/01/30/c-amp-sample-projects-for-download.aspxGoogle ScholarGoogle Scholar
  28. D. Moth and Y. Levanoni. Microsoft's C++ AMP unveiled. www.drdobbs.com/windows/231600761Google ScholarGoogle Scholar
  29. NVIDIA. CUDA Toolkit Release Archive. developer.nvidia.com/cuda-toolkit-archiveGoogle ScholarGoogle Scholar
  30. NVIDIA. CUDA C programming guide, v4.0, 2011.Google ScholarGoogle Scholar
  31. NVIDIA. PTX: Parallel thread execution ISA, v2.3, 2011.Google ScholarGoogle Scholar
  32. L. Nyland. Personal communication, April 2012.Google ScholarGoogle Scholar
  33. L. Nyland, M. Harris, and J. Prins. Fast N-body simulation with CUDA. GPU Gems 3, Chapter 31. Addison-Wesley, 2007.Google ScholarGoogle Scholar
  34. Rightware Oy. Basemark CL. www.rightware.com/en/Benchmarking+Software/Basemark%99+CLGoogle ScholarGoogle Scholar
  35. S. Srivastava and S. Gulwani. Program verification using templates over predicate abstraction. In PLDI, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. B. Steensgaard. Points-to analysis in almost linear time. In POPL, 1996. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. S. Stone, J. Haldar, S. Tsao, W. Hwu, B. Sutton, and Z. Liang. Accelerating advanced MRI reconstructions on GPUs. J. Parallel Distrib. Comput., 68(10):1307--1318, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. S. Tripakis, C. Stergiou, and R. Lublinerman. Checking non-interference in SPMD programs. In HotPar, 2010.Google ScholarGoogle Scholar

Index Terms

  1. GPUVerify: a verifier for GPU kernels

        Recommendations

        Comments

        Login options

        Check if you have access through your login credentials or your institution to get full access on this article.

        Sign in
        • Published in

          cover image ACM Conferences
          OOPSLA '12: Proceedings of the ACM international conference on Object oriented programming systems languages and applications
          October 2012
          1052 pages
          ISBN:9781450315616
          DOI:10.1145/2384616
          • cover image ACM SIGPLAN Notices
            ACM SIGPLAN Notices  Volume 47, Issue 10
            OOPSLA '12
            October 2012
            1011 pages
            ISSN:0362-1340
            EISSN:1558-1160
            DOI:10.1145/2398857
            Issue’s Table of Contents

          Copyright © 2012 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 19 October 2012

          Permissions

          Request permissions about this article.

          Request Permissions

          Check for updates

          Qualifiers

          • research-article

          Acceptance Rates

          Overall Acceptance Rate268of1,244submissions,22%

          Upcoming Conference

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader