research-article

GPUVerify: a verifier for GPU kernels

Authors:
Adam Betts

Imperial College London, London, United Kingdom

Imperial College London, London, United Kingdom
View Profile

,
Nathan Chong

Imperial College London, Lonon, United Kingdom

Imperial College London, Lonon, United Kingdom
View Profile

,
Alastair Donaldson

Imperial College London, London, United Kingdom

Imperial College London, London, United Kingdom
View Profile

,
Shaz Qadeer

Microsoft Research, Redmond, WA, USA

Microsoft Research, Redmond, WA, USA
View Profile

,
Paul Thomson

Imperial College London, London, United Kingdom

Imperial College London, London, United Kingdom
View Profile

OOPSLA '12: Proceedings of the ACM international conference on Object oriented programming systems languages and applicationsOctober 2012Pages 113–132https://doi.org/10.1145/2384616.2384625

Published:19 October 2012Publication History

OOPSLA '12: Proceedings of the ACM international conference on Object oriented programming systems languages and applications

Pages 113–132

ABSTRACT

We present a technique for verifying race- and divergence-freedom of GPU kernels that are written in mainstream kernel programming languages such as OpenCL and CUDA. Our approach is founded on a novel formal operational semantics for GPU programming termed synchronous, delayed visibility (SDV) semantics. The SDV semantics provides a precise definition of barrier divergence in GPU kernels and allows kernel verification to be reduced to analysis of a sequential program, thereby completely avoiding the need to reason about thread interleavings, and allowing existing modular techniques for program verification to be leveraged. We describe an efficient encoding for data race detection and propose a method for automatically inferring loop invariants required for verification. We have implemented these techniques as a practical verification tool, GPUVerify, which can be applied directly to OpenCL and CUDA source code. We evaluate GPUVerify with respect to a set of 163 kernels drawn from public and commercial sources. Our evaluation demonstrates that GPUVerify is capable of efficient, automatic verification of a large number of real-world kernels.

References

AMD. AMD Accelerated Parallel Processing (APP) SDK. developer.amd.com/sdks/amdappsdk/pages/default.aspxGoogle Scholar
M. Barnett, B. Chang, R. DeLine, B. Jacobs, and K. R. M. Leino. Boogie: A modular reusable verifier for object-oriented programs. In FMCO, 2005. Google ScholarDigital Library
D. Beyer, T. A. Henzinger, R. Majumdar, and A. Rybalchenko. Path invariants. In PLDI, 2007. Google ScholarDigital Library
M. Boyer, K. Skadron, and W. Weimer. Automated dynamic analysis of CUDA programs. In STMCS, 2008.Google Scholar
C. Cadar, D. Dunbar, and D. Engler. KLEE: Unassisted and automatic generation of high-coverage tests for complex systems programs. In OSDI, 2008. Google ScholarDigital Library
P. Collingbourne, C. Cadar, and P. H. J. Kelly. Symbolic testing of OpenCL code. In HVC, 2011.Google Scholar
P. Cousot and R. Cousot. Abstract interpretation: a unified lattice model for static analysis of programs by construction or approximation of fixpoints. In POPL, 1977. Google ScholarDigital Library
L. de Moura and N. Bjørner. Z3: An efficient SMT solver. In TACAS, 2008. Google ScholarDigital Library
A. Donaldson, D. Kroening and P. Rummer. Automatic analysis of scratch-pad memory code for heterogeneous multicore processors. In TACAS, 2010. Google ScholarDigital Library
A. Donaldson, D. Kroening and P. Rummer. Automatic analysis of DMA races using model checking and k-induction. Formal Methods in System Design 39(1):83--113. Google ScholarDigital Library
M. D. Ernst, J. H. Perkins, P. J. Guo, S. McCamant, C. Pacheco, M. S. Tschantz, and C. Xiao The Daikon system for dynamic detection of likely invariants. Sci. Comput. Program. 69(1-3):35--45. Google ScholarDigital Library
C. Flanagan and K. R. M. Leino. Houdini, an annotation assistant for ESC/Java. In FME, 2001. Google ScholarDigital Library
S. Graf and H. Sa idi. Construction of abstract state graphs with PVS. In CAV, 1997. Google ScholarDigital Library
A. Habermaier and A. Knapp. On the correctness of the SIMT execution model of GPUs. In ESOP, 2012. Google ScholarDigital Library
T. Kahsai, Y. Ge, and C. Tinelli. Instantiation-based invariant discovery. In NFM, 2011. Google ScholarDigital Library
R. K. Karmani, P. Madhusudan, and B. Moore. Thread contracts for safe parallelism. In PPOPP, 2011. Google ScholarDigital Library
Khronos OpenCL Working Group. The OpenCL specifica-tion, version 1.1, 2011. Document Revision: 44.Google Scholar
S. K. Lahiri and S. Qadeer. Complexity and algorithms for monomial and clausal predicate. In CADE, 2009. Google ScholarDigital Library
K. R. M. Leino, G. Nelson, and J. B. Saxe. ESC/Java user's manual. Technical Note 2000-002, Compaq Systems Research Center, October 2000.Google Scholar
A. Leung, M. Gupta, Y. Agarwal, R. Gupta, R. Jhala, and S. Lerner. Verifying GPU kernels by test amplification. In PLDI, 2012. Google ScholarDigital Library
G. Li and G. Gopalakrishnan. Scalable SMT-based verifica-tion of GPU kernel functions. In FSE, 2010. Google ScholarDigital Library
G. Li, P. Li, G. Gopalakrishnan. Parametric flows: automated behaviour equivalencing for symbolic analysis of races in CUDA programs. In SC, 2012. Google ScholarDigital Library
G. Li, P. Li, G. Sawaya, G. Gopalakrishnan, I. Ghosh, and S. P. Rajan. GKLEE: concolic verification and test generation for GPUs. In PPOPP, 2012. Google ScholarDigital Library
llvm.org. clang: a C language family frontend for LLVM. clang.llvm.orgGoogle Scholar
A. Lokhmotov. Mobile and embedded computing on Mali GPUs. In UK GPU Computing Conference, 2011.Google Scholar
K. L. McMillan. Lazy abstraction with interpolants. In CAV, 2006. Google ScholarDigital Library
Microsoft Corporation. C++ AMP sample projects for download (MSDN blog). blogs.msdn.com/b/nativeconcurrency/archive/2012/01/30/c-amp-sample-projects-for-download.aspxGoogle Scholar
D. Moth and Y. Levanoni. Microsoft's C++ AMP unveiled. www.drdobbs.com/windows/231600761Google Scholar
NVIDIA. CUDA Toolkit Release Archive. developer.nvidia.com/cuda-toolkit-archiveGoogle Scholar
NVIDIA. CUDA C programming guide, v4.0, 2011.Google Scholar
NVIDIA. PTX: Parallel thread execution ISA, v2.3, 2011.Google Scholar
L. Nyland. Personal communication, April 2012.Google Scholar
L. Nyland, M. Harris, and J. Prins. Fast N-body simulation with CUDA. GPU Gems 3, Chapter 31. Addison-Wesley, 2007.Google Scholar
Rightware Oy. Basemark CL. www.rightware.com/en/Benchmarking+Software/Basemark%99+CLGoogle Scholar
S. Srivastava and S. Gulwani. Program verification using templates over predicate abstraction. In PLDI, 2009. Google ScholarDigital Library
B. Steensgaard. Points-to analysis in almost linear time. In POPL, 1996. Google ScholarDigital Library
S. Stone, J. Haldar, S. Tsao, W. Hwu, B. Sutton, and Z. Liang. Accelerating advanced MRI reconstructions on GPUs. J. Parallel Distrib. Comput., 68(10):1307--1318, 2008. Google ScholarDigital Library
S. Tripakis, C. Stergiou, and R. Lublinerman. Checking non-interference in SPMD programs. In HotPar, 2010.Google Scholar

Index Terms

GPUVerify: a verifier for GPU kernels
1. Theory of computation
  1. Semantics and reasoning
    1. Program reasoning

Recommendations

The Design and Implementation of a Verification Technique for GPU Kernels

We present a technique for the formal verification of GPU kernels, addressing two classes of correctness properties: data races and barrier divergence. Our approach is founded on a novel formal operational semantics for GPU kernels termed <i>synchronous, ...
Read More
GPUVerify: a verifier for GPU kernels
OOPSLA '12

We present a technique for verifying race- and divergence-freedom of GPU kernels that are written in mainstream kernel programming languages such as OpenCL and CUDA. Our approach is founded on a novel formal operational semantics for GPU programming ...
Read More
Barrier invariants: a shared state abstraction for the analysis of data-dependent GPU kernels
OOPSLA '13: Proceedings of the 2013 ACM SIGPLAN international conference on Object oriented programming systems languages & applications

Data-dependent GPU kernels, whose data or control flow are dependent on the input of the program, are difficult to verify because they require reasoning about shared state manipulated by many parallel threads. Existing verification techniques for GPU ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Published in
OOPSLA '12: Proceedings of the ACM international conference on Object oriented programming systems languages and applications
October 2012
1052 pages
ISBN:9781450315616
DOI:10.1145/2384616
General Chair:
Gary T. Leavens
University of Central Florida
,
Program Chair:
Matthew B. Dwyer
University of Nebraska - Lincoln
ACM SIGPLAN Notices Volume 47, Issue 10
OOPSLA '12
October 2012
1011 pages
ISSN:0362-1340
EISSN:1558-1160
DOI:10.1145/2398857
Issue’s Table of Contents
Copyright © 2012 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 19 October 2012
Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
GPUs
barrier synchronization
concurrency
data races
verification
Qualifiers
- research-article
Conference

Acceptance Rates
Overall Acceptance Rate268of1,244submissions,22%
Upcoming Conference
SPLASH '24

Sponsor:

sigplan

ACM SIGPLAN International Conference on Systems, Programming, Languages, and Applications: Software for Humanity

October 20 - 25, 2024

Pasadena , CA , USA
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 141
  Total Citations
  View Citations
- 696
  Total Downloads
- Downloads (Last 12 months)51
- Downloads (Last 6 weeks)12
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

GPUVerify: a verifier for GPU kernels

OOPSLA '12: Proceedings of the ACM international conference on Object oriented programming systems languages and applications

ABSTRACT

References

Cited By

Index Terms

Recommendations

The Design and Implementation of a Verification Technique for GPU Kernels

GPUVerify: a verifier for GPU kernels

Barrier invariants: a shared state abstraction for the analysis of data-dependent GPU kernels