ABSTRACT
The Intercept Layer for OpenCL™ Applications is a recently released open-source middleware layer that assists in debugging, analyzing, and optimizing OpenCL applications. It fills a key gap in the OpenCL development ecosystem, requires no driver or application modifications, and has been tested on OpenCL implementations from multiple vendors.
This Technical Presentation will introduce the Intercept Layer for OpenCL Applications and describe how it works, along with some of its capabilities and limitations. The talk will close with a discussion of features that could be added or moved to middleware layers like the Intercept Layer for OpenCL Applications, possible additions to the OpenCL standard that would simplify development of new features or enable new functionality, and a call for contributions.
Index Terms
- Debugging and Analyzing Programs Using the Intercept Layer for OpenCL Applications