New generation of industry open standard for cross-platform parallel programming delivers increased flexibility, functionality and performance
July 22nd 2013 - SIGGRAPH - Anaheim, CA - The Khronos Group today announced the ratification and public release of the OpenCL 2.0 provisional specification. OpenCL 2.0 is a significant evolution of the open, royalty-free standard that is designed to further simplify cross-platform, parallel programming while enabling a significantly richer range of algorithms and programming patterns to be easily accelerated. As the foundation for these increased capabilities, OpenCL 2.0 defines an enhanced execution model and a subset of the C11 and C++11 memory model, synchronization and atomic operations. The specification is released in provisional form so that developers and implementers can provide feedback before it is finalized, which is expected within six months. The OpenCL 2.0 provisional specification and reference cards are available at www.khronos.org/opencl/.
"The OpenCL working group has combined developer feedback with emerging hardware capabilities to create a state-of-the-art parallel programming platform - OpenCL 2.0," said Neil Trevett, chair of the OpenCL working group, president of the Khronos Group and vice president of mobile content at NVIDIA. "OpenCL continues to gather momentum on both desktop and mobile devices. In addition to enabling application developers, it is providing foundational, portable acceleration for middleware libraries, engines and higher-level programming languages that need to take advantage of heterogeneous compute resources including CPUs, GPUs, DSPs and FPGAs."
Updates and additions to OpenCL 2.0 include:
- Shared Virtual Memory - Host and device kernels can directly share complex, pointer-containing data structures such as trees and linked lists, providing significant programming flexibility and eliminating costly data transfers between host and devices.
- Dynamic Parallelism - Device kernels can enqueue kernels to the same device with no host interaction, enabling flexible work scheduling paradigms and avoiding the need to transfer execution control and data between the device and host, often significantly relieving host processor bottlenecks.
- Generic Address Space - Functions can be written without specifying a named address space for arguments, especially useful for those arguments that are declared to be a pointer to a type, eliminating the need for multiple functions to be written for each named address space used in an application.
- Images - Improved image support including sRGB images and 3D image writes, the ability for kernels to read from and write to the same image, and the creation of OpenCL images from a mip-mapped or a multi-sampled OpenGL texture for improved OpenGL interop.
- C11 Atomics - A subset of C11 atomics and synchronization operations to enable assignments in one work-item to be visible to other work-items in a work-group, across work-groups executing on a device or for sharing data between the OpenCL device and host.
- Pipes - Pipes are memory objects that store data organized as a FIFO and OpenCL 2.0 provides built-in functions for kernels to read from or write to a pipe, providing straightforward programming of pipe data structures that can be highly optimized by OpenCL implementers.
- Android Installable Client Driver Extension - Enables OpenCL implementations to be discovered and loaded as a shared object on Android systems.
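The visibility guarantee described under C11 Atomics can be illustrated with a minimal sketch in plain host-side C11 using <stdatomic.h>. This is not OpenCL C kernel code: the mailbox type and the mailbox_* function names are hypothetical, chosen only for illustration, and the corresponding OpenCL 2.0 built-ins may differ in detail, so the provisional specification remains the authority. The sketch shows the release/acquire pattern in which one party writes ordinary data and then sets an atomic flag, and another party that observes the flag is guaranteed to see the data.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical mailbox shared between a producer and a consumer. */
typedef struct {
    int payload;        /* ordinary, non-atomic data */
    atomic_bool ready;  /* publication flag */
} mailbox;

/* Start with an empty mailbox: no payload published yet. */
void mailbox_init(mailbox *m) {
    assert(m);
    m->payload = 0;
    atomic_init(&m->ready, false);
}

/* Producer side: write the data first, then publish it with a
 * release store so the payload write cannot be reordered after it. */
void mailbox_publish(mailbox *m, int value) {
    assert(m);
    m->payload = value;
    atomic_store_explicit(&m->ready, true, memory_order_release);
}

/* Consumer side: an acquire load of the flag. If the flag is set,
 * the earlier payload write is guaranteed to be visible. Returns
 * false and leaves *out untouched if nothing has been published. */
bool mailbox_try_consume(mailbox *m, int *out) {
    assert(m && out);
    if (!atomic_load_explicit(&m->ready, memory_order_acquire))
        return false;
    *out = m->payload;
    return true;
}
```

In a real program the producer and consumer would run concurrently (in OpenCL terms, as different work-items or as host and device); the functions above only capture the ordering each side must follow.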
OpenCL SPIR 1.2 Provisional Specification
The OpenCL Working Group also today released the OpenCL SPIR 1.2 provisional specification for public review. SPIR stands for Standard Portable Intermediate Representation and is a portable non-source representation for OpenCL 1.2 device programs. It enables application developers to avoid shipping kernel source and to manage the proliferation of devices and drivers from multiple vendors. OpenCL SPIR will enable consumption of code from third-party compiler front-ends for alternative languages, such as C++, and is based on LLVM 3.2. Khronos has contributed open source patches for Clang 3.2 to enable SPIR code generation.
"These two new OpenCL specifications will allow software developers to accelerate a much wider variety of applications on a greater range of devices than previously possible. OpenCL 2.0 will allow applications to process more complex data and algorithms in parallel than was possible in previous standards, while OpenCL SPIR will allow a variety of different programming languages to be compiled directly into OpenCL code for heterogeneous systems," said Andrew Richards, CEO of Codeplay. "These are two big steps forward to enable software developers to embrace heterogeneous platforms, and Codeplay is actively involved in developing for both already."
"Intel has been deeply involved in shaping new OpenCL 2.0 features like Shared Virtual Memory and OpenCL SPIR," said Jonathan Khazam, vice president and general manager of Intel's Visual & Parallel Computing Group. "We are very excited about the improved programmability of OpenCL 2.0 and the potential to create new experiences with Intel® Iris Graphics Products."
Tony King-Smith, EVP marketing for Imagination Technologies, said: "As a long-standing Promoter, Imagination is delighted to see Khronos release this major upgrade to the OpenCL API standard. We see an ever-widening portfolio of markets relevant to OpenCL, from mobile and consumer multimedia-rich devices through automotive infotainment up to advanced cloud servers and supercomputers. OpenCL is gaining traction among our customers as a means to deliver high-performance compute on our widely deployed PowerVR GPUs as well as our MIPS CPUs. Indeed, we have been among the first to enable OpenCL in GPUs for mobile and embedded SoC devices already in production, including some of the leading smartphones and tablets shipping today. We have also been one of the first to demonstrate the significant power-saving advantages of OpenCL on GPU running alongside OpenGL ES in real applications, a benefit often overlooked by application developers today. We look forward to continued industry momentum behind OpenCL as a key enabling API for GPU compute and heterogeneous processing."
"The ability to perform compute-intensive tasks in parallel, using virtually any processor present in the device, opens the door for significant performance and functionality improvements in several industries, from automotive to smart TVs, game consoles and smartphones. Vivante's GPU family has been using the OpenCL API for a long time, and we continue to be at the forefront in supporting this new major API version, as it will further improve the flow of getting even more complex things done, much faster and better," said Weijin Dai, CEO of Vivante Corp. "We're pleased to equip our customers with our GPUs that are faster, smaller and cooler as we see OpenCL become a significant standard in our customers' multi-core implementations."
About The Khronos Group
The Khronos Group is an industry consortium creating open standards to enable the authoring and acceleration of parallel computing, graphics, vision, sensor processing and dynamic media on a wide variety of platforms and devices. Khronos standards include OpenGL®, OpenGL® ES, WebGL, OpenCL, WebCL, OpenVX, OpenMAX, OpenVG, OpenSL ES, StreamInput and COLLADA. All Khronos members are able to contribute to the development of Khronos specifications, are empowered to vote at various stages before public deployment, and are able to accelerate the delivery of their cutting-edge media platforms and applications through early access to specification drafts and conformance tests. More information is available at www.khronos.org.