Open Standard for Parallel Programming of Heterogeneous Systems

OpenCL™ (Open Computing Language) is an open, royalty-free standard for cross-platform, parallel programming of diverse accelerators found in supercomputers, cloud servers, personal computers, mobile devices and embedded platforms. OpenCL greatly improves the speed and responsiveness of a wide spectrum of applications in numerous market categories including professional creative tools, scientific and medical software, vision processing, and neural network training and inferencing.

OpenCL 3.0 Final is Here!

The OpenCL 3.0 Finalized Specification was released on September 30th, 2020.

Read the Blog about the final release of OpenCL 3.0 | Provisional Press Release | Provisional Launch Presentation

OpenCL 3.0 realigns the OpenCL roadmap to enable developer-requested functionality to be broadly deployed by hardware vendors, and it significantly increases deployment flexibility by empowering conformant OpenCL implementations to focus on functionality relevant to their target markets. OpenCL 3.0 also integrates subgroup functionality into the core specification, ships with new unified API and OpenCL C 3.0 language specifications, and introduces extensions for asynchronous data copies to enable a new class of embedded processors.

OpenCL 3.0 Materials

Specification | SDK | OpenCL Guide | OpenCL Blogs | Issues | Discussions | Resources | Reference Guide

Industry Support for OpenCL

“OpenCL is the most pervasive, cross-vendor, open standard for low-level heterogeneous parallel programming—widely used by applications, libraries, engines, and compilers that need to reach the widest range of diverse processors. OpenCL 2.X delivers significant functionality, but OpenCL 1.2 has proven itself as the baseline needed by all vendors and markets. OpenCL 3.0 integrates tightly organized optionality into the monolithic 2.2 specification, boosting deployment flexibility that will enable OpenCL to raise the bar on pervasively available functionality in future core specifications.”

Neil Trevett

Vice President at NVIDIA, President of the Khronos Group and OpenCL Working Group Chair

“In recent years there has been an impressive adoption of OpenCL to drive heterogeneous processing systems within many market segments. This update to OpenCL 3.0 brings important flexibility benefits that will allow many evolving industries, from AI and HPC to automotive, to focus on their specific requirements and embrace open standards. Codeplay is excited to enable hardware vendors to support OpenCL 3.0 and to take advantage of the flexibility provided in its ecosystem of software products.”

Andrew Richards

Founder and CEO of Codeplay Software

“With its focus on deployment flexibility, we see OpenCL 3.0 as an excellent step forward in providing critical features for developers, with the ability to add functionality over time. This really is a step forward for the OpenCL ecosystem, allowing developers to write portable applications that depend on widely accepted functionality. Currently shipping GPUs based on the PowerVR Rogue architecture will enjoy a significant feature uplift including SVM, Generic Address Space and Work-group Functions. Upon final release of the specification, Imagination will ship a conformant OpenCL 3.0 implementation with support extending across a wide range of PowerVR GPUs, including our latest offering with IMG A-Series.”

Mark Butler

Vice President of Software Engineering, Imagination Technologies

“Intel strongly supports cross-architecture standards being driven across the compute ecosystem such as in OpenCL 3.0 and SYCL. Standards-based, unified programming models will enable efficiency and unleash creativity for our developers with the upcoming release of our new Xe GPU architecture.”

Jeff McVeigh

Vice President, Intel Architecture, Graphics and Software

“NVIDIA welcomes OpenCL 3.0’s focus on defining a baseline to enable developer-critical functionality to be widely adopted in future versions of the specification. NVIDIA will ship a conformant OpenCL 3.0 when the specification is finalized and we are working to define the Vulkan® interop extension that, together with layered OpenCL implementations, will significantly increase deployment flexibility for OpenCL developers.”

Anshuman Bhat

Compute product manager at NVIDIA

“OpenCL 3.0 is an important step forward in the drive to unlock greater performance and innovation across a broadening range of computing platforms and applications. The flexible extension model will help our customers and software partners take full advantage of the tremendous potential available in both our existing and future application processors. We are pleased to have had the opportunity to contribute to this specification and we look forward to supporting the final product.”

Balaji Calidas

Director of Engineering at Qualcomm

“Many of our customers want a GPU programming language that runs on all devices, and with growing deployment in edge computing and mobile, this need is increasing. OpenCL is the only solution for accessing diverse silicon acceleration and many key software stacks use OpenCL/SPIR-V as a backend. We are very happy that OpenCL 3.0 will drive even wider industry adoption, as it reassures our customers that their past and future investments in OpenCL are justified.”

Vincent Hindriksen

Founder and CEO of Stream HPC

“OpenCL 3.0 has opened up a new chapter for the OpenCL API which has served as the standard GPGPU API during the past 10 years. With the streamlined OpenCL 3.0 core feature set, OpenCL 3.0 will enable a whole new class of embedded devices to adopt OpenCL API for GPU Compute and ML/AI processing, and it will also pave the way forward for OpenCL to interop or layer with the Vulkan API. VeriSilicon will deploy OpenCL 3.0 implementations quickly on a broad range of our embedded GPU and VIP products to enable our customers to develop new sets of GPGPU/ML/AI applications with the OpenCL 3.0 API.”

Weijin Dai

Executive Vice President and GM of Intellectual Property Division at VeriSilicon

OpenCL is Widely Deployed and Used

OpenCL for Low-level Parallel Programming

OpenCL speeds up applications by offloading their most computationally intensive code onto accelerator processors, known as devices. OpenCL developers write programs in C-based or C++-based kernel languages, and those programs are passed through a device compiler for parallel execution on accelerator devices.

How OpenCL Relates to Other Khronos Parallel Acceleration Standards

OpenCL provides the industry with the lowest-level 'close-to-metal' processor-agile execution layer for accelerating applications, libraries and engines, and it also provides a code generation target for compilers. Unlike 'GPU-only' APIs, such as Vulkan, OpenCL enables use of a diverse range of accelerators including multi-core CPUs, GPUs, DSPs, FPGAs and dedicated hardware such as inferencing engines.

How OpenCL relates to the family of Khronos acceleration standards

OpenCL Deployment Flexibility

As the industry landscape of platforms and devices grows more complex, tools are evolving to enable OpenCL applications to be deployed onto platforms that do not have native OpenCL drivers available. For example, the open source clspv compiler and clvk API translator enable OpenCL applications to be run over a Vulkan run-time. This gives OpenCL developers significant flexibility in where and how they deploy their OpenCL applications.

Open source software tools enable OpenCL kernels to be executed over multiple target APIs

OpenCL Programming Model

An OpenCL application is split into host and device parts. Host code is written in a general-purpose programming language such as C or C++ and compiled by a conventional compiler for execution on a host CPU.

Device code can be compiled online, i.e. during execution of the application, using dedicated API calls. Alternatively, it can be compiled before the application runs, either into a machine binary or into SPIR-V, a portable intermediate representation defined by Khronos. Domain-specific languages and frameworks, for example Halide, can also target OpenCL, either through source-to-source translation or by generating a binary or SPIR-V.
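To make the two compilation routes concrete, here is a minimal host-side sketch in C that decides which route a loaded program blob would take. It is only an illustration: the names `classify_program` and `program_kind` are hypothetical, and the sketch assumes the caller passes either OpenCL C source text or a SPIR-V module (device-specific machine binaries are not distinguished). It relies on the SPIR-V magic number `0x07230203`, which is the first 32-bit word of every SPIR-V module.

```c
#include <stddef.h>
#include <stdint.h>

/* First 32-bit word of every SPIR-V module. */
#define SPIRV_MAGIC 0x07230203u

/* Hypothetical helper type: which compilation route a blob would take. */
typedef enum { PROGRAM_IS_SOURCE, PROGRAM_IS_SPIRV } program_kind;

/*
 * Classify a program blob loaded by the host.
 *
 * PROGRAM_IS_SOURCE: kernel source text, to be compiled online, e.g. with
 *                    clCreateProgramWithSource followed by clBuildProgram.
 * PROGRAM_IS_SPIRV:  module compiled ahead of time, e.g. loaded with
 *                    clCreateProgramWithIL (OpenCL 2.1 or later) and then
 *                    built with clBuildProgram.
 *
 * Assumption: anything that does not start with the SPIR-V magic number
 * is treated as source text.
 */
program_kind classify_program(const unsigned char *data, size_t size)
{
    if (data == NULL || size < 4)
        return PROGRAM_IS_SOURCE;

    /* The module may be stored in either byte order, so check both. */
    uint32_t le = (uint32_t)data[0]
                | ((uint32_t)data[1] << 8)
                | ((uint32_t)data[2] << 16)
                | ((uint32_t)data[3] << 24);
    uint32_t be = ((uint32_t)data[0] << 24)
                | ((uint32_t)data[1] << 16)
                | ((uint32_t)data[2] << 8)
                | (uint32_t)data[3];

    return (le == SPIRV_MAGIC || be == SPIRV_MAGIC) ? PROGRAM_IS_SPIRV
                                                    : PROGRAM_IS_SOURCE;
}
```

A host application could call this helper right after reading a kernel file from disk and branch to the matching program-creation call.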

Traditional vs OpenCL programming paradigm

Application host code is frequently written in C or C++, but bindings for other languages, such as Python, are also available. The computationally intensive parts of an application are written as kernel programs in a dialect of C (OpenCL C) or C++ (C++ for OpenCL). All versions of the OpenCL C language are based on C99. The community-driven C++ for OpenCL language brings together capabilities of OpenCL and C++17.
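As a rough illustration of what a kernel program looks like, the sketch below holds a small OpenCL C kernel as a string in host C code, next to a host-only reference that mimics it. The kernel name `vec_add` and the helper names are hypothetical. The reference loop is a serial stand-in for a one-dimensional launch, not how an OpenCL runtime actually schedules work: on a device, each work-item would run the kernel body in parallel and obtain its index from `get_global_id(0)`.

```c
#include <stddef.h>

/* OpenCL C kernel source, kept as a string so the host could hand it to
 * the device compiler at run time. Each work-item adds one element. */
const char *const VEC_ADD_SRC =
    "__kernel void vec_add(__global const float *a,\n"
    "                      __global const float *b,\n"
    "                      __global float *c)\n"
    "{\n"
    "    size_t i = get_global_id(0);\n"
    "    c[i] = a[i] + b[i];\n"
    "}\n";

/* Host-only reference for the work done by a single work-item whose
 * global ID is gid (hypothetical helper, not part of the OpenCL API). */
static void vec_add_work_item(size_t gid, const float *a, const float *b,
                              float *c)
{
    c[gid] = a[gid] + b[gid];
}

/* Serial stand-in for launching global_size work-items in one dimension.
 * Useful for checking device results against a known-good answer. */
void vec_add_reference(const float *a, const float *b, float *c,
                       size_t global_size)
{
    for (size_t gid = 0; gid < global_size; ++gid)
        vec_add_work_item(gid, a, b, c);
}
```

Keeping a plain C reference next to the kernel source is one common way to validate what a device computes, since the kernel body and the reference body are deliberately the same expression.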

C++ for OpenCL Kernel Language

The OpenCL working group has transitioned from the original OpenCL C++ kernel language first defined in OpenCL 2.0 to C++ for OpenCL, which is developed by the open source community to provide improved features and compatibility with OpenCL C. C++ for OpenCL is supported by Clang and its documentation can be found here. It enables developers to use most C++17 features in OpenCL kernels. It is largely backwards compatible with OpenCL C 2.0, so it can be used to program accelerators with OpenCL 2.0 or above and conformant drivers that support SPIR-V. Its implementation in Clang can be tracked via the OpenCL Support Page.

Kernel Language Extensions

Some extensions are available to the existing published kernel language standards. The full list of such extensions is documented here. Conformant compilers and drivers may optionally support these extensions, so there is a mechanism to detect their support at compile time. Developers should be aware that not every extension is supported on every device.
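The following sketch shows two ways such detection might look, using the `cl_khr_fp64` extension as the example. In kernel code, a supported extension is exposed as a preprocessor macro of the same name, so the kernel source can fall back to `float` when doubles are unavailable. On the host, a device reports its extensions as a space-separated string (obtained with `clGetDeviceInfo` and `CL_DEVICE_EXTENSIONS`). The helper `has_extension`, the kernel name `scale`, and the `real_t` typedef are hypothetical names chosen for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Kernel source that checks for an extension at compile time. If the
 * device compiler defines cl_khr_fp64, the kernel uses double; otherwise
 * it falls back to float. The host must pass an argument of the matching
 * type. */
const char *const FP64_KERNEL_SRC =
    "#ifdef cl_khr_fp64\n"
    "#pragma OPENCL EXTENSION cl_khr_fp64 : enable\n"
    "typedef double real_t;\n"
    "#else\n"
    "typedef float real_t;\n"
    "#endif\n"
    "__kernel void scale(__global real_t *x, real_t f)\n"
    "{\n"
    "    x[get_global_id(0)] *= f;\n"
    "}\n";

/* Host-side check: is `name` present as a whole token in the
 * space-separated extension list? A plain strstr is not enough, because
 * one extension name can be a prefix of another. */
bool has_extension(const char *ext_list, const char *name)
{
    if (ext_list == NULL || name == NULL)
        return false;

    size_t len = strlen(name);
    if (len == 0)
        return false;

    for (const char *p = ext_list; (p = strstr(p, name)) != NULL; ++p) {
        bool starts_token = (p == ext_list) || (p[-1] == ' ');
        bool ends_token = (p[len] == '\0') || (p[len] == ' ');
        if (starts_token && ends_token)
            return true;
    }
    return false;
}
```

For example, `has_extension("cl_khr_icd cl_khr_fp64", "cl_khr_fp64")` matches the second token, and the host could use the result to decide whether to pass `double` or `float` data to the kernel.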

Conformant OpenCL Implementations

Here you can view a list of hardware vendors with Conformant OpenCL Implementations
