Characterizing OpenCL 1.2 support in recent nVIDIA drivers - interop

Starting April 2015, nVIDIA drivers report that OpenCL 1.2 is supported in their GPUs (at least on Kepler and Maxwell).
I have not seriously used OpenCL on nVIDIA GPUs yet (just toyed with it a bit). I remember people reporting poor OpenCL support, e.g. no support for events; no support for providing sources in SPIR/SPIR-V; and so on.
What has actually improved w.r.t. OpenCL support? And what are the significant parts missing that are actually supported in CUDA (in some alternate form)?
PS - Here are the OpenCL 1.2 extensions listed with CUDA 8.0 for my Kepler/Maxwell cards:
cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics
cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics
cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing
cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll
cl_nv_copy_opts

In 2015 and 2016 NVIDIA have stepped up their OpenCL support. Modern NVIDIA hardware supports OpenCL 1.2, and at GTC 2016 they announced that pieces of OpenCL 2.0 will appear later this year (not all of it, just a few things). Within the last year their OpenCL driver performance has improved markedly; I have measured improvements of around 1.4x in profiling tests.

Related

Does NVidia support OpenCL SPIR?

I am wondering whether NVIDIA supports a SPIR backend. If so, I couldn't find any documentation or sample code for it. If not, is there any way to get a SPIR backend working on NVIDIA GPUs?
Thanks in advance.
Since SPIR builds on top of OpenCL version 1.2, and so far Nvidia has not made any OpenCL 1.2 drivers available, it is not possible to use SPIR with Nvidia GPUs. As mentioned in the comments, Nvidia has made PTX available as intermediate language (also based on LLVM IR). One could consider translating SPIR into PTX but I don't know how realistic that would be.
Other vendors such as AMD and Intel are already showing support for SPIR. This can be verified by querying the CL_DEVICE_EXTENSIONS with the clGetDeviceInfo OpenCL API. If the result string contains cl_khr_spir, the driver supports SPIR.
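The check described above can be sketched as follows. This is a minimal illustration in Python that works on an extensions string you have already obtained (for example from clGetDeviceInfo with CL_DEVICE_EXTENSIONS); it does not call OpenCL itself. The function name supports_extension and the constant SAMPLE_EXTENSIONS are made up for this example, and the sample string is simply the extension list quoted earlier on this page for a Kepler/Maxwell card under CUDA 8.0.

```python
def supports_extension(extensions: str, name: str) -> bool:
    """Return True if `name` appears as a whole token in an extensions string."""
    # CL_DEVICE_EXTENSIONS is a space-separated list of extension names,
    # so split on whitespace instead of doing a raw substring search.
    return name in extensions.split()


# Hypothetical sample input: the extension list quoted earlier on this page
# (copied text, not queried from a live device).
SAMPLE_EXTENSIONS = (
    "cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics "
    "cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics "
    "cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing "
    "cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll "
    "cl_nv_copy_opts"
)

if __name__ == "__main__":
    print("cl_khr_spir:", supports_extension(SAMPLE_EXTENSIONS, "cl_khr_spir"))
    print("cl_khr_fp64:", supports_extension(SAMPLE_EXTENSIONS, "cl_khr_fp64"))
```

Matching whole tokens rather than substrings avoids a false positive if some other extension name happens to begin with cl_khr_spir.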

CUDA development Mac Mini

I'm very new to CUDA development and I don't know how to start. I have a Mac Mini with OS X 10.8.3 and Xcode 4.6.3. Can I use it for CUDA development?
You will not be able to use CUDA, as you do not have an NVidia graphics card. You might want to look into OpenCL which is supported on Intel CPUs and GPUs (as well as AMD and NVidia GPUs). OpenCL is similar to the CUDA Driver API and therefore what you learn programming in OpenCL will help with learning CUDA as well since many of the concepts are similar.
For OpenCL on Mac, check out https://developer.apple.com/library/mac/#documentation/Performance/Conceptual/OpenCL_MacProgGuide/Introduction/Introduction.html.
You can do CUDA development on some Mac Minis.
You, of course, need a model with an NVIDIA GPU. I'm doing it on my 2010 Mini, which has an NVIDIA GeForce 320M with 256 MB. This is a lightweight chip with only 48 CUDA cores, but it is good enough for educational use.
You can get info on the software you need here:
https://cseweb.ucsd.edu/classes/fa12/cse260-b/RoomFullOfMacMinis.txt

OpenCL maturity under Windows

I am considering using OpenCL in a consumer product which is currently under development.
After a little research I found that support is generally good under Mac OS X. Linux support is also relatively good, but my target audience does not use Linux. That leaves the question of how well it is supported on Windows.
Regarding Windows, I found the question "OpenCL distribution", which raises some concerns.
Do any of you have any experience with using OpenCL in consumer-oriented products under Windows? I am more interested in the GPU side of OpenCL, specifically driver support.
Just like CUDA or Stream, OpenCL needs to be supported by the driver. Most CUDA-capable GPUs support OpenCL with a somewhat up-to-date driver (CUDA 1.0 upwards).
In fact, if you compile with, say, CUDA SDK 4.1 your end users will need newer drivers than if you had used OpenCL.
Also, OpenCL is not bound to any GPU architecture. While this might be problematic for specifically designed algorithms, it shouldn't have a very high impact on normal end user programs.
At least with CUDA, you can only compile code optimized for the current known major version. Compiling OpenCL kernels on the end user machine might allow optimizations for newer binary specifications in the future.
The crashes that the author of that question reported for Nvidia OpenCL generally seem to happen a lot if resources are not freed properly. I had been seeing similar crashes until I fixed a leak where created kernels were never released.
I'm not saying it's the only reason why it might crash, but apart from programmer errors it appears fairly stable to me.
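One way to guard against the kind of leak described above is to register each release at the moment the resource is created, so that cleanup runs even when a later step fails. The following is only a sketch of that pattern in Python: FakeKernel is a hypothetical stand-in, not a real OpenCL binding, and run_kernels is a made-up helper. In C host code the release step would correspond to a clReleaseKernel call.

```python
from contextlib import ExitStack


class FakeKernel:
    """Hypothetical stand-in for a kernel handle; not a real OpenCL binding."""

    live = 0  # number of handles created but not yet released

    def __init__(self, name: str) -> None:
        self.name = name
        FakeKernel.live += 1

    def release(self) -> None:
        # In C host code this is where clReleaseKernel would be called.
        FakeKernel.live -= 1


def run_kernels(names):
    """Create one handle per name and guarantee that each one is released."""
    with ExitStack() as stack:
        kernels = []
        for name in names:
            kernel = FakeKernel(name)
            # Register the release immediately, so it still runs if a
            # later creation or launch raises an exception.
            stack.callback(kernel.release)
            kernels.append(kernel)
        # Real code would enqueue the kernels here.
        return [k.name for k in kernels]


if __name__ == "__main__":
    print(run_kernels(["scale", "reduce"]))
    print("handles still live:", FakeKernel.live)
```

The counter FakeKernel.live exists only to make the leak visible in this sketch; it should be back at zero once run_kernels returns.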
AMD and NVidia both support OpenCL on most (all?) of their GPUs.
Unfortunately, Intel only supports it on the CPU, which is a bit pointless. And if you have to insist that the user has a separate GPU for your app, you can just as well insist that they have an NVidia one and use CUDA. This has limited the uptake of OpenCL.

NVIDIA Parallel Nsight and OpenCL

I'm a little confused about NVIDIA Parallel Nsight and OpenCL. Can anyone confirm whether it is possible to debug OpenCL code using NVIDIA Parallel Nsight 1.5 or 2.0RC?
It is currently not possible to debug OpenCL kernels with Parallel Nsight. Parallel Nsight 2.0 (the latest release as of Jun 2011) only supports debugging CUDA kernels. However, OpenCL debugging is one of the features that is likely to go into the product in future releases.
Yes, it is possible; I have done it myself. The only problem is that you need two networked computers with identical video cards. One executes your kernel step by step (because of this its graphics adapter cannot show results, and the display stalls). This is where the second computer comes into play: it displays the results in Visual Studio as if you were debugging an ordinary program.
Personally, I found NVIDIA Parallel Nsight to be a useless tool. Any kernel debugging can be done by adding an extra argument to the kernel and writing the data of interest to it.
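The extra-argument approach from the previous answer can be illustrated with a plain Python simulation. This is only a sketch: it runs serially on the CPU with no OpenCL runtime, and the kernel, its arithmetic, and the names scale_kernel and launch are invented for illustration. In real OpenCL the debug buffer would be an additional __global pointer argument that the host reads back after the kernel finishes.

```python
def scale_kernel(gid, data, out, debug):
    """One simulated work-item: out[gid] = data[gid] * 2 + 1."""
    doubled = data[gid] * 2
    # Extra debug argument: record the intermediate value for this work-item
    # so the host can inspect it after the launch.
    debug[gid] = doubled
    out[gid] = doubled + 1


def launch(data):
    """Run the kernel once per element, a serial stand-in for an NDRange launch."""
    out = [0] * len(data)
    debug = [0] * len(data)
    for gid in range(len(data)):
        scale_kernel(gid, data, out, debug)
    return out, debug


if __name__ == "__main__":
    result, trace = launch([1, 2, 3])
    print("result:", result)
    print("debug trace:", trace)
```

Indexing the debug buffer by the work-item id keeps the writes from different work-items from overwriting each other.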
Parallel Nsight 2.1 now includes API trace for OpenCL 1.1.
See http://nvidia.com/object/parallel-nsight.html

List of OpenCL compliant CPU/GPU

How can I know which CPU can be programmed by OpenCL?
For example, the Pentium E5200.
Is there a way to know w/o running and querying it?
OpenCL compatibility can generally be determined by looking at the vendors' sites. AMD's APP SDK requires CPUs to support at least SSE2. They also have a list of currently supported ATI/AMD video cards.
The most official source is probably the Khronos conformance list:
http://www.khronos.org/conformance/adopters/conformant-products#opencl
For compatibility with the AMD APP SDK: http://developer.amd.com/gpu/AMDAPPSDK/pages/DriverCompatibility.aspx
For NVIDIA, anything that supports CUDA should support their implementation of OpenCL:
http://www.nvidia.com/object/cuda_gpus.html
For compatibility with the Intel OpenCL SDK, look at:
https://software.intel.com/en-us/articles/opencl-code-builder-release-notes
Here is the list of conforming OpenCL products from the Khronos site:
http://www.khronos.org/conformance/adopters/conformant-products/
There is also Intel OpenCL, http://software.intel.com/en-us/articles/intel-opencl-sdk/, available for Windows right now.
One more comment about Intel: they now support OpenCL not only under Windows but also under Linux. However, it is part of a commercial SDK; see https://software.intel.com/en-us/intel-media-server-studio.
Another alternative for OpenCL development under Linux is Beignet, an open source OpenCL project maintained by Intel China.
http://www.freedesktop.org/wiki/Software/Beignet/
I have tested it on Linux and it works as per the tutorial; however, the compiler it uses is completely different from the one used under Windows.
Well for the CPU, AMD's SDK is supposed to work on x86 (even on Intel's x86), so that will cover most of your options.
And for the GPU, I think almost all cards made in the last couple of years should run OpenCL kernels. I don't know of a particular list.
EDIT: Looks like AMD removed the original SDK pages with no replacement. There are unofficial mirrors for Windows and Linux, but I haven't tried them.

Resources