NVIDIA Parallel Nsight and OpenCL - debugging

I'm a little confused about NVIDIA Parallel Nsight and OpenCL. Can anyone confirm whether it is possible to debug OpenCL code using NVIDIA Parallel Nsight 1.5 or 2.0 RC?

It is not currently possible to debug OpenCL kernels with Parallel Nsight. Parallel Nsight 2.0 (the latest release as of Jun 2011) only supports debugging CUDA kernels. However, OpenCL debugging is one of the features likely to go into the product in a future release.

Yes, it is possible; I've done it myself. The only problem is that you need two networked computers with identical video cards. One executes your kernel step by step; because of this, its graphics adapter cannot display results and the display stalls. This is where the second computer comes into play: it displays the results in Visual Studio as if you were debugging an ordinary program.
Personally, I found NVIDIA Parallel Nsight to be a useless tool. Any kernel debugging can be done by adding an extra argument to the kernel and writing whatever data you want to inspect into it.
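The extra-argument approach can be sketched roughly as follows. This is a plain C emulation of the idea rather than real OpenCL host code: `scale_kernel`, `run_scale`, and the `debug` buffer are hypothetical names chosen for illustration, and the loop in `run_scale` stands in for enqueueing work-items. In an actual kernel the function would be declared `__kernel`, the pointers would be `__global`, and `gid` would come from `get_global_id(0)`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an OpenCL kernel body, run once per work-item.
   The last parameter is the extra "debug" argument: one slot per work-item
   that receives the intermediate value we want to inspect on the host. */
void scale_kernel(size_t gid, const float *in, float *out, float *debug)
{
    float intermediate = in[gid] * 2.0f; /* value under inspection */
    debug[gid] = intermediate;           /* copy it out through the extra argument */
    out[gid] = intermediate + 1.0f;      /* the kernel's normal result */
}

/* Host-side emulation of launching n work-items (assumption: a simple
   sequential loop replaces the real enqueue call). */
void run_scale(size_t n, const float *in, float *out, float *debug)
{
    assert(in != NULL && out != NULL && debug != NULL);
    for (size_t gid = 0; gid < n; ++gid) {
        scale_kernel(gid, in, out, debug);
    }
}
```

After the run, the host reads `debug` back alongside `out` and compares the intermediate values against what it expected. Using one slot per work-item avoids races between work-items writing to the same location.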

Parallel Nsight 2.1 now includes API trace support for OpenCL 1.1.
See http://nvidia.com/object/parallel-nsight.html

Related

Profiling OpenGL ES in Windows

I'm trying to do some profiling on my OpenGL ES code. Something in my GPU pipeline (a shader, I believe) is causing a huge delay. Which is the best profiler I can use? Is this one a good option? Is there one I can use directly within Visual Studio?
If you have a GPU performance issue on iOS, the best approach is to use the Xcode tools to profile directly on the device: run the app from Xcode, then do a frame capture to look at the timings for each draw call and the number of cycles used by each shader (more info here).
You can also profile on Windows if you are able to reproduce your graphics pipeline in classic OpenGL in your Windows version, but this may not be a good idea: the iPhone's GPU is very different from a classic desktop GPU, so the bottleneck on Windows might not be the same as on iOS.
To profile on Windows I would suggest using either Nvidia PerfKit (if you have an Nvidia card) or AMD's GPU PerfStudio (if you have an AMD card).
There is also RenderDoc, which is a nice tool, but I'm not sure whether it provides much profiling information (it is more for debugging graphics issues than for profiling).

Windows environment OpenACC

I would like to start developing OpenACC programs and I have a few questions:
Is it possible to execute OpenACC code on an AMD GPU?
If so, I'm looking for a compiler available for Windows. I spent about an hour searching and found nothing; I'm getting desperate to find anything that would let me compile OpenACC directives.
Yes, there are a few compilers that support AMD devices. You can see the targets offered by PGI at: http://www.pgroup.com/resources/accel.htm#targets. This includes several AMD Radeon devices.
I believe Pathscale also targets AMD devices (http://www.pathscale.com/) but I'm not sure if they have a Windows compiler available. Please contact them directly for more information.
Hope this helps,
Mat

OpenCL maturity under Windows

I am considering using OpenCL in a consumer product which is currently under development.
After a little research I found that there is generally good support under Mac OS X. Linux support is also relatively good, but my target audience does not use Linux. It remains to check how well it is supported on Windows.
Regarding Windows, I found the question "OpenCL distribution", which raises some concerns.
Do any of you have any experience with using OpenCL in consumer-oriented products under Windows? I am more interested in the GPU side of OpenCL, specifically driver support.
Just like CUDA or Stream, OpenCL needs to be supported by the driver. Most CUDA-capable GPUs support OpenCL with a somewhat up-to-date driver (CUDA 1.0 upwards).
In fact, if you compile with, say, CUDA SDK 4.1 your end users will need newer drivers than if you had used OpenCL.
Also, OpenCL is not bound to any GPU architecture. While this might be problematic for specifically designed algorithms, it shouldn't have a very high impact on normal end user programs.
At least with CUDA, you can only compile code optimized for the current known major version. Compiling OpenCL kernels on the end user machine might allow optimizations for newer binary specifications in the future.
The crashes that the author of that question reported for Nvidia OpenCL generally seem to happen a lot if resources are not freed properly. I was seeing similar crashes until I fixed a leak where created kernels were never released.
I'm not saying it's the only reason why it might crash, but apart from programmer errors it appears fairly stable to me.
AMD and NVIDIA both support OpenCL on most (all?) of their GPUs.
Unfortunately, Intel only supports it on the CPU, which is a bit pointless. And if you have to insist that the user has a separate GPU for your app, you can just as well insist that they have an NVIDIA one and use CUDA. This has limited the uptake of OpenCL.

How can I debug an OpenCL kernel in Xcode 4.1?

I have some OpenCL kernels that aren't doing what they should be, and I would love to debug them in Xcode. Is this possible?
If not, is there any way I can use printf() in my CPU-based kernels? When I use printf() in my kernels the OpenCL compiler always gives me a whole bunch of errors.
Casting the format string to const char * appears to fix this problem.
This works for me on Lion:
printf((char const *)"%d %d\n", dl, dll);
This has the error described above:
printf("%d %d\n", dl, dll);
Have you tried adding this pragma to enable printf?
#pragma OPENCL EXTENSION cl_amd_printf : enable
You might also want to try using Quartz Composer to test out your kernels. If you have access to the WWDC 2010 videos, I believe they show how to use Quartz Composer for rapid prototyping of OpenCL kernels in Sessions 416: "Harnessing OpenCL in Your Application" or 418: "Maximizing OpenCL Performance". There were also some good sessions on this during WWDC 2009 and 2008 that might also be available via ADC on iTunes.
Using Quartz Composer, you can quickly set up inputs and outputs for a kernel, then monitor the results in realtime. You can avoid the change-compile-test cycle because everything is compiled as you type. Syntax errors and the like will pop up as you change code, which makes it fairly easy to identify those.
I've used this tool to develop and test out OpenGL shaders, which have many things in common with OpenCL kernels.
Have you given gDEBugger a try already? I think it's currently the only choice you have for OpenCL debugging on the Mac.
Intel offers a printf in their new OpenCL 1.1 SDK, but that's only for Linux and Windows. Lion has OpenCL 1.1, but at least my Core 2 Duo does not support the printf extension.
AMD is still developing their OpenCL tools, and the Nvidia debugging tools are only for CUDA, as far as I understand.

List of OpenCL compliant CPU/GPU

How can I know which CPUs can be programmed with OpenCL?
For example, the Pentium E5200.
Is there a way to know without running code on it and querying it?
OpenCL compatibility can generally be determined by looking at the vendors' sites. AMD's APP SDK requires CPUs to support at least SSE2. They also have a list of currently supported ATI/AMD video cards.
The most official source is probably the Khronos conformance list:
http://www.khronos.org/conformance/adopters/conformant-products#opencl
For compatibility with the AMD APP SDK: http://developer.amd.com/gpu/AMDAPPSDK/pages/DriverCompatibility.aspx
For NVIDIA, anything that supports CUDA should support their implementation of OpenCL:
http://www.nvidia.com/object/cuda_gpus.html
For compatibility with the Intel OpenCL SDK, look at:
https://software.intel.com/en-us/articles/opencl-code-builder-release-notes
Here is the list of conforming OpenCL products from the Khronos site:
http://www.khronos.org/conformance/adopters/conformant-products/
There is also the Intel OpenCL SDK, http://software.intel.com/en-us/articles/intel-opencl-sdk/, currently for Windows.
One more comment about Intel: they now support OpenCL not only under Windows but also under Linux. However, it is part of a commercial SDK; see https://software.intel.com/en-us/intel-media-server-studio.
Another alternative for OpenCL development under Linux is Beignet, an open source OpenCL project maintained by Intel China.
http://www.freedesktop.org/wiki/Software/Beignet/
I have tested it on Linux and it works as described in the tutorial; however, the compiler it uses is completely different from the one used under Windows.
Well for the CPU, AMD's SDK is supposed to work on x86 (even on Intel's x86), so that will cover most of your options.
And for the GPU, I think almost all cards made in the last couple of years should run OpenCL kernels. I don't know of a particular list.
EDIT: Looks like AMD removed the original SDK pages with no replacement. There are unofficial mirrors for Windows and Linux, but I haven't tried them.

Resources