How to set up OpenCL in Cygwin for an Intel GPU - gcc

I have a laptop with an Intel(R) HD Graphics 520 GPU. I added the OpenCL developer package to Cygwin. I found a small Mandelbrot-set calculator program for OpenCL in C on GitHub. It targets Apple, so I modified the Makefile to use the proper headers and settings for gcc. Now the code compiles and runs fine (a bmp file is created):
$ ./mandelbrot.exe
Device 0: GenuineIntel pthread-Intel(R) Core(TM) i5-6300U CPU @ 2.40GHz
I have two questions:
#1. How can I add the Intel GPU to the /etc/OpenCL/vendors list, if that is possible at all? I tried installing the Intel CPU runtime for OpenCL Applications for Windows OS and the Intel Graphics Technology driver package from the Intel site, but I do not know where to find the proper OpenCL dll that I can point to in the intel.icd file.
#2. In /etc/OpenCL/vendors I found a pocl.icd file pointing to cygpocl-2.dll. I assume this is the pthread library. But it seems to run only a single thread, although I have 4 cores. Should I modify anything to make it run in multiple threads? I debugged the code, and it seems that because only one device is found, it runs on only one thread. The initialization function sets the device_work_size property so that each device processes one strip of the final bmp. But as there is only one device, the whole bmp is processed in one run (clEnqueueNDRangeKernel and clEnqueueReadBuffer are each called once).
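For reference, the strip-per-device split described in #2 might look roughly like the sketch below. This is not the code from the GitHub program: strip_t and split_rows are hypothetical names, and splitting by image rows is an assumption about how device_work_size is derived. With a single device the whole image ends up in one strip, which is consistent with seeing only one kernel launch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type: one horizontal strip of the image per device. */
typedef struct {
    size_t first_row;  /* first image row handled by this device */
    size_t num_rows;   /* number of rows in this device's strip  */
} strip_t;

/* Hypothetical helper: split `height` rows into `num_devices`
 * contiguous strips. Leftover rows (height % num_devices) are
 * handed out one each to the leading strips. */
static void split_rows(size_t height, size_t num_devices, strip_t *out)
{
    assert(num_devices > 0);  /* guard against division by zero */

    size_t base  = height / num_devices;
    size_t extra = height % num_devices;
    size_t row   = 0;

    for (size_t i = 0; i < num_devices; ++i) {
        out[i].first_row = row;
        out[i].num_rows  = base + (i < extra ? 1 : 0);
        row += out[i].num_rows;
    }
}
```

Calling split_rows(height, 1, strips) yields a single strip covering every row. With 4 devices and 10 rows, the strips get 3, 3, 2 and 2 rows.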
UPDATE
I have installed Intel(R) Graphics – Windows* DCH Drivers, which installed the graphics drivers. I found intelocl64.dll (as /cygdrive/c/Program Files (x86)/Common Files/Intel/Shared Libraries/intel64/intelocl64.dll). I put this whole path into the /etc/OpenCL/vendors/intel.icd file. So far, so good. Now it cannot even find the pthread device... Bah...

I don't think it is possible to use the GPU from Cygwin. I would recommend either building a native Windows binary (e.g. with Visual Studio or Intel DPC++) or using WSL. See https://github.com/intel/compute-runtime/blob/master/WSL.md for requirements.

Related

SyCL ComputeCpp: issues with the matrix_multiply SDK example

I just managed to install SyCL ComputeCpp + OpenCL (from CUDA) and to run cmake to generate the VS2019 sln for the samples.
I've tried to run the matrix_multiply example ONLY, for now.
It ran successfully using the Intel FPGA emulator as a default device.
Changing the devices to the Device CPU worked well as well.
Choosing the host device took ages and never exited.
When I tried to change the device to the nVidia one, the GeForce GTX 1650 Ti,
I got this exception error from ComputeCpp: RT0100, etc etc.
Googling a bit, I found I'd probably have to output the PTX instead of the SPIR.
So I regenerated the sln using -DCOMPUTECPP_BITCODE=ptx64
After doing that, the kernel ran successfully on the nVidia GPU.
My first question is: is that needed because nVidia does NOT support SPIR at the time of this writing, only PTX?
However this broke the other devices, which are now reporting:
[ComputeCpp:RT0107] Failed to create program from binary
This now happens for all the other devices: Intel GPU, Device CPU, Device FPGA (which were formerly working).
Inspecting the .sycl I found now SYCL_matrix_multiply_cpp_bin_nvptx64[].
My question is: how can I support nVidia with PTX and "normal" devices with SPIR together in the same exe? I made a menu from which the user can choose a device to play with, but now it works only for nVidia.
What am I doing wrong, please?
I would expect to be able to run the same .sycl code on all the devices regardless of whether it contains PTX or SPIR. How can I manage that?
EDIT: I just tried to retarget the bitcode to spirv64, since computecpp_info told me all my devices are supposed to support it.
However, now no device works at all with that setting :-(

Simple bootloader for running Linux kernel on a simulator

We have built a simple instruction set simulator for the sparc v8 processor. The model consists of a v8 processor, a main memory and a character input and a character output device. Currently I am able to run simple user-level programs on this simulator which are built using a cross compiler and placed in the modeled main memory directly.
I am trying to get a Linux kernel to run on this simulator by building the simplest possible bootloader. (I'm considering uClinux, which is made for MMU-less systems.) The uncompressed kernel and the filesystem are both assumed to be present in the main memory itself, and all that my bootloader has to do is pass the relevant information to the kernel and jump to the start of the kernel code. I have no experience in OS development or porting Linux.
I have the following questions :
What is the bare minimum information that a bootloader has to supply to the kernel?
How to pass this information?
How to point the kernel to use my custom input/output devices?
There is some documentation available for porting Linux to ARM boards, and from this documentation, it seems that the bootloader passes information about the size of RAM etc. via a data structure called ATAGS. How is it done in the case of a Sparc processor? I could not find much documentation for Sparc on the internet. There exists a Linux bootloader for the Leon3 implementation of Sparc v8, but I could not find the specific information I was looking for in its code.
I will be grateful for any links that explain the bare minimum information to be passed to a kernel and how to pass it.
Thanks,
-neha

Offline compilation for AMD and NVIDIA OpenCL Kernels without cards installed

I was trying to figure out a way to perform offline compilation of OpenCL kernels without installing graphics cards. I have installed the SDKs.
Does anyone have any experience with compiling OpenCL kernels without having the graphics cards installed, for either NVIDIA or AMD?
I had asked a similar question on the AMD forums (http://devgurus.amd.com/message/1284379).
The NVIDIA forums have been inaccessible for a long time, so I couldn't get any help from there.
Thanks
AMD has an OpenCL extension for compiling binaries for devices that are not present on the system. The extension is called cl_amd_offline_devices. Pass the property CL_CONTEXT_OFFLINE_DEVICES_AMD when creating a context, and all of AMD's supported devices are reported and can be used to create binaries as if they were present on the system.
Check out their OpenCL programming guide at http://developer.amd.com/tools/hc/AMDAPPSDK/assets/AMD_Accelerated_Parallel_Processing_OpenCL_Programming_Guide.pdf for more info
There is no need for a graphics card; you can compile OpenCL programs for the CPU too. This works if you have an Intel or AMD CPU. Download the latest OpenCL SDK from the corresponding manufacturer's website and compile the OpenCL program:
Intel OpenCL SDK
AMD APP

cuda nvcc cross compiler

I want to compile CUDA code on a Mac but have the executable run on Windows.
Is there a way to set up an nvcc CUDA cross compiler?
The problem is that my Windows desktop will be inaccessible for a while due to traveling, but I do not want to waste time waiting until I get back to compile the code. If I have to wait, I lose the time I could have spent debugging the code and making sure it compiles correctly. My Mac is not equipped with CUDA-capable hardware, though.
The short answer is no, it is not possible.
It is a common misconception, but nvcc isn't actually a compiler. It is a compiler driver, and it relies heavily on the host C++ compiler to steer compilation of both host and device code. To compile CUDA for Windows, you must use the Microsoft C++ compiler. That compiler can't be run on Linux or OS X, so cross compilation to a Windows target is not possible unless you are doing the compilation on a Windows host (so 32/64 bit cross compilation is possible, for example).
The other two CUDA platforms are equally incompatible, despite requiring gcc for compilation, because the back ends are different (Linux is an elf platform, OS X is a mach platform), so even cross compilation between OS X and Linux isn't possible.
You have two choices if compilation on the OS X platform is the goal:
Install the OS X toolkit. Even though your hardware doesn't have a compatible GPU, you can still install the toolkit and compile code.
Install the Windows toolkit and Visual Studio inside a virtual Windows installation (or a physical Boot Camp installation), and compile code inside Windows on the Mac. Again, you don't need NVIDIA compatible hardware to do this.
If you want to run code without a CUDA GPU, there are a non-commercial option (GPU Ocelot) and a commercial option (PGI CUDA-x86) you could investigate.

Install AMD OpenCL CPU driver with an Nvidia graphic card

I have seen this question many times but never found an answer for Windows.
I recently ported my CUDA code to OpenCL.
When testing with an ATI card, the Catalyst drivers contain a CPU OpenCL driver, hence I can run the OpenCL code on the CPU.
When testing with an NVIDIA card, there is no driver for the CPU.
Question is: how can I install (and deploy) a CPU driver when running with an Nvidia card?
Thanks a lot
To use OpenCL on the CPU you don't need any driver; you only need an OpenCL runtime that supports the CPU, which (in the case of AMD/ATI) is part of the APP SDK. It can be installed no matter what GPU you have. Your end users would also have to install the APP SDK: currently, there is no way to install the OpenCL runtime only.
If you have an Intel CPU, you had better try the Intel OpenCL SDK, which has a separate installer. The AMD APP SDK works quite well on Intel CPUs, but not vice versa.

Resources