Proper way of compiling OpenCL applications and using available compiler options - gcc

I am a newbie to OpenCL.
What is the best way to compile an OpenCL project?
Using a supported compiler (GCC or Clang):
Looking at the Khronos OpenCL 1.2 manual, there are a few optimization options provided for clBuildProgram. When we use a compiler like gcc or clang, how do we control these options? Do they have to be set inside the source code, or can we pass them on the command line as in the normal compilation flow? For example:
gcc|clang -O3 -I<INCLUDES> OpenCL_app.c -framework OpenCL OPTION -lm
Actually, I tried this and received an error:
gcc: error: unrecognized command line option '<OPTION>'
Alternatively, using openclc:
I have seen people using openclc to compile from a Makefile.
I would like to know which is the best way (if there are actually two separate ways), and how we control the use of the different compile-time options.

You might be aware of this already, but it is important to reiterate. The OpenCL standard contains two things:
The OpenCL C language and programming model (I think recent versions of the standard include some C++)
The OpenCL host library to manage devices
gcc and clang are compilers for the host side of your OpenCL project. So there is no way to provide options for compiling OpenCL device code through a host compiler, since in that role it is not aware of OpenCL at all.
The exception is clang, which has a flag that makes it accept OpenCL device code, i.e. the .cl file that contains the kernels. That way you can use clang and also pass flags and options, if I remember correctly, but the output is then either LLVM IR or SPIR, not a device executable object. You can then load the SPIR object onto a device using the device's run-time environment (the OpenCL drivers).
You can check out these links:
Using Clang to compile kernels
Llvm IR generation
SPIR
Another alternative is to use the tools provided by your target platform. Each vendor that claims to support OpenCL should have a run-time environment. Usually, they have separate CLI tools to compile OpenCL device code. In your case (I guess) you have drivers from Apple, therefore you have openclc.
Intel CLI as an example
Now to your main question (the best way to compile OpenCL). It depends on what you want to do. You didn't specify what kind of requirements you have, so I had to speculate.
If you want off-line compilation without a host program, the considerations above will help you. Otherwise, you have to use the OpenCL library and compile your kernels on-line, which is generally preferred for products that need portability: if you compile all your kernels at the start of your program, you directly use the environment provided on the machine and you don't need to ship libraries for each target platform.
Therefore, if you have an OpenCL project, you have to decide how to compile. If you really want to use the generic flags and not rely on third-party tools, I suggest you write a class that builds your kernels and provides the flags you want.

...how do we control these options? Do they have to be set inside the source code, or can we pass them on the command line as in the normal compilation flow?
Options can be set inside the source code. For example:
const char options[] = "-cl-finite-math-only -cl-no-signed-zeros";
/* Build program */
err = clBuildProgram(program, 1, &device, options, NULL, NULL);
I have never seen OpenCL options being specified on the command line, and I don't know whether this is possible.
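Although the host compiler does not understand OpenCL build options, the host program can read them from its own command line and forward them to clBuildProgram as the options string. The following is only a minimal sketch of that idea: the helper name `join_build_options` is hypothetical (it is not part of OpenCL), and it does nothing more than join the arguments with spaces.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not an OpenCL API: joins argv[first..argc-1]
   with single spaces into buf so the result can be passed as the
   options argument of clBuildProgram.
   Returns 0 on success, -1 if buf is too small. */
int join_build_options(int argc, char **argv, int first,
                       char *buf, size_t buflen)
{
    size_t used = 0;

    if (buflen == 0)
        return -1;
    buf[0] = '\0';

    for (int i = first; i < argc; ++i) {
        size_t len = strlen(argv[i]);
        size_t sep = (used > 0) ? 1 : 0;  /* space before all but the first */

        /* room needed: separator + option + terminating NUL */
        if (used + sep + len + 1 > buflen)
            return -1;
        if (sep)
            buf[used++] = ' ';
        memcpy(buf + used, argv[i], len);
        used += len;
        buf[used] = '\0';
    }
    return 0;
}
```

In a host program, the filled buffer would take the place of the hard-coded `options` array in the clBuildProgram call above, so that running the application with `-cl-finite-math-only -cl-no-signed-zeros` as arguments selects the same build options without recompiling the host code.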

Related

Is it possible to use the FTD2XX.lib (x64 static) library with MSYS2 application

I have a Windows application written using MSYS2 and I need to statically link the FTD2XX library in order to access some FTDI devices.
After much testing and Googling I have not been able to get the GCC linker to use the FTD2XX.lib library.
Is it possible to use the FTD2XX library from FTDI with MSYS2 (GCC) compiler and linker?
I don't know anything about your compiler, but I use the D2XX libraries all the time with the Lahey/Fujitsu FORTRAN compiler. All I need to do is wrap the parameters for C calls (i.e. explicitly tell the compiler when to pass by value and when to pass by pointer; FORTRAN always passes by pointer).
I would suggest looking at your settings for parameter passing, and make sure that everything matches up. The D2XX manual is very good at explicitly telling you the expected format for the library calls. Good luck.

Portable method to package C++11 program sources

So, C++11 has been around for a while and, given that there already are compilers supporting it on most platforms, it would be nice to use it in some real software, e.g. software that can be shipped in an as-portable-as-possible package, preferably providing ./configure and so on.
Because both Clang and GCC currently need the -std=c++11 flag to compile C++11 source, and both sometimes require specific flags to work correctly (see for example How to compile C++11 with clang 3.2 on OSX lion? or C++11 Thread not working ), I'm quite afraid that the package won't work on some platforms that already support C++11 because the compiler is invoked incorrectly.
Q: Is there a standard way to compile C++11 correctly and portably? E.g. an autotools/autoconf check, or some list of compiler/platform directives that describes all the options that may be needed? Or does the situation come from the fact that C++11 implementations are currently marked as "experimental", so that the standard will eventually stabilize and become the default choice, with no extra compiler flags needed?
Thanks
-exa
Well, if you're trying to write portable code, I would recommend using cmake, a very powerful cross-platform, open-source build system.
Using cmake you should be able to identify the compilers available on your current machine and then generate your makefiles using the flags that you want in each case.
I have been using cmake for almost a year now and it has significantly reduced the time it takes to get a project compiling on different platforms.
I'm using CMake to generate Makefiles for C++11 projects. The only change I need to make in CMakeLists.txt is to add the following:
ADD_DEFINITIONS("-std=gnu++11")
ADD_DEFINITIONS("-D_GLIBCXX_USE_C99_STDINT_TR1")
ADD_DEFINITIONS("-D_GLIBCXX_HAS_GTHREADS")
However, as I use Qt, I recompiled the QtSDK with a newer gcc, version 4.8, and so got a complete mingw system that uses gcc 4.8.
With these changes, the project compiles and runs on Windows XP, Windows 7 and Linux, both 32 and 64 bits. I haven't tested it on OSX yet.

ARM and GCC Compiling

Hopefully this hasn't been asked and answered already, but I just had a quick question on ARM.
Specifically, if when compiling Android (which has a lot of C and C++), you use GCC to compile, doesn't that create x86 based code? How is it that an ARM processor, which uses a reduced instruction set, can interpret this code and run like it does?
Thanks!
GCC doesn't just compile for x86. It can compile for many different instruction sets. If you wanted to, you could add support for a new one by adding a back-end for it.
And "reduced instruction set" doesn't mean a subset of x86. ARM is a completely different instruction set. There are some things ARM has that x86 doesn't, and vice versa.
Building gcc goes through a configuration step, and part of this is to specify a back-end. The back-end is responsible for op-code generation. The typical compiler has many phases. Briefly,
Parser - convert text to a data representation.
Front end - Optimize by changing code constructs, possibly language specific.
Middle end - Performs computer science optimization that are common to any compiler.
Back end - Performs optimization specific to the target CPU.
See stackoverflow compiler wiki for more.
So parts one to three are common to the x86 and the ARM versions of gcc (or any gcc). The Android compiler is a version of gcc which has been configured to generate ARM code. It is a different compiler from the one that normally runs on an x86. You may be running an ARM emulator on a PC and then believe that this code is run by the x86. However, it is a virtual ARM machine that runs this code. An x86 processor cannot run ARM code natively.
The Android gcc is an ARM-configured gcc. A normal Linux distribution's gcc is configured for x86 or x86_64.
Something is missing above: who compiles the compiler? In both cases, an x86 compiler compiles the new compiler. The difference is the selected back-end. One is x86, the other ARM. Both compilers run on an x86, but they generate code for different targets. A given gcc build can only generate code for ARM or for x86; never both via any sort of command line switch. A compiler build usually refers to three different CPU types.
Build - Machine where the compiler is built. This is the compiler's compiler.
Host - The machine the compiler runs on. Not its output, but the compiler itself.
Target - The machine the back-end targets. The one code is generated for.
I think people may assume that because both compilers run on the same host, they must generate code for the same target. But that is not true; it is a little mind-bending at first. Depending on the setup, you may need compilers for each of these machines to make a final compiler.
The first compiler for any machine is usually a cross compiler, except for the primitive compilers that some people wrote long ago in assembler.
See also: Cross compiler.
To put it simply, when you're building for ARM on your x86 computer you're using a cross-compiler: a compiler that runs on one platform but generates code for another. This is extremely common when developing for embedded or mobile platforms.
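One way to see the host/target distinction is to ask the compiler which target it is generating code for. The sketch below assumes GCC or Clang, which predefine the architecture macros it tests; the function names are only illustrative.

```c
#include <stdio.h>

/* Returns a name for the architecture this file was compiled FOR.
   The macros tested here are predefined by GCC and Clang for the
   corresponding targets; other compilers may use different names. */
const char *target_arch(void)
{
#if defined(__aarch64__)
    return "arm64";
#elif defined(__arm__)
    return "arm";
#elif defined(__x86_64__)
    return "x86_64";
#elif defined(__i386__)
    return "x86";
#else
    return "unknown";
#endif
}

/* Prints the target the compiler generated code for, which is not
   necessarily the machine the compiler itself ran on. */
void print_target(void)
{
    printf("compiled for: %s\n", target_arch());
}
```

The choice is made at compile time by the preprocessor, so the same source built with a native x86_64 gcc and with an ARM-configured cross gcc yields two binaries that report different targets, even though both compilers ran on the same x86 host.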

What compilers support CUDA

I ran into a problem with Visual Studio. My project that uses OpenMP multithreading was twice as slow with Visual Studio 2010 as with Dev-C++. Now I have written another project that uses CUDA, and I think it runs slowly because of Visual Studio, so I need some other compiler that supports CUDA. My questions are:
Does Dev-C++ support CUDA?
What compilers support CUDA besides Visual Studio?
If there are many compilers supporting CUDA, which one will give the best speed for the application?
The CUDA Toolkit Release Notes list the supported platforms and compilers.
Well, I think it's the other way around. The thing is, there is a driver called nvcc. It generates device code and host code and sends the host code to a compiler. That compiler should be a C compiler and it should be in the executable path. (EDIT: and it should be gcc on Linux and cl on Windows, and I think I should ignore Mac as the release notes did(?))
nvcc Compiler Info reads:
A general purpose C compiler is needed by nvcc in the following situations:
During non-CUDA phases (except the run phase), because these phases will be forwarded by nvcc to this compiler
During CUDA phases, for several preprocessing stages (see also 0).
On Linux platforms, the compiler is assumed to be ‘gcc’, or ‘g++’ for linking. On Windows platforms, the compiler is assumed to be ‘cl’. The compiler executables are expected to be in the current executable search path, unless option -compiler-bin-dir is specified, in which case the value of this option must be the name of the directory in which these compiler executables reside.
And please don't talk like that about compilers. Your code happens to be written in a way that works better with Dev-C++. What either compiler generates is assembly code. I'm not saying that compilers make no difference, but it is maybe 4 to 5%, not 100%.
And definitely don't blame the compiler for your slow program. It is far more likely caused by inefficient memory access and incorrect use of the different types of memory.

Performance comparison between Windows gcc compiled & Visual Studio compiled

I'm currently compiling an open source optimization library (native C++) supplied with makefiles for use with gcc. As I am a Windows user, I'm curious about the two options I see for compiling this: using gcc with MinGW/Cygwin, or manually building a Visual Studio project and compiling the source.
1) If I compile using MinGW/Cygwin + gcc, will the resulting .lib (static library) require any libraries from MinGW/Cygwin? I.e. can I distribute my compiled .lib to a Windows PC that doesn't have MinGW/Cygwin and will it still run?
2) Other than performance differences between the compilers themselves, is there an overhead associated when compiling using MinGW/Cygwin and gcc - as in does the emulation layer get compiled into the library, or does gcc build a native Windows library?
3) If speed is my primary objective of the library, which is the best method to use? I realise this is quite open ended, and I may be best running my own benchmarks, but if someone has experience here this would be great!
The whole point of Cygwin is the Linux (POSIX) emulation layer, and by default (i.e. if you don't cross-compile), binaries need cygwin1.dll to run.
This is not the case for MinGW, which creates binaries as 'native' as the ones from MSVC. However, MinGW comes with its own set of runtime libraries, in particular libstdc++-6.dll. This library can also be linked statically by using -static-libstdc++, in which case you also probably want to compile with -static-libgcc.
This does not mean that you can freely mix C++ libraries from different compilers (see this page on mingw.org). If you do not want to restrict yourself to an extern "C" interface to your library, you most likely will have to choose a single compiler and stick with it.
As to your performance concerns: Using Cygwin only causes a (minor?) penalty when actually interacting with the OS - where raw computations are concerned, only the quality of the optimizer matters.
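Because libraries built by different toolchains should not be mixed freely, it can help to make the toolchain visible in the code. The following is a minimal sketch that relies on the macros predefined by the Cygwin, MinGW and Microsoft compilers; the function name `windows_toolchain` is hypothetical.

```c
/* Reports which toolchain compiled this file.
   __CYGWIN__ is predefined by Cygwin's gcc, __MINGW32__ by MinGW
   (32- and 64-bit), and _MSC_VER by Microsoft's cl. */
const char *windows_toolchain(void)
{
#if defined(__CYGWIN__)
    return "cygwin";   /* binaries depend on cygwin1.dll by default */
#elif defined(__MINGW32__)
    return "mingw";    /* native Windows binaries, MinGW runtime */
#elif defined(_MSC_VER)
    return "msvc";     /* native Windows binaries, MSVC runtime */
#else
    return "other";
#endif
}
```

A library could log this value at start-up, or use the same macros in a header to reject a build with an unexpected compiler, so that a mismatch shows up early instead of as a link or run-time error.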

Resources