What compilers support CUDA - visual-studio-2010

I found some problem with Visual Studio. My project that use openMP multithreading was twice slow on Visual Studio 2010, than on Dev-C++ , Now I wrote my other project that uses CUDA technology , I think that my project works slow because of Visual Studio, so I need some other compiler that will support CUDA , my questions are:
is Dev-C++ support CUDA?
what compilers support CUDA except Visual Studio?
if there are a lot compilers supporting CUDA what will give best speed for application?

The CUDA Toolkit Release Notes list the supported platforms and compilers.

Well I think it's the other way around. The thing is, there is a driver called nvcc. it generates device code and host code and sends the host code to a compiler. It should be a C compiler and it should be in the executable path. (EDIT: and it should be gcc on Linux and cl on Windows and I think I should ignore mac as the release note did(?))
nvcc Compiler Info reads:
A general purpose C compiler is needed by nvcc in the following
situations:
During non-CUDA phases (except the run phase), because these phases will be forwarded by nvcc to this compiler
During CUDA phases, for several preprocessing stages (see also 0). On Linux platforms, the compiler is assumed to be ‘gcc’, or ‘g++’ for linking. On Windows platforms, the compiler is assumed to be ‘cl’. The
compiler executables are expected to be in the current executable
search path, unless option -compiler-bin-dir is specified, in which
case the value of this option must be the name of the directory in
which these compiler executables reside.
And please don't talk like that about compilers. Your code is in a way that works better with Dev-C++. What is generated is an assembly code. I don't say that they don't make any difference, but maybe 4 to 5%, not 100%.
And absolutely definitely don't blame the compiler for your slow program. It is definitely because of inefficient memory access and incorrect use of different types of memory.

Related

Clang or GCC compiler for c++ 11 compatibility programming on Windows?

I was wondering which compiler is better to use on Windows OS (8.1) in temrs of compatibility to c++11's (and later 14) functions, liberies and features (like lambdas) and is also comfortable to use (less bugs).
I am a university student hence I'm not looking at the subject product-wise (even though I do like to code a bit more than just projects for my studies).
I am currently using eclipse luna IDE if it matters.
Notice that compiler != IDE.
VC++ is one of the most populars on Windows and depending on its version it has a good support for C++11 features. Check the list on the msdn blog to find out if there's everything you need.
Gcc is also ported to Windows and you can install MinGW to use it (4.8.1.4 at the moment of writing this). It is pretty complete on C++11.
Clang is also available for the Windows platform and it is also complete on C++11 support (plus it has good diagnostic messages), but notice that you will have to use another linker since clang doesn't ship with one (although there is an ongoing effort to write it: http://lld.llvm.org/)
All the compilers I cited above are pretty stable but, based on my experience, if you're looking for latest and greatest C++11/14/17 features, you might just want to go for gcc or clang (VC++ is slower in adding support for newest features and the compiler is undergoing a huge update to modernize). Just keep in mind that these are compilers and not just IDEs, an IDE is a front-end supporting program that uses a compiler undercover to compile files.
To set up a C++11 compiler, I suggest installing MSYS2, it has a package manager (pacman) that can install fresh versions of GCC, GDB, Clang and many libraries like SDL, Lua etc. Very easy to use too.
As far as GCC vs CLang goes - I really tried hard to make CLang work (which is presumably faster and more friendly than GCC - produces better warnings, etc.), but failed. Issues were that CLang (which comes with MSYS2) is hard-coded to use GCC linker which produces some strange linker errors when using libstdc++ (std implementation from GCC). libc++ (a new implementation designed to work with CLang) didn't worked for me on Windows either.
So you either try build CLang from sources and hope that some configuration will work with C++11 library, OR just stick with GCC which works just fine out of the box.
As IDE, I suggest to take a look at CLion. It is very comfortable (infinitely more user-friendly and intuitive than Visual Studio, IMO). Just install it and point it to the mingw64 (or mingw32) folder of MSYS2, it will auto-detect everything for you.
It only works with CMake projects though.

Effect of visual studio compiler settings on performance of CUDA kernels

I get about 3-4x times difference in computation time of a same CUDA kernel compiled on two different machines. Both versions run on a same machine and GPU device. The direct conclusion explaining the difference is different compiler settings. Although there is no single perfect setting and the tuning should be customized depending on the kernel, I wonder if there is any clear guideline for helping to choose the right settings. I use Visual Studio 2010. Thank you.
Compile in release mode, not debug mode, if you want fastest performance. The -G switch passed to the nvcc compiler will usually have a negative effect on GPU code performance.
It's generally recommended to select the right architecture for the GPU you are compiling for. For example, if you have a cc 2.1 capability GPU, make sure that setting (sm_21, in GPU code settings) is being passed to the compiler. There are some counter examples to this (e.g. compiling for cc 2.0 seems to run faster, etc.) but as a general recommendation, it is best.
Use the latest version of CUDA (compiler). This is especially important when using GPU libraries (CUFFT, CUBLAS, etc.) (yes, this is not really a compiler setting)

ARM and GCC Compiling

Hopefully this hasn't been asked and answered already, but I just had a quick question on ARM.
Specifically, if when compiling Android (which has a lot of C and C++), you use GCC to compile, doesn't that create x86 based code? How is it that an ARM processor, which uses a reduced instruction set, can interpret this code and run like it does?
Thanks!
GCC doesn't just compile for x86. It actually compiles to any instruction set. If you wanted to you could create a new one just by adding a few files.
And ARM isn't a reduced instruction set. Its a completely different instruction set. There's some things ARM has that x86 doesn't and vice versa.
Building gcc goes through a configuration step, part of this is to specify a back-end. The back-end is responsible for op-code generation. The typical compiler is many phases. Briefly,
Parser - convert text to a data representation.
Front end - Optimize by changing code constructs, possibly language specific.
Middle end - Performs computer science optimization that are common to any compiler.
Back end - Performs optimization specific to the target CPU.
See stackoverflow compiler wiki for more.
So parts one to three are common for the x86 and the ARM versions of gcc (or any gcc). The Android compiler is a version of gcc which has been configured to generate ARM code. It is a different compiler than the one that normally runs on an x86. You maybe running an ARM emulator on a PC and then believe that this code is run by the x86. However, this is a virtual ARM machine running this code. An x86 processor can not run ARM code natively.
The Android gcc is an ARM configured gcc. A normal Linux distributions gcc is configured for an x86 or x86_64.
Something is missing above: Who compiles the compiler? In both cases, an x86 compiler compile the new compiler. The difference is the selected back-end. One is x86, the other ARM. Both compilers run on an x86, but they generate code for different targets. Gcc can only generate code for an ARM or an x86; never both via any sort of command line switch. A compiler build usually refers to three different CPU types.
Build - Machine where the compiler is built. This is the compiler's compiler.
Host - the machine the compiler runs on. Not it's output, but the compiler itself.
Target - the machine the back-end targets. The one code is generated for.
I think maybe people are thinking because they both run on the same host, they must generate code for the same target. But that is not true; it is a little mind bending at first. Depending on the setup, you may need compilers for each of these machines to make a final compiler.
The first compiler for any machine is usually a cross compiler. Except for some people who made primitive compilers long ago in assembler.
See also: Cross compiler.
to put it simply, when you're building for ARM on your x86 computer you're using a cross-compiler - a compiler that runs on one platform but generates code for another. This is extremely common when developing for embedded or mobile platforms.

Performance comparison between Windows gcc compiled & Visual Studio compiled

I'm currently compiling an open source optimization library (native C++) supplied with makefiles for use with gcc. As I am a Windows user, I'm curious on the two options I see of compiling this, using gcc with MinGW/Cygwin or manually building a Visual Studio project and compiling the source.
1) If I compile using MinGW/Cygwin + gcc, will the resulting .lib (static library) require any libraries from MinGW/Cygwin? I.e. can I distribute my compiled .lib to a Windows PC that doesn't have MinGW/Cygwin and will it still run?
2) Other than performance differences between the compilers themselves, is there an overhead associated when compiling using MinGW/Cygwin and gcc - as in does the emulation layer get compiled into the library, or does gcc build a native Windows library?
3) If speed is my primary objective of the library, which is the best method to use? I realise this is quite open ended, and I may be best running my own benchmarks, but if someone has experience here this would be great!
The whole point of Cygwin is the Linux emulation layer, and by default (ie if you don't cross-compile), binaries need cygwin1.dll to run.
This is not the case for MinGW, which creates binaries as 'native' as the ones from MSVC. However, MinGW comes with its own set of runtime libraries, in particular libstdc++-6.dll. This library can also be linked statically by using -static-libstdc++, in which case you also probably want to compile with -static-libgcc.
This does not mean that you can freely mix C++ libraries from different compilers (see this page on mingw.org). If you do not want to restrict yourself to an extern "C" interface to your library, you most likely will have to choose a single compiler and stick with it.
As to your performance concerns: Using Cygwin only causes a (minor?) penalty when actually interacting with the OS - where raw computations are concerned, only the quality of the optimizer matters.

GCC Produced Dlls Not Compatible with Intel Visual Fortran?

I used gcc to compile a few fortran source files into *.lib and *.dll on Windows platform, using the latest version of mingw . The gcc used is version 3. The result of the output is arpack_win32.dll, blas_win32.dll and lapack_win32.dll.
I then want to compile sssimp.f against the arpack_win32.dll, blas_win32.dll and lapack_win32.dll using Intel visual fortran compiler for Windows, because sssimp.f uses those dlls. But I got the impression ( from Intel support forum) that this is not possible.
Is my impression correct? Or is it that as long as I can produce the underlying libs and dlls ( no matter in which compiler and how old it is), I can use them as my base libs and dlls, and I can link to them from any, modern or old, compiler?
g77 uses a different ABI than IVF, yes. So unless IVF has some g77/f2c compatibility option it's not going to work.
The easiest solution for you is probably to use IVF to compile the libraries too.
As already pointed out, mixing compilers with different calling conventions is likely to be very difficult.
That answer on the Intel Forum pointed out a version of arpack translated to Fortran 90 -- http://people.sc.fsu.edu/~burkardt/f_src/arpack/arpack.html -- can you use that? Also see http://people.sc.fsu.edu/~burkardt/f_src/lapack/lapack.html and http://people.sc.fsu.edu/~burkardt/f_src/blas1_s/blas1_s.html
Or Intel Visual Fortran should be able to compile Fortran 77 using suitable compiler options. What language constructs is it rejecting?

Resources