What are __CUDABE__ and __CUDA_LIBDEVICE__ for? - gcc

Let's say I'm interested in preprocessing (with gcc) hpp/cpp files that include CUDA kernel declarations. I want the preprocessor not to strip the __global__ specifier; otherwise I wouldn't be able to link against the definition in the .cu file.
For instance, a file t1.hpp:
__global__ void foo(int* v, int n);
And preprocess with:
gcc -E t1.hpp -I/usr/local/cuda/include -include cuda_runtime.h
But the result drops __global__:
...
# 1888 "/usr/local/cuda/include/cuda_runtime.h"
#pragma GCC diagnostic pop
# 1 "<command-line>" 2
# 1 "t1.hpp"
void foo();
But if I define __CUDABE__ (on CUDA 8.0) or __CUDA_LIBDEVICE__ (on CUDA 9.0+), I am able to keep that information:
gcc -E t1.hpp -I/usr/local/cuda/include -include cuda_runtime.h -D__CUDABE__
Final result:
...
# 1888 "/usr/local/cuda/include/cuda_runtime.h"
#pragma GCC diagnostic pop
# 1 "<command-line>" 2
# 1 "t1.hpp"
__attribute__((global)) void foo();
So my question is: what are __CUDABE__ and __CUDA_LIBDEVICE__ for, and what could the side effects of defining them be?
I've also seen that clang defines those macros in __clang_cuda_runtime_wrapper.h. Is this, then, something safe to do?

Since it is not documented anywhere, it is presumably an internal flag (which, as you've noticed, can change between CUDA releases), so you probably shouldn't rely on it. It is defined in crt/host_defines.h, which is not well documented, so I cannot tell exactly what it means.
Is there any reason why you cannot preprocess the file with nvcc?
This should do what you want, and it invokes gcc with correct parameters (at least on my system):
nvcc -E --x=cu t1.hpp
If you cannot use nvcc for whatever reason, you can always call it in verbose mode (nvcc -E -v --x=cu t1.hpp) and see which flags it passes to gcc. On my Linux system with CUDA 9.1 I get:
gcc -std=c++14 -D__CUDA_ARCH__=300 -E -x c++ \
-DCUDA_DOUBLE_MATH_FUNCTIONS -D__CUDACC__ \
-D__NVCC__ "-I/opt/cuda/bin/..//include" \
-D"__CUDACC_VER_BUILD__=85" -D"__CUDACC_VER_MINOR__=1" \
-D"__CUDACC_VER_MAJOR__=9" -include "cuda_runtime.h" \
-m64 "t1.hpp"
However, you'll probably have to do it for each CUDA version you want to use, as these flags can change.

Related

Passing multiple -std switches to g++

Is it safe to assume that running g++ with
g++ -std=c++98 -std=c++11 ...
will compile using C++11? I haven't found an explicit confirmation in the documentation, but I see the -O flags behave this way.
The GCC manual doesn't state that the
last of any mutually exclusive -std=... options specified takes effect. The first occurrence
or the last occurrence are the only alternatives. There are numerous
GCC flags that take mutually exclusive alternative values from a finite set - mutually
exclusive, at least modulo the language of a translation unit. Let's call them mutex options for short.
It is a seemingly random rarity for it to be documented that the last setting takes effect. It is
documented for the -O options as you've noted, and in general terms for mutually exclusive warning options, perhaps
others. It's never documented that the first of multiple settings takes effect, because
it's never true.
The documentation leans - with imperfect consistency - on the historical conventions
of command usage in Unix-like OSes. If a command accepts a mutex option
then the last occurrence of the option takes effect. If the command were - unusually -
to act only on the first occurrence of the option then it would be a bug for
the command to accept subsequent occurrences at all: it should give a usage error.
This is custom and practice. The custom facilitates scripting with tools that
respect it, e.g. a script can invoke a tool passing a default setting of some
mutex option but enable the user to override that setting via a parameter of the script,
whose value can simply be appended to the default invocation.
In the absence of official GCC documentation to the effect you want, you might get
reassurance by attempting to find any GCC mutex option for which it is not
the case that the last occurrence takes effect. Here's one stab:
I'll compile and link this program:
main.cpp
#include <cstdio>
#if __cplusplus >= 201103L
static const char * str = "C++11";
#else
static const char * str = "Not C++11";
#endif
int main()
{
printf("%s\n%d\n",str,str); // Format `%d` for `str` mismatch
return 0;
}
with the commandline:
g++ -std=c++98 -std=c++11 -m32 -m64 -O0 -O1 -g3 -g0 \
-Wformat -Wno-format -o wrong -o right main.cpp
which requests contradictory option pairs:
-std=c++98 -std=c++11: Conform to C++98. Conform to C++11.
-m32 -m64: Produce 32-bit code. Produce 64-bit code.
-O0 -O1: Do not optimize at all. Optimize to level 1.
-g3 -g0: Emit maximum debugging info. Emit no debugging info.
-Wformat -Wno-format: Sanity-check printf arguments. Don't sanity-check them.
-o wrong -o right: Output program wrong. Output program right.
It builds successfully with no diagnostics:
$ echo "[$(g++ -std=c++98 -std=c++11 -m32 -m64 -O0 -O1 -g3 -g0 \
-Wformat -Wno-format -o wrong -o right main.cpp 2>&1)]"
[]
It outputs no program wrong:
$ ./wrong
bash: ./wrong: No such file or directory
It does output a program right:
$ ./right
C++11
-1713064076
which tells us it was compiled to C++11, not C++98.
The bug exposed by the garbage -1713064076 was not diagnosed because
-Wno-format, not -Wformat, took effect.
It is a 64-bit, not 32-bit executable:
$ file right
right: ELF 64-bit LSB shared object, x86-64 ...
It was optimized -O1, not -O0, because:
$ echo "[$(nm -C right | grep str)]"
[]
shows that the local symbol str is not in the symbol table.
And it contains no debugging information:
$ echo "[$(readelf --debug-dump right)]"
[]
as per -g0, not -g3.
Since GCC is open-source software, another way of resolving doubts
about its behaviour that is available to C programmers, at least,
is to inspect the relevant source code, available via git source-control at
https://github.com/gcc-mirror/gcc.
The relevant source code for your question is in file gcc/gcc/c-family/c-opts.c,
function,
/* Handle switch SCODE with argument ARG. VALUE is true, unless no-
form of an -f or -W option was given. Returns false if the switch was
invalid, true if valid. Use HANDLERS in recursive handle_option calls. */
bool
c_common_handle_option (size_t scode, const char *arg, int value,
int kind, location_t loc,
const struct cl_option_handlers *handlers);
It is essentially a simple switch ladder over option settings enumerated by scode - which
is OPT_std_c__11 for option -std=c++11 - and leaves no doubt that it
puts an -std option setting into effect regardless of what setting was in effect previously. You can look at branches other than master
(gcc-{5|6|7}-branch) with the same conclusion.
It's not uncommon to find GCC build system scripts that rely on the validity of
overriding an option setting by appending a new setting. Legalistically, this
is usually counting on undocumented behaviour, but there's a better
chance of Russia joining NATO than of GCC ceasing to take the last setting that
it parses for a mutex option.

Should OCaml compilation with custom linking work in Windows (via MinGW)?

I want to compile an OCaml program interfacing with C code, using a MinGW-based GCC, and using separate compilation (GCC produces the .o, then ocamlopt produces the final executable).
It's not clear to me if (1) this should work on Windows and, if so, (2) which command-line arguments are necessary.
I'm using Jonathan Protzenko's OCaml on Windows installer to install OCaml 4.02.1 along with a Cygwin shell (note that it uses a native Windows OCaml compiler, not a Cygwin-based one). I installed gcc using Nuwen's MinGW (but had the same issue when using Strawberry Perl's gcc).
Here's my source code:
C file (tc.c):
#include <stdio.h>
#include "caml/mlvalues.h"
value print(value unused) {
printf("hello from C\n");
return Val_unit;
}
OCaml file (t.ml):
external print : unit -> unit = "print"
let () =
Printf.printf "platform: %s\n" (Sys.os_type);
print ();
The following works just fine:
and@win7 $ ocamlopt t.ml tc.c -o t.exe
and@win7 $ ./t.exe
platform: Win32
hello from C
However, if I use a .o instead of a .c, it doesn't work:
and@win7 $ gcc tc.c -c -I c:/OCaml/lib -o tc.o
and@win7 $ ocamlopt t.ml tc.o -o t.exe
** Cannot resolve symbols for tc.o:
puts
** Fatal error: Unsupported relocation kind 0004 for puts in tc.o
File "caml_startup", line 1:
Error: Error during linking
Both versions work fine on Linux.
I wonder if it's just some silly mistake that I can quickly fix by giving the right arguments to gcc/ocamlc/ocamlopt, or if it's a current limitation of OCaml's native compilation on Windows.
Edit: camlspotter identified the cause, so in retrospect, I did not need Nuwen's MinGW at all. OCaml on Windows already includes a MinGW-based C compiler, except that it is called i686-w64-mingw32-gcc and not gcc.
You are probably using the wrong C compiler, or the right one without the appropriate options. The best approach is to use the same C compiler and options that were used to build OCaml. You can check them with ocamlc -config:
$ ocamlc -config
version: 4.02.3
standard_library_default: C:/ocamlmgw64/lib
standard_library: C:/ocamlmgw64/lib
standard_runtime: ocamlrun
ccomp_type: cc
bytecomp_c_compiler: x86_64-w64-mingw32-gcc -O -mms-bitfields -Wall -Wno-unused
bytecomp_c_libraries: -lws2_32
native_c_compiler: x86_64-w64-mingw32-gcc -O -mms-bitfields -Wall -Wno-unused
native_c_libraries: -lws2_32
native_pack_linker: x86_64-w64-mingw32-ld -r -o
ranlib: x86_64-w64-mingw32-ranlib
...
For example, the above shows that my OCaml compiler was built in a Cygwin 32-bit environment with x86_64-w64-mingw32-gcc. The same applies to the linker and ranlib. Since you can compile C together with OCaml code using ocamlopt, the same C compiler must already be installed in your environment.
Building the OCaml compiler yourself, so that the same C compiler is used for both C and OCaml, may be the best way to avoid this sort of C compiler mismatch.

How to (cross-)compile to both ARM hard- and soft-float (softfp) with a single GCC (cross-)compiler?

I'd like to use a single (cross-)compiler to compile code for different ARM calling conventions: since I always want to use floating point and NEON instructions, I just want to select the hard-float calling convention or the soft-float (softfp) calling convention.
My compiler defaults to hard-float, but it supports both architectures that I need:
$ arm-linux-gnueabihf-gcc -print-multi-lib
.;
arm-linux-gnueabi;@marm@march=armv4t@mfloat-abi=soft
$
When I compile with the default parameters:
$ arm-linux-gnueabihf-g++ -Wall -o hello_world_armhf hello_world.cpp
It succeeds without any errors.
If I compile with the parameters returned by -print-multi-lib:
$ arm-linux-gnueabihf-g++ -marm -march=armv4t -mfloat-abi=soft -Wall -o hello_world hello_world.cpp
It again compiles without error (By the way, how can I test that the resultant code is hard- or soft-float?)
Unfortunately, if I try this:
$ arm-linux-gnueabihf-g++ -march=armv7-a -mthumb-interwork -mfloat-abi=softfp -mfpu=neon -Wall -o hello_world hello_world.cpp
[...]/gcc/bin/../lib/gcc/arm-linux-gnueabihf/4.7.3/../../../../arm-linux-gnueabihf/bin/ld: error: hello_world uses VFP register arguments, /tmp/ccwvfDJo.o does not
[...]/gcc/bin/../lib/gcc/arm-linux-gnueabihf/4.7.3/../../../../arm-linux-gnueabihf/bin/ld: failed to merge target specific data of file /tmp/ccwvfDJo.o
collect2: error: ld returned 1 exit status
$
I've tested some other permutations of the parameters, but it seems that anything other than the combination shown by -print-multi-lib results in an error.
I've read ARM compilation error, VFP registered used by executable, not object file but the problem there was that some parts of the binary were soft- and some were hard-float. I have a single C++ file to compile...
Which parameter(s) am I missing to be able to compile with -march=armv7-a -mthumb-interwork -mfloat-abi=softfp -mfpu=neon?
How is it possible that the error is about VFP register arguments while I explicitly have -mfloat-abi=softfp in the command line which prohibits VFP register arguments?
Thanks!
For the record, hello_world.cpp contains the following:
#include <iostream>
int main()
{
std::cout << "Hello, world!" << std::endl;
return 0;
}
You need another compiler with the corresponding multilib support.
You can check multilib support with the following command:
arm-none-eabi-gcc -print-multi-lib
.;
thumb;@mthumb
fpu;@mfloat-abi=hard
armv6-m;@mthumb@march=armv6s-m
armv7-m;@mthumb@march=armv7-m
armv7e-m;@mthumb@march=armv7e-m
armv7-ar/thumb;@mthumb@march=armv7
cortex-m7;@mthumb@mcpu=cortex-m7
armv7e-m/softfp;@mthumb@march=armv7e-m@mfloat-abi=softfp@mfpu=fpv4-sp-d16
armv7e-m/fpu;@mthumb@march=armv7e-m@mfloat-abi=hard@mfpu=fpv4-sp-d16
armv7-ar/thumb/softfp;@mthumb@march=armv7@mfloat-abi=softfp@mfpu=vfpv3-d16
armv7-ar/thumb/fpu;@mthumb@march=armv7@mfloat-abi=hard@mfpu=vfpv3-d16
cortex-m7/softfp/fpv5-sp-d16;@mthumb@mcpu=cortex-m7@mfloat-abi=softfp@mfpu=fpv5-sp-d16
cortex-m7/softfp/fpv5-d16;@mthumb@mcpu=cortex-m7@mfloat-abi=softfp@mfpu=fpv5-d16
cortex-m7/fpu/fpv5-sp-d16;@mthumb@mcpu=cortex-m7@mfloat-abi=hard@mfpu=fpv5-sp-d16
cortex-m7/fpu/fpv5-d16;@mthumb@mcpu=cortex-m7@mfloat-abi=hard@mfpu=fpv5-d16
https://stackoverflow.com/questions/37418986/how-to-interpret-the-output-of-gcc-print-multi-lib
How to interpret the output of gcc -print-multi-lib
With this configuration, gcc -mfloat-abi=hard will not only build your files using FPU instructions but also link them against the corresponding libraries, avoiding the "X uses VFP register arguments, Y does not" error.
The -print-multi-lib output above was produced by gcc built with this patch and the --with-multilib-list=armv6-m,armv7,armv7-m,armv7e-m,armv7-r,armv7-a,cortex-m7 configuration option.
If you are interested in building your own gcc with Cortex-A series multilib support, just use the --with-multilib-list=aprofile configuration option for any arm*-*-* target, without any patches (at least with gcc-6.2.0).
As per the Linaro FAQ, if your compiler prints arm-linux-gnueabi;@marm@march=armv4t@mfloat-abi=soft then you can only use -march=armv4t. If you want to use -march=armv7-a you need to build the compiler yourself.
The following link could be helpful for building it yourself: GCC ARM Builds

How to overrule default gcc options to the linker?

On my system when I compile something (with bfin-linux-uclibc-g++ but that is irrelevant), I get hundreds of warnings (not in my own code base) with respect to one of the compiler flags. I want to disable it.
fde encoding in src/SpiMessageUtil.o(.eh_frame) prevents .eh_frame_hdr table being created.
This originates from a default gcc flag which is handed over to the linker, which is easy to check by adding '-v' to the compilation step:
COLLECT_GCC_OPTIONS=... --eh-frame-hdr ...
I would like to get rid of this option, which is indeed by default defined:
bfin-linux-uclibc-g++ -dumpspecs | grep frame-hdr
%{!static:--eh-frame-hdr}\
%{mfdpic: -m elf32bfinfd -z text} %{shared} %{pie} \
%{static:-dn -Bstatic} %{shared:-G -Bdynamic} \
%{!shared: %{!static: %{rdynamic:-export-dynamic} \
%{!dynamic-linker:-dynamic-linker \
%{mglibc:%{muclibc:%e-mglibc and -muclibc used together;:%e-mglibc not supported for this target};:/lib/ld-uClibc.so.0 \
}}}\
%{static}} -init __init -fini __fini
How can I override this option? I cannot use -Wl,--no-eh-frame-hdr, because there is nothing like that defined.
You can dynamically dump GCC's specs, remove this switch from there and use it when linking, i.e.:
g++ -dumpspecs | sed -e 's,--eh-frame-hdr,,g' > better_specs
g++ -specs=better_specs -o target file1.o file2.o -llib1...
This would replace the specs inline, while keeping original compiler intact.
If you keep your own Makefiles, this could also be handled with something like:
$(TARGET): $(OBJS) | better_specs
$(LINK.o) $(OUTPUT_OPTION) -specs=$| $^
better_specs:
$(CXX) -dumpspecs | sed -e 's,--eh-frame-hdr,,g' > $@
This approach can also be used with configure scripts: provided that you generate better_specs beforehand, you can just use ./configure CXX='g++ -specs=/path/to/better_specs'.
I just got started with back-porting some code to an old system with a bfin controller and ran into the problem with these terribly annoying warnings - 1000s at a time. I didn't find a way to just deactivate the output.
But there are 2 "ways to go" that work:
Fix the source and rebuild the tool-chain:
Remove the code that creates the output in elf-eh-frame.c in the function _bfd_elf_discard_section_eh_frame:
(*info->callbacks->einfo)
(_("%P: fde encoding in %B(%A) prevents .eh_frame_hdr"
" table being created.\n"), abfd, sec);
Patch the ld binary
Take a look at the ld-Binary and patch the binary directly.
I dumped the data segment (.rodata) with objdump to find the address of the string. Then (after creating a disassembly with objdump) I searched where that string was used and replaced the call to the function that creates the output with two NoOps (0xFF 0xD3 -> 0x90 0x90).
Linker still creates the same output, but no more messages.

i386 macro predefined in make or gcc?

I've been attempting to make a folder for each architecture my code can support. In this folder are platform specific files to include. I include them as follows:
#define STR(x) #x
#define ASSTR(x) STR(x)
#include ASSTR(ARCHITECTURE/sizes.h)
My compilation line in make looks like this:
gcc -o $@ -c $< -DARCHITECTURE=i386
Which works, until I define ARCHITECTURE to be i386. When this happens, it looks for 1/sizes.h, so I assume it's already defined somewhere.
I believe the C preprocessor (cpp), which is called by gcc, defines i386 (for i386 systems). You can find out what it defines like so:
touch foo.h; cpp -dM foo.h; rm foo.h
This method is described by the cpp man page, under -d, with the character M (so, -dM):
Instead of the normal output, generate a list of #define directives for all the macros defined during the execution of the preprocessor, including predefined macros. This gives you a way of finding out what is predefined in your version of the preprocessor. Assuming you have no file foo.h, the command
touch foo.h; cpp -dM foo.h
will show all the predefined macros.

Resources