How can I link __sync_bool_compare_and_swap_16? - gcc

I've got some code that uses the gcc builtin __sync_bool_compare_and_swap, which is mapped to __sync_bool_compare_and_swap_16 at link time. But when I link this code I get an "undefined reference to `__sync_bool_compare_and_swap_16'" linker error. What do I have to link?
[EDIT]: I got it: I have to compile it with -march=x86-64. Interestingly, this doesn't lead to intrinsic compilation, i.e. the compiler doesn't insert the atomic operations inline; the code is just the same, with the call to __sync_bool_compare_and_swap_16, but without a linker error. Does anyone understand this?

On my system, passing -march=native to gcc solved the problem. I suspect (but am not sure) that this is because the 128-bit bool_compare_and_swap doesn't exist on all 64-bit cores (https://cbloomrants.blogspot.com/2010/05/05-29-10-lock-free-in-x64.html), and so -march=x86-64 (which is what my gcc defaults to) doesn't allow it.
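A minimal sketch of how to check this from the code itself, assuming GCC (or a compatible compiler) that defines the predefined macro __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16 when it can inline the 16-byte operation; on x86-64 that normally requires -mcx16, or a -march value that implies it. The function names cas64, cas128 and cas_demo are made up for illustration.

```c
#include <stdint.h>

/* 8-byte CAS: inlined on ordinary 64-bit targets without extra flags. */
static int cas64(uint64_t *p, uint64_t expected, uint64_t desired)
{
    return __sync_bool_compare_and_swap(p, expected, desired);
}

#ifdef __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16
/* 16-byte CAS: only compiled when the compiler says it can inline it.
 * Without this guard, the builtin may become a call to
 * __sync_bool_compare_and_swap_16, which nothing provides at link time. */
static int cas128(unsigned __int128 *p,
                  unsigned __int128 expected,
                  unsigned __int128 desired)
{
    return __sync_bool_compare_and_swap(p, expected, desired);
}
#endif

/* Returns 1 if every CAS behaved as expected, 0 otherwise. */
int cas_demo(void)
{
    uint64_t v = 5;
    int ok = cas64(&v, 5, 9)      /* succeeds: v was 5, becomes 9 */
          && v == 9
          && !cas64(&v, 5, 1);    /* fails: v is no longer 5 */

#ifdef __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16
    unsigned __int128 w = 1;
    ok = ok && cas128(&w, 1, 2) && w == 2;
#endif
    return ok;
}
```

Built without -mcx16 this only exercises the 8-byte path; built with something like gcc -mcx16 it also exercises the 16-byte path.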

Related

"undefined reference to memcpy" during u-boot-spl build. How can I use the architecture-assisted memcpy?

When I build u-boot-spl for our board, I see these link errors. (u-boot version v2021.10, commit 50c84208ad, Tom Rini, Oct 4 2021)
u-boot/common/spl/spl.c:669: undefined reference to `memcpy'
u-boot/common/spl/spl.c:684: undefined reference to `mem_malloc_init'
...
But arch/arm/Kconfig says (my board is ARM64)
config USE_ARCH_MEMCPY
	bool "Use an assembly optimized implementation of memcpy"
	default y if !ARM64
	depends on !ARM64 || (ARM64 && (GCC_VERSION >= 90400))
	help
	  Enable the generation of an optimized version of memcpy.
	  Such an implementation may be faster under some conditions
	  but may increase the binary size.
So if ARM64 is set and the GCC version is 9.4 or later, USE_ARCH_MEMCPY should be turned on.
And in my case, I can check in include/config/auto.conf that both conditions are true.
CONFIG_ARM64=y
CONFIG_GCC_VERSION=100201
But USE_ARCH_MEMCPY doesn't appear in include/config/auto.conf (is this normal?).
Anyway, I think this CONFIG_USE_ARCH_MEMCPY should be y. Why does it give me this memcpy undefined error?
I checked in lib/Makefile, I see
obj-y += string.o
which is unconditional and this string.c contains memcpy function. But of course this function is enclosed by #ifndef __HAVE_ARCH_MEMCPY, so it's not what I want anyway.
Is there any option I should turn on to make use of this hardware assisted memcpy?
ADD : I tried adding CONFIG_LTO but it didn't work.
I searched the Kconfig files and found CONFIG_USE_ARCH_MEMCPY, CONFIG_SPL_USE_ARCH_MEMCPY, CONFIG_USE_ARCH_MEMSET, CONFIG_SPL_USE_ARCH_MEMSET, etc., so I selected those configs for my board. Those errors are gone (with CONFIG_LTO, link-time optimization, set so that stdlib is not used). I still have more 'undefined' errors for strncmp, timer_init, puts, hang, mem_malloc_init, etc. I'm not sure whether I can fix them with similar methods, or whether this approach is the correct, advisable one. And if I prefer the software routine, what should I do?
Waiting for a better answer.
ADD (2021. 11. 29) :
Ovidiu Panait from the u-boot mailing list told me I can remove some of the 'undefined symbol' errors by setting CONFIG_SPL_LIBGENERIC_SUPPORT and CONFIG_SPL_LIBCOMMON_SUPPORT to 'y'. After setting these, I found the memcpy and memset link errors were also gone, without setting the USE_ARCH_xxx or SPL_USE_ARCH_xxx configs.

Manual linking to Windows library in LLVM

I'm writing my own programming language, and am wanting to compile it to native binaries by compiling to LLVM IR, then letting the rest of the LLVM toolchain take over. Eventually, I will target multiple platforms, but for now I'm just focusing on Windows.
For starters, I have the empty program compiling, which implies that in general my toolchain is set up and working, and I get a no-op executable out of it. The next logical step is doing "Hello World", but after looking at the LLVM IR output of clang for the C program that simply calls puts("Hello World!"), it looks like a slightly easier first step is to simply _exit();. When reviewing the clang output of that C program, it looks like the relevant line is call void @_exit(i32 0). I've distilled it down to what I think is the bare minimum program which calls exit:
define i64* @main() {
%1 = alloca i32, align 4
store i32 0, i32* %1, align 4
call void @_exit(i32 0)
unreachable
}
declare dso_local void @_exit(i32)
When trying to run the equivalent C program directly, of course it works when I use clang directly, but the steps after creation of the LLVM IR are opaque to me, and I believe I'm using the wrong linker options or something, as I get lld-link: error: undefined symbol: _exit during the lld-link step. (Actually, this also occurs when I try to manually link the output of clang -S -emit-llvm, so I have no reason to believe my IR is the problem.) The current invocation I'm using for lld-link is lld-link /out:"exit.exe" /entry:main exit.obj. I've tried playing around with adding various flavors of the /defaultlib switch, including manually linking to libcmt and libucrt, which I do believe contain _exit after looking through the symbols with dumpbin, but that doesn't seem to help. Looking at the IR output of the clang program, there doesn't seem to be any particular reference to <stdlib.h>, so I guess that information is lost after the IR generation stage, and I don't think I'm missing anything in my IR.
This appears to be a general Windows linker problem, rather than anything to do with LLVM, since if I do link /out:exit.exe /entry:main exit.obj I get basically the same error.
Anyways, there's clearly some step here during the linking that I don't understand, around how to find the actual library that a given external call lives in, so if anyone could point me in the right direction of how to figure this out for any given C runtime call, that would be great. Particularly in this case, I guess I need to find the library which contains the _exit function. Thanks!
Turns out the libcmt has been replaced. The replacement is ucrt, and so doing /defaultlib:ucrt seems to fix the problem!

Do I have to include some header files in order to use GCC built_in function?

The motivation is that I want to tell the compiler that my float *U array is 64-byte aligned so that the compiler can vectorize the loops.
With the Intel compiler, I can use __assume_aligned(U,64);. I googled and found that to do the same thing with GCC, I have to define another pointer, float *U_tmp=__builtin_assume_aligned(U,64), and use U_tmp. However, when compiling with GCC, the compiler reports
"error: ‘__builtin_assume_aligned’ was not declared in this scope"
I don't know if I have missed some libraries or header files containing this GCC built in function.
This is supposed to work out of the box, without any additional headers. However, it was only added in GCC 4.7; maybe your compiler is older than that?
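As a rough sketch of the pattern described in the question, assuming GCC 4.7 or later and a buffer that really is 64-byte aligned (here obtained from C11 aligned_alloc). The function names sum_aligned and demo_sum are hypothetical; only __builtin_assume_aligned comes from the thread.

```c
#include <stddef.h>
#include <stdlib.h>

/* Sums n floats. The caller promises that U is 64-byte aligned;
 * the builtin needs no header, it is provided by the compiler itself. */
float sum_aligned(const float *U, size_t n)
{
    /* The returned pointer carries the alignment promise; use it, not U. */
    const float *U_tmp = __builtin_assume_aligned(U, 64);
    float s = 0.0f;
    for (size_t i = 0; i < n; i++)
        s += U_tmp[i];
    return s;
}

/* Allocates a 64-byte aligned array of ones and sums it. */
float demo_sum(void)
{
    size_t n = 64;                                    /* 256 bytes, a multiple of 64 */
    float *U = aligned_alloc(64, n * sizeof(float));
    if (U == NULL)
        return -1.0f;                                 /* allocation failed */
    for (size_t i = 0; i < n; i++)
        U[i] = 1.0f;
    float s = sum_aligned(U, n);
    free(U);
    return s;
}
```

The builtin returns void *, so when the same code is compiled as C++ the result needs an explicit cast to float *. Passing a pointer that is not actually aligned as promised is undefined behavior.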

Static library "interface"

Is there any way to tell the compiler (gcc/mingw32) when building an object file (lib*.o) to only expose certain functions from the .c file?
The reason I want to do this is that I am statically linking to a 100,000+ line library (SQLite), but am only using a select few of the functions it offers. I am hoping that if I can tell the compiler to only expose those functions, it will optimize out all the code of the functions that are never needed for those few I selected, thus drastically decreasing the size of the library.
I found several possible solutions:
This is what I asked about. It is the gcc equivalent of Windows' dllexport:
http://gcc.gnu.org/onlinedocs/gcc-4.6.1/gcc/Code-Gen-Options.html (-fvisibility)
http://gcc.gnu.org/wiki/Visibility
I also discovered link-time code-generation. This allows the linker to see what parts of the code are actually used and get rid of the rest. Using this together with strip and -fwhole-program has given me drastically better results.
http://gcc.gnu.org/onlinedocs/gcc-4.6.1/gcc/Optimize-Options.html (see -flto and -fwhole-program)
Note: This flag only makes sense if you are not compiling the whole program in one call to gcc, which is what I was doing (making a sqlite.o file and then statically linking it in).
The third option which I found but have not yet looked into is mentioned here:
How to remove unused C/C++ symbols with GCC and ld?
That's probably the linker's job, not the compiler's. When linking that as a program (.exe), the linker will take care of only importing the relevant symbols, and when linking a DLL, the __declspec(dllexport) mechanism is probably what you are looking for, or some flags of ld can help you (man ld).
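The -fvisibility approach from the first option above might look roughly like this. It is a sketch that assumes an ELF target such as Linux; on MinGW/PE targets GCC may ignore the visibility attribute, so there this is illustrative only. The macro and function names (API_EXPORT, API_LOCAL, internal_scale, mylib_compute) are invented for the example.

```c
/* Intended to be compiled with -fvisibility=hidden, so that everything is
 * hidden unless marked otherwise. The explicit attributes below make the
 * intent visible even without that flag. */
#define API_EXPORT __attribute__((visibility("default")))
#define API_LOCAL  __attribute__((visibility("hidden")))

/* Usable from other object files of the same library,
 * but not exported from the final shared object. */
API_LOCAL int internal_scale(int x)
{
    return x * 10;
}

/* The only symbol meant to form the public interface. */
API_EXPORT int mylib_compute(int x)
{
    return internal_scale(x) + 1;
}
```

Visibility controls what a shared object exports. For a purely static link it does not by itself remove unused code, which is why the link-time options mentioned above gave better results.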

Is there a way to strip all functions from an object file that I am not using?

I am trying to save space in my executable and I noticed that several functions are being added into my object files, even though I never call them (the code is from a library).
Is there a way to tell gcc to remove these functions automatically or do I need to remove them manually?
If you are compiling into object files (not executables), then a compiler will never remove any non-static functions, since it's always possible you will link the object file against another object file that will call that function. So your first step should be declaring as many functions as possible static.
Secondly, the only way for a compiler to remove any unused functions would be to statically link your executable. In that case, there is at least the possibility that a program might come along and figure out what functions are used and which ones are not used.
The catch is, I don't believe that gcc actually does this type of cross-module optimization. Your best bet is the -Os flag to optimize for code size, but even then, if you have an object file abc.o which has some unused non-static functions and you link statically against some executable def.exe, I don't believe that gcc will go and strip out the code for the unused functions.
If you truly desperately need this to be done, I think you might have to actually #include the files together so that after the preprocessor pass, it results in a single .c file being compiled. With gcc compiling a single monstrous jumbo source file, you stand the best chance of unused functions being eliminated.
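A small sketch of the "declare as much as possible static" advice, with hypothetical function names. It also notes, as an assumption about your toolchain, the commonly used GNU options -ffunction-sections (compile) and -Wl,--gc-sections (link), which let the linker discard unreferenced non-static functions.

```c
/* Internal linkage: the compiler can see every caller in this file. */
static int helper_used(int x)
{
    return x * 2;
}

/* Internal linkage and never called: when optimizing, the compiler is
 * free to drop it from the object file. The attribute only silences
 * the unused-function warning. */
static int helper_unused(int x) __attribute__((unused));
static int helper_unused(int x)
{
    return x * 3;
}

/* External linkage and never called here: the compiler must keep it,
 * because another object file might reference it. Removing it is up to
 * the linker, e.g. with -ffunction-sections plus -Wl,--gc-sections. */
int exported_unused(int x)
{
    return x - 1;
}

/* The one function the rest of the program actually calls. */
int api_entry(int x)
{
    return helper_used(x) + 1;
}
```

Comparing the output of nm on the object file with and without optimization is a quick way to see which of these functions actually survive.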
Have you looked into calling gcc with -Os (optimize for size)? I'm not sure if it strips unreached code, but it would be simple enough to test. You could also, after getting your executable back, 'strip' it. I'm sure there's a gcc command-line arg to do the same thing - is it --dead_strip?
In addition to -Os to optimize for size, this link may be of help.
Since I asked this question, GCC 4.5 was released which includes an option to combine all files so it looks like it is just 1 gigantic source file. Using that option, it is possible to easily strip out the unused functions.
More details here
IIRC the linker by default does what you want in some specific cases. The short of it is that library files contain a bunch of object files and only referenced files are linked in. If you can figure out how to get GCC to emit each function into its own object file and then build this into a library, you should get what you are looking for.
I only know of one compiler that can actually do this: here (look at the -lib flag)
