Inline functions and link time optimizations - gcc

I am trying to study the effects of inlining and link-time optimization currently I am trying to link two files with one of them having an explicit inline function call
below are the files:
test.c
void show(int *arr,int n ){
for(int i=0;i<n;i++)
printf("%d",arr[i]);
}
int main(){
int arr[10]={0};
foo(arr);
}
test1.c
inline void foo(int *arr){
show(arr,10);
}
so I am trying to compile them as
gcc -c -O2 -flto test.c
gcc -c -O2 -flto test1.c
gcc -o myprog -flto -O2 test1.o test.o
but this still gives a linker error as in undefined reference to foo in main
Did I compile it correctly? Is gcc failing to inline?
Any help would be appreciated

From C11 6.7.4p7 emphasis mine:
Any function with internal linkage can be an inline function. For a
function with external linkage, the following restrictions apply: If a
function is declared with an inline function specifier, then it shall
also be defined in the same translation unit. If all of the file scope
declarations for a function in a translation unit include the inline
function specifier without extern, then the definition in that
translation unit is an inline definition. An inline definition does
not provide an external definition for the function, and does not
forbid an external definition in another translation unit. An inline
definition provides an alternative to an external definition, which a
translator may use to implement any call to the function in the same
translation unit. It is unspecified whether a call to the function
uses the inline definition or the external definition
Your code is does not provide a definition of foo with external linkage and compiler is right - there is no foo. The foo in test1.c is an inline function, it doesn't provide a function with external definition.
Did I compile it correctly?
Well, yes.
Is gcc failing to inline?
The compilation failed, so yes.
The keyword inline may serve as a hint for the compiler to maybe inline the function. There is no requirement that compiler will do that. inline is a misleading keyword - it serves primarily to modify linkage of objects, so to let the compiler choose between inline and non-inline versions of the same function available within the same transaction unit. It does not mean that the function will be inlined.
If you are using LTO, just drop the inline, there is no point in hinting the compiler - trust gcc it will do a better job optimizing then you will, with LTO it "sees" all the functions in single transaction unit anyway. Also read gcc docs on inline and remember about the rules of optimization.

Related

What is the VC++ compiler option /Oi equivalent for gcc?

/Oi allows the VC++ compiler to use intrinsics. What is the equivalent in gcc 9 or 10?
Would gcc -O3 compiler option enable the use of intrinsics?
What you are searching is Built-in Functions.
https://gcc.gnu.org/onlinedocs/gcc-4.9.2/gcc/Other-Builtins.html
Builtins are enabled by default :
-fno-builtin
-fno-builtin-function
Don’t recognize built-in functions that do not begin with ‘__builtin_’ as prefix. See Other built-in functions provided by GCC, for details of the functions affected, including those which are not built-in functions when -ansi or -std options for strict ISO C conformance are used because they do not have an ISO standard meaning.
GCC normally generates special code to handle certain built-in functions more efficiently; for instance, calls to alloca may become single instructions which adjust the stack directly, and calls to memcpy may become inline copy loops. The resulting code is often both smaller and faster, but since the function calls no longer appear as such, you cannot set a breakpoint on those calls, nor can you change the behavior of the functions by linking with a different library. In addition, when a function is recognized as a built-in function, GCC may use information about that function to warn about problems with calls to that function, or to generate more efficient code, even if the resulting code still contains calls to that function. For example, warnings are given with -Wformat for bad calls to printf when printf is built in and strlen is known not to modify global memory.
With the -fno-builtin-function option only the built-in function function is disabled. function must not begin with ‘__builtin_’. If a function is named that is not built-in in this version of GCC, this option is ignored. There is no corresponding -fbuiltin-function option; if you wish to enable built-in functions selectively when using -fno-builtin or -ffreestanding, you may define macros such as:
#define abs(n) __builtin_abs ((n))
#define strcpy(d, s) __builtin_strcpy ((d), (s))
https://gcc.gnu.org/onlinedocs/gcc/C-Dialect-Options.html#C-Dialect-Options
When you see the optimisations which turn on with O3, there is no mention of builtin. https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html

Creating and linking static rust library and link to c

I tried to create a rust library that is callable by a c program, so far i managed to create a dynamic library and call it (library created using rustc --crate-type=cdylib src/lib.rs -o libCustomlib.so, linked using gcc main.o -lCustomlib).
When i now take the same code but compile it as a static library (rustc --crate-type=staticlib src/lib.rs -o libCustomlib.a) gcc throws errors when linking (using gcc main.o -L. -l:libCustomlib.a)
the errors are all undefined references to various functions
first few lines:
/usr/bin/ld: ./libCustomlib.a(std-b1b61f01951b016b.std.5rqysbiy-cgu.2.rcgu.o): in function `std::sys::unix::mutex::Mutex::init':
/usr/src/rustc-1.43.0//src/libstd/sys/unix/mutex.rs:46: undefined reference to `pthread_mutexattr_init'
/usr/bin/ld: /usr/src/rustc-1.43.0//src/libstd/sys/unix/mutex.rs:48: undefined reference to `pthread_mutexattr_settype'
/usr/bin/ld: /usr/src/rustc-1.43.0//src/libstd/sys/unix/mutex.rs:52: undefined reference to `pthread_mutexattr_destroy'
full error is over 100 lines long but the lines are all of this form
the lib.rs currently only has one test helloWorld function:
#[no_mangle]
pub extern "C" fn fn_test() {
println!("Hello, world!");
}
with the header file included in the caller part being:
extern void fn_test();
The question is, is my error at creating the static library or at linking it? Or lies the problem somewhere else and it should not work with static libraries? Should i just use the dynamic approach (which i would like to avoid since static ones feel more like using multiple languages in one exe since you don't have to distribute the library)?
(disclaimer: for everyone asking why i would do something like that without a good reason: it's a fun project, the entire program should be as overcomplicated as possible and that's the reason why i want to use different languages)
On Linux, std dynamically links to pthreads and libdl. You need to link these in as well to create the executable:
gcc main.o libCustomlib.a -lpthread -ldl
The result is a binary that links dynamically to a handful of fundamental libraries, but statically to Customlib.
If you want a purely statically linked binary, you will probably need to use no_std and enable only the specific features of core that do not depend on dynamically linked system libraries. (Certain libraries cannot be statically linked on Linux; read Statically linking system libraries, libc, pthreads, to aid in debugging) Just for a toy program like hello, world you may get away with simply passing -static to gcc, but for anything robust it's better to dynamically link these fundamental libraries.

linker - use own stdlib implementation

I have a problem. Requirement for the project is that we cannot link our app with standard library ( so -nostdlib is on in gcc).
my_stdlib.c contains implementation of all functions my_memset, my_memcpy ... but linker needs memcpy to copy structs
MyStruct struct = my_struct;
and is complaining about "undefined reference to `memcpy'", which is of course correct.
Is it possible to remap memcpy to my_memcpy using linker script, parameters passed to ld or other way, so linker can use our implementation to copy structs?
Probably -wrap,function could help but I cannot change my_memcpy to __wrap_memcpy.
At the GCC level, you can redirect the memcpy symbol to a different symbol using:
void *memcpy (void *, const void *, size_t) __asm__ ("my_memcpy");
This will apply to internally-generated memcpy calls, too. (With GCC. I think it does not change the internal call sites with Clang.)
compile with -fno-builtin. This should avoid it.

Why does gcc4.9 not display warning message -std=c++1y while using C++14 feature?

I installed gcc4.9 using the steps mentioned in the SO post here. I was using the latest feature std::exchange() utility function which is introduced in C++14.
#include<list>
#include<utility>
int main() {
std::list<int> lin{5,6,7,8,9};
auto lout = std::exchange(lin, {1,2,3,4});
return 0;
}
I performed following steps to compile the above sample program and got the following compilation error. After sometime I realized that (as there is no warning/hint by compiler message) this feature has been added in the C++14 standard so I need to use -std=c++1y here.
$g++ -std=c++11 main.cpp
main.cpp: In function ‘int main()’:
main.cpp:5:14: error: ‘exchange’ is not a member of ‘std’
auto lout = std::exchange(lin, {1,2,3,4});
^
If we use the C++11 standard feature and does not provide -std=c++11, then GCC gives warning message/hint that your program is using the feature which is introduced in the C++11 as below:
main.cpp:4:21: warning: extended initializer lists only available with
-std=c++11 or -std=gnu++11
std::list<int> lin{5,6,7,8,9};
This message is great and lets the user distinguish between the actual compilation error message and not including -std=c++11 option.
However while using gcc4.9 for C++1y feature under -std=c++11, there is no such warning message/hint? I wanted to know what could be the possible reason for this?.
The error/warning about "extended initializer lists" is emitted by the C++ parser. The C++
parser apparently knows how to parse that syntactic construct, understands it and can
provide a sensible error/warning message.
With the function, the situation is a little bit different. The GCC proper does not
contain knowledge about each and every standard function. For some functions it does, but
for most functions it doesn't.
From the compiler proper point of view, std::exchange is just an unknown identifier, the compiler does not contain special knowledge about the standard function std::exchange, and, hence, treats it as any other unknown identifier.

Getting GCC to compile without inserting call to memcpy

I'm currently using GCC 4.5.3, compiled for PowerPC 440, and am compiling some code that doesn't require libc. I don't have any direct calls to memcpy(), but the compiler seems to be inserting one during the build.
There are linker options like -nostdlib, -nostartfiles, -nodefaultlibs but I'm unable to use them as I'm not doing the linking phase. I'm only compiling. With something like this:
$ powerpc-440-eabi-gcc -O2 -g -c -o output.o input.c
If I check the output.o with nm, I see a reference to memcpy:
$ powerpc-440-eabi-nm output.o | grep memcpy
U memcpy
$
The GCC man page makes it clear how to remove calls to memcpy and other libc calls with the linker, but I don't want the compiler to insert them in the first place, as I'm using a completely different linker (not GNU's ld, and it doesn't know about libc).
Thanks for any help you can provide.
There is no need to -fno-builtins or -ffreestanding as they will unnecessarily disable many important optimizations
This is actually "optimized" by gcc's tree-loop-distribute-patterns, so to disable the unwanted behavior while keeping the useful builtin capabilities, you can just use:
-fno-tree-loop-distribute-patterns
Musl-libc uses this flag for its build and has the following note in their configure script (I looked through the source and didn't find any macros, so this should be enough)
# Check for options that may be needed to prevent the compiler from
# generating self-referential versions of memcpy,, memmove, memcmp,
# and memset. Really, we should add a check to determine if this
# option is sufficient, and if not, add a macro to cripple these
# functions with volatile...
# tryflag CFLAGS_MEMOPS -fno-tree-loop-distribute-patterns
You can also add this as an attribute to individual functions in gcc using its optimize attribute, so that other functions can benefit from calling mem*()
__attribute__((optimize("no-tree-loop-distribute-patterns")))
size_t strlen(const char *s){ //without attribute, gcc compiles to jmp strlen
size_t i = -1ull;
do { ++i; } while (s[i]);
return i;
}
Alternatively, (at least for now) you may add a confounding null asm statement into your loop to thwart the pattern recognition.
size_t strlen(const char *s){
size_t i = -1ull;
do {
++i;
asm("");
} while (s[i]) ;
return i;
}
Gcc emits call to memcpy in some circumstance, for example if you are copying a structure.
There is no way to change GCC behaviour but you can try to avoid this by modifying your code to avoid such copy. Best bet is to look at the assembly to figure out why gcc emitted the memcpy and try to work around it. This is going to be annoying though, since you basically need to understand how gcc works.
Extract from http://gcc.gnu.org/onlinedocs/gcc/Standards.html:
Most of the compiler support routines used by GCC are present in libgcc, but there are a few exceptions. GCC requires the freestanding environment provide memcpy, memmove, memset and memcmp. Finally, if __builtin_trap is used, and the target does not implement the trap pattern, then GCC will emit a call to abort.
You need to disable a that optimization with -fno-builtin. I had this problem once when trying to compile memcpy for a C library. It called itself. Oops!
You can also make your binary a "freestanding" one:
The ISO C standard defines (in clause 4) two classes of conforming implementation. A conforming hosted implementation supports the whole standard [...]; a conforming freestanding implementation is only required to provide certain library facilities: those in , , , and ; since AMD1, also those in ; and in C99, also those in and . [...].
The standard also defines two environments for programs, a freestanding environment, required of all implementations and which may not have library facilities beyond those required of freestanding implementations, where the handling of program startup and termination are implementation-defined, and a hosted environment, which is not required, in which all the library facilities are provided and startup is through a function int main (void) or int main (int, char *[]).
An OS kernel would be a freestanding environment; a program using the facilities of an operating system would normally be in a hosted implementation.
(paragraph added by me)
More here. And the corresponding gcc option/s (keywords -ffreestanding or -fno-builtin) can be found here.
This is quite an old question, but I've hit the same issue, and none of the solutions here worked.
So I defined this function:
static __attribute__((always_inline)) inline void* imemcpy (void *dest, const void *src, size_t len) {
char *d = dest;
const char *s = src;
while (len--)
*d++ = *s++;
return dest;
}
And then used it instead of memcpy. This has solved the inlining issue for me permanently. Not very useful if you are compiling some sort of library though.

Resources