How effective is g++/gcc at unrolling recursive inline functions?

I have a recursive, but not tail-recursive, inline function for which I'd like gcc to unroll the recursion. Yes, I'm using g++ -O3 -funroll-loops, of course.
inline void recurse_fun(..., unsigned depth = 0, unsigned max_depth = 40) {
if (++depth > max_depth) return;
for (auto i = ..., iend = ...; i != iend; i++) {
if (...) continue;
...
recurse_fun(...,depth,max_depth);
}
}
I could easily replace this by handling a stack<...> object manually, which gcc should unroll properly, but it would not be quite as elegant or maintainable.
I should really try profiling both versions regardless, but I'm curious if anyone can say with confidence that some recent gcc version would or would not handle this correctly.
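The manual-stack alternative mentioned above could look roughly like the following minimal sketch. The Node type, its children vector, and both counting functions are hypothetical stand-ins for the elided parts of the question's code; only the depth limit mirrors the original.

```cpp
#include <cstddef>
#include <stack>
#include <utility>
#include <vector>

// Hypothetical tree node standing in for whatever the real code iterates over.
struct Node {
    std::vector<Node> children;
};

// Recursive form with the same shape as recurse_fun:
// depth-limited and not tail recursive.
inline std::size_t count_recursive(const Node& n, unsigned depth = 0,
                                   unsigned max_depth = 40) {
    if (++depth > max_depth) return 0;
    std::size_t visited = 1;
    for (const Node& child : n.children)
        visited += count_recursive(child, depth, max_depth);
    return visited;
}

// Same traversal with an explicit stack of (node, depth) pairs.
// Children are visited in reverse order, which does not affect the count.
inline std::size_t count_iterative(const Node& root, unsigned max_depth = 40) {
    std::size_t visited = 0;
    std::stack<std::pair<const Node*, unsigned>> pending;
    pending.emplace(&root, 1u);
    while (!pending.empty()) {
        const Node* node = pending.top().first;
        const unsigned depth = pending.top().second;
        pending.pop();
        if (depth > max_depth) continue;
        ++visited;
        for (const Node& child : node->children)
            pending.emplace(&child, depth + 1);
    }
    return visited;
}
```

Profiling both forms on real data, as the question already suggests, is the only reliable way to tell which one the compiler handles better.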

GCC (at least recent versions like 4.5 or 4.6) does unroll some tail recursive calls.
Of course you need to ask it to optimize (so -O2 or -O3 is required).
To understand what it is doing, you can:
Ask for the assembly output with something like gcc -O3 -fverbose-asm -S yoursource.c
Ask for various dump files, like gcc -c -fdump-tree-all -fdump-ipa-all -O3 yoursource.c (and there are other dump files)
Beware that GCC will write a lot (hundreds!) of dump files. The dump files exist only to help GCC developers or GCC plugin developers (or GCC MELT developers). Don't expect them to keep the same format from one release of GCC to the next.
The numbering of the dump files is useless: it is not chronological or logical.
And the dump options are likely to change in the next GCC release (4.7, probably in 2012).
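As a concrete input for the commands above, here is a minimal sketch of a tail-recursive function next to a non-tail-recursive one. Both function names are made up for illustration. Whether a given GCC version turns either of them into a loop is exactly what the -S output would show; nothing here guarantees it.

```cpp
// Tail recursive: the recursive call is the last thing the function does,
// so an optimizer is allowed to reuse the current stack frame.
unsigned long sum_to_tail(unsigned n, unsigned long acc = 0) {
    if (n == 0) return acc;
    return sum_to_tail(n - 1, acc + n);
}

// Not tail recursive: the addition happens after the recursive call returns.
unsigned long sum_to_plain(unsigned n) {
    if (n == 0) return 0;
    return n + sum_to_plain(n - 1);
}
```

Compiling this with -O2 -S and searching the assembly for a call from each function to itself is one way to see what the compiler did.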

Related

How to prevent GCC from inserting memset during link-time optimization?

While developing bare-metal firmware in C for an RV32IM target (RISC-V), I encountered a linking error when LTO is enabled:
/home/duranda/riscv/lib/gcc/riscv64-unknown-elf/10.2.0/../../../../riscv64-unknown-elf/bin/ld: /tmp/firmware.elf.5cZNyC.ltrans0.ltrans.o: in function `.L0 ':
/home/duranda/whatever/firmware.c:493: undefined reference to `memset'
There is, however, no call to memset in my firmware. The memset is inserted by GCC during optimization as described here. The build is optimized for size using GCC -Os and the -flto -fuse-linker-plugin flags. In addition, the -fno-builtin-memset -nostdinc -fno-tree-loop-distribute-patterns -nostdlib -ffreestanding flags are used to prevent the use of memset during optimization and to avoid including the standard libraries.
How to prevent memset insertion during LTO? Note that the firmware should not be linked against libc. I also tried providing a custom implementation of memset but the linker does not want to use it for memset inserted during optimization (still throws undefined reference).
I hit a similar issue several years ago and tried to fix it, but it turned out I had misunderstood the meaning of -fno-builtin[1]: -fno-builtin does not guarantee that GCC won't call memcpy, memmove or memset implicitly.
I guess the simplest solution is: DO NOT compile your libc.c with -flto. In other words, compile libc.c with -fno-lto.
Here is my guess about what happens. I don't know how to reproduce what you see, so it might be incorrect:
During the first phase of LTO, GCC collects every symbol used in the program.
It then asks the linker to provide the files that define those symbols, and discards any unused symbol.
Then it reads those files back into GCC and optimizes again. At this point GCC uses some built-in functions for optimization or code generation, but they were not pulled in earlier.
The symbol reference is created at the LTO stage, which is too late to pull in any symbol in the current GCC LTO flow. In this case, memset was already discarded at an earlier stage.
So you might ask why compiling libc.c with -fno-lto works. If the file is not involved in the LTO flow, its symbols cannot be discarded by the LTO flow.
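A libc.c kept out of LTO could be as small as the sketch below. This is an assumption about what such a file might contain, not the asker's code. The function is named fw_memset (a hypothetical name) so the snippet can be compiled on a hosted system without clashing with the C library. In real firmware it would be exported as memset and built with -fno-lto, plus -fno-tree-loop-distribute-patterns so that the fill loop is not itself turned back into a memset call.

```cpp
#include <cstddef>

// Byte-wise fill with the same contract as memset: writes `count` copies of
// (unsigned char)value to `dest` and returns `dest`.
// Hypothetical name; a freestanding build would export this as `memset`.
extern "C" void *fw_memset(void *dest, int value, std::size_t count) {
    unsigned char *p = static_cast<unsigned char *>(dest);
    const unsigned char byte = static_cast<unsigned char>(value);
    for (std::size_t i = 0; i < count; ++i)
        p[i] = byte;
    return dest;
}
```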
Here is a sample program showing that gcc will call memset even if you compile with -fno-builtin; both aarch64 gcc and riscv gcc generate a function call to memset.
// $ riscv64-unknown-elf-gcc x.c -o - -O3 -S -fno-builtin
struct bar {
int a[100];
};
struct bar y;
void foo(){
struct bar x = {{0}};
y = x;
}
Here is the corresponding gcc source code[2] for this case.
[1] https://gcc.gnu.org/pipermail/gcc-patches/2014-August/397382.html
[2] https://github.com/riscv/riscv-gcc/blob/riscv-gcc-10.2.0/gcc/expr.c#L3143
I'm not sure -fno-builtin-* does what you think it does. If you use those flags, then GCC will try to call an external function. If you don't use those flags, GCC will just insert inline code instead of relying on the library.
So it would appear to me you should simply not use any -fno-builtin flags.

Passing multiple -std switches to g++

Is it safe to assume that running g++ with
g++ -std=c++98 -std=c++11 ...
will compile using C++11? I haven't found an explicit confirmation in the documentation, but I see the -O flags behave this way.
The GCC manual doesn't state that the
last of any mutually exclusive -std=... options specified takes effect. The first occurrence
or the last occurrence are the only alternatives. There are numerous
GCC flags that take mutually exclusive alternative values from a finite set - mutually
exclusive, at least modulo the language of a translation unit. Let's call them mutex options for short.
It is a seemingly random rarity for it to be documented that the last setting takes effect. It is
documented for the -O options as you've noted, and in general terms for mutually exclusive warning options, perhaps
others. It's never documented that the first of multiple settings takes effect, because
it's never true.
The documentation leans - with imperfect consistency - on the historical conventions
of command usage in Unix-like OSes. If a command accepts a mutex option
then the last occurrence of the option takes effect. If the command were - unusually -
to act only on the first occurrence of the option then it would be a bug for
the command to accept subsequent occurrences at all: it should give a usage error.
This is custom and practice. The custom facilitates scripting with tools that
respect it, e.g. a script can invoke a tool passing a default setting of some
mutex option but enable the user to override that setting via a parameter of the script,
whose value can simply be appended to the default invocation.
In the absence of official GCC documentation to the effect you want, you might get
reassurance by attempting to find any GCC mutex option for which it is not
the case that the last occurrence takes effect. Here's one stab:
I'll compile and link this program:
main.cpp
#include <cstdio>
#if __cplusplus >= 201103L
static const char * str = "C++11";
#else
static const char * str = "Not C++11";
#endif
int main()
{
printf("%s\n%d\n",str,str); // Format `%d` for `str` mismatch
return 0;
}
with the commandline:
g++ -std=c++98 -std=c++11 -m32 -m64 -O0 -O1 -g3 -g0 \
-Wformat -Wno-format -o wrong -o right main.cpp
which requests contradictory option pairs:
-std=c++98 -std=c++11: Conform to C++98. Conform to C++11.
-m32 -m64: Produce 32-bit code. Produce 64-bit code.
-O0 -O1: Do not optimise at all. Optimize to level 1.
-g3 -g0: Emit maximum debugging info. Emit no debugging info.
-Wformat -Wno-format: Sanity-check printf arguments. Don't sanity-check them.
-o wrong -o right: Output program wrong. Output program right.
It builds successfully with no diagnostics:
$ echo "[$(g++ -std=c++98 -std=c++11 -m32 -m64 -O0 -O1 -g3 -g0 \
-Wformat -Wno-format -o wrong -o right main.cpp 2>&1)]"
[]
It outputs no program wrong:
$ ./wrong
bash: ./wrong: No such file or directory
It does output a program right:
$ ./right
C++11
-1713064076
which tells us it was compiled to C++11, not C++98.
The bug exposed by the garbage -1713064076 was not diagnosed because
-Wno-format, not -Wformat, took effect.
It is a 64-bit, not 32-bit executable:
$ file right
right: ELF 64-bit LSB shared object, x86-64 ...
It was optimized -O1, not -O0, because:
$ echo "[$(nm -C right | grep str)]"
[]
shows that the local symbol str is not in the symbol table.
And it contains no debugging information:
$ echo "[$(readelf --debug-dump right)]"
[]
as per -g0, not -g3.
Since GCC is open-source software, another way of resolving doubts
about its behaviour that is available to C programmers, at least,
is to inspect the relevant source code, available via git source-control at
https://github.com/gcc-mirror/gcc.
The relevant source code for your question is in file gcc/gcc/c-family/c-opts.c,
function,
/* Handle switch SCODE with argument ARG. VALUE is true, unless no-
form of an -f or -W option was given. Returns false if the switch was
invalid, true if valid. Use HANDLERS in recursive handle_option calls. */
bool
c_common_handle_option (size_t scode, const char *arg, int value,
int kind, location_t loc,
const struct cl_option_handlers *handlers);
It is essentially a simple switch ladder over option settings enumerated by scode - which
is OPT_std_c__11 for option -std=c++11 - and leaves no doubt that it
puts an -std option setting into effect regardless of what setting was in effect previously. You can look at branches other than master
(gcc-{5|6|7}-branch) with the same conclusion.
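The "last setting wins" behaviour falls out naturally from handling options one at a time in command-line order. The sketch below is not GCC's code: parse_std and the Standard enum are hypothetical, and it only illustrates why a later -std= simply overwrites an earlier one.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified model of a language-standard setting.
enum class Standard { Cxx98, Cxx11 };

// Walk the arguments in order; each recognised -std= option overwrites
// whatever was stored before, so the last occurrence takes effect.
Standard parse_std(const std::vector<std::string>& args,
                   Standard fallback = Standard::Cxx98) {
    Standard current = fallback;
    for (const std::string& arg : args) {
        if (arg == "-std=c++98") current = Standard::Cxx98;
        else if (arg == "-std=c++11") current = Standard::Cxx11;
        // Unrelated options are ignored in this sketch.
    }
    return current;
}
```

A handler written this way needs no special case for repeated options, which is also why appending an override to a default command line works in scripts.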
It's not uncommon to find GCC build system scripts that rely on the validity of
overriding an option setting by appending a new setting. Legalistically, this
is usually counting on undocumented behaviour, but there's a better
chance of Russia joining NATO than of GCC ceasing to take the last setting that
it parses for a mutex option.

line number information lost during linking in gcc

I'm using Red Hat 4.4.7-3 and gcc 4.8.3
I have code in two files (test.c and sum.c) and I compiled them separately with gcc (with debug information). In the last phase, when I build the final output by linking both files, the debug information is lost.
test.c:
#include <stdio.h>
int testsum(int a, int b); /* defined in sum.c */
int main()
{
int a=5,b=7;
int c=testsum(a,b);
printf("%d + %d=%d\n",a,b,c);
return 0;
}
sum.c:
int testsum(int a, int b)
{
return a+b;
}
I did the following:
gcc -c -g test.c -o test.o
gcc -c -g sum.c -o sum.o
gcc -g test.o sum.o -o output
When I run gdb sum.o, it shows the line number information:
(gdb) l testsum
1 int testsum(int a, int b)
2 {
3 return a+b;
4 }
but with gdb output I get no line number information:
(gdb) l testsum
No line number known for testsum.
(gdb)
I repeated the same thing on my personal laptop (gcc-4.8.real (Ubuntu/Linaro 4.8.1-10ubuntu9) 4.8.1) and there it works perfectly fine. But I need the debug information in the final output on the Red Hat machine for a project.
Any suggestions/comments regarding obtaining the line number information in final executable would be much appreciated.
You need to compile and link with gcc -g. Perhaps you forgot the -g flag at link time.
And you surely want to compile with gcc -Wall -g, since warnings are incredibly useful.
You should run gdb on the ELF executable file, not on object files (so gdb sum.o is wrong):
gdb ./output
You should have a Makefile (see this example) and build your program using GNU make.
Perhaps the gdb on the remote Red Hat server does not accept the same DWARF format as the one on your local laptop. Check the versions of gdb. (Perhaps consider compiling on the remote server, or passing an explicit debugging option like -gdwarf-3, or whatever is appropriate for the remote gdb, to gcc on your laptop.)

What is the optimization level of `-S` switch to GCC

In this question, I ran into a situation where gcc myfile.c -S produces assembly code that is better than gcc myfile.c -O0 but worse than gcc myfile.c -O1.
At -O0, both loops are generated. At -O1, both loops are optimized out. (Thanks to @Raymond Chen for the reminder; quoted from his comments.) (Using just -S optimizes only one loop out.)
I searched the Internet and only found this:
-S (cited from Overall options)
Stop after the stage of compilation proper; do not assemble. The output is in the form of an assembler code file for each non-assembler input file specified.
By default, the assembler file name for a source file is made by replacing the suffix ‘.c’, ‘.i’, etc., with ‘.s’.
Input files that don't require compilation are ignored.
So my question is:
What exactly is the optimization level of the -S option when it compiles a file? (-O0.5?)
Why doesn't it just use -O0 or -O1? (Or is it a bug?)
Edit: you can use this site to help you reproduce the problem. The code is in the question I mentioned. (If you use just the -S compiler option, or no compiler option, you get one loop elided.)
step 1:
Open this site and paste the following code into the Code Editor.
#include <stdio.h>
int main (int argc, char *argv[]) {
unsigned int j = 10;
for (; j > -1; --j) {
printf("%u", j);
}
}
step 2:
Choose g++ 4.8 as the compiler. Leave the compiler options empty (or use -S).
step 3:
You get the first situation. Now, change the j > -1 to j >= -1 and you can see the second.
With your last edit, it's now somewhat clear what you're actually doing, so:
For the first case, j > -1:
This can never be true. j is an unsigned int, and -1 converted to an unsigned value corresponds to a value with all bits set. That's the same as UINT_MAX, and j can never be greater than that. So gcc eliminates the loop, since its condition will always be false.
For the second case, j >= -1:
This can happen. j can surely become (unsigned int)-1, or UINT_MAX as mentioned above. The loop is not eliminated.
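The conversion behind both cases can be checked directly. In this sketch the helper names are made up, and the cast spells out explicitly what the usual arithmetic conversions do implicitly to -1 in j > -1 and j >= -1.

```cpp
#include <climits>

// -1 converted to unsigned int wraps around to UINT_MAX.
constexpr unsigned int minus_one_as_unsigned = static_cast<unsigned int>(-1);
static_assert(minus_one_as_unsigned == UINT_MAX, "-1 converts to UINT_MAX");

// Equivalent of `j > -1` for an unsigned j: never true.
constexpr bool greater_than_minus_one(unsigned int j) {
    return j > minus_one_as_unsigned;
}

// Equivalent of `j >= -1` for an unsigned j: true only when j == UINT_MAX.
constexpr bool at_least_minus_one(unsigned int j) {
    return j >= minus_one_as_unsigned;
}
```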
What exactly is the optimization level of the -S option when it compiles a file? (-O0.5?)
The optimization level is controlled with the -O flag. The -S option does not affect optimization. The default optimization level if no -O flag is given is -O0 (no optimization).
-S doesn't optimize. -O0, on the other hand, disables all and any optimizations, even the default ones.
So the effect that you see is that you're "enabling" the default optimizations if you use just -S.
Use -S with various -O options to see the effect on the assembler code.
EDIT I've been using GCC since about 2.6 (in 1994). I'm pretty sure I remember that in some versions, the compiler would do default optimizations that you could disable with -O0 to debug the compiler (i.e. gcc ... crashes, gcc -O0 ... doesn't crash -> congrats, you found a bug).
But that doesn't seem to be the case here. I get the same assembler output for -S, -O0 and not giving either. So it seems that the simple optimizations (like if(0){} to comment out a code block) are always applied, no matter which optimization level is selected.
Therefore, I'd say that the original statement above:
At -O0, both loops are generated. At -O1, both loops are optimized out. (Thanks to @Raymond Chen for the reminder; quoted from his comments.) (Using just -S optimizes only one loop out.)
is not correct to begin with (at least for GCC 4.8.2). The only other alternative is that the GCC version used by the OP (4.8) has a bug when it comes to enabling/disabling optimizer options.

gcc warning flag for bogus comparison

I am searching for the right warning flag to use with gcc to detect something like:
#include <stdlib.h>
#include <stdio.h>
int main()
{
const size_t n = (size_t)-1;
for( unsigned int i = 0; i < n; ++i ) /* use `unsigned char` if you want */
{
printf( "%u\n", i );
}
return 0;
}
I tried:
$ gcc -Wsign-conversion -Wconversion -pedantic -Wsign-compare -W -Wall -Wextra -std=c99 t.c
What happened is that I have been modifying existing code that uses unsigned int for memory block sizes. The code started failing with larger files, so I need to check that I did not miss any leftover uses.
EDIT:
Just discovered -Wtype-limits, but again this is not working for me.
You are asking the compiler to detect that the condition is always true at run-time. This is barely within its possibilities in this case, because the reason it is always true is that one side is constant and the other is limited by the unsigned int type. I am glad that you found a g++ flag that did it, but if the value of variable n was provided in a different file, or not typed as const, for instance, the compiler may be unable to detect that the condition remains true.
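One way to make this particular mistake visible without relying on a warning is to compare the ranges of the two types at compile time. The helper below is a hypothetical sketch, not an existing GCC facility: it reports whether an unsigned counter of type Counter can ever reach a given size_t bound.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>

// True when a loop `for (Counter i = 0; i < bound; ++i)` can terminate,
// i.e. when `bound` is representable in the unsigned type Counter.
// If it is not, the counter wraps to 0 before ever reaching `bound`.
template <typename Counter>
constexpr bool counter_can_reach(std::size_t bound) {
    static_assert(std::numeric_limits<Counter>::is_integer &&
                      !std::numeric_limits<Counter>::is_signed,
                  "this sketch only covers unsigned counters");
    return static_cast<std::uintmax_t>(bound) <=
           static_cast<std::uintmax_t>(std::numeric_limits<Counter>::max());
}
```

Used in a static_assert next to the loop, a check like this fails the build as soon as the bound's type outgrows the counter's type, regardless of which warnings are enabled.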
You may also consider using a static analyzer that spends more time than a compiler on the detection of what may and may not happen at run-time. One open-source C analyzer is Frama-C:
In the screenshot, the statements in red have been detected as unreachable.
The open-source version only works well if the program makes limited use of library functions, but even so, it can extract information that does not appear in g++'s warnings.
OK, found the trick: you need to use the C++ compiler instead:
$ g++ -Wextra t.c
t.c: In function ‘int main()’:
t.c:6: warning: comparison is always true due to limited range of data type
with:
$ g++ --version
g++ (Debian 4.4.5-8) 4.4.5
Copyright (C) 2010 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
