Is there a gcc/Xcode pragma to suppress warnings? - xcode

Is there a #pragma to have gcc/Xcode suppress specific warnings, similar to Java's @SuppressWarnings annotation?
I compile with -Wall as a rule, but there are some situations where I'd like to just ignore a specific warning (e.g. while writing some quick/dirty code just to help debug something).
I'm not looking for "fix the code" answers.

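Here's a viable solution. Use #pragma GCC system_header to make GCC treat the rest of the file as a system header, which suppresses all warnings except those generated by #warning. Note that the pragma has no effect in the primary source file; it has to appear in an included header.
Remember, you're just fooling your preprocessor, not the real compiler. Suppressing warnings can be harmful most of the time.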

In GCC 4.6 and later you can use pragmas to suppress specific warnings, and limit that suppression to a specific block of code, e.g.:
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
// Code that causes warning goes here
#pragma GCC diagnostic pop
The push/pop are used to preserve the diagnostic options that were in place before your code was processed.
This would be a much better approach than using "#pragma GCC system_header" to suppress all warnings. (Of course, in older gcc you may be "stuck" with the #pragma GCC system_header approach!)
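As a slightly fuller sketch of the push/ignored/pop pattern, the snippet below wraps a single call to a deprecated function. The names old_api and call_old_api_quietly are made up for illustration, and the deprecated attribute is only there to give the compiler something to warn about.

```c
/* Hypothetical legacy function, marked deprecated purely for illustration. */
__attribute__((deprecated))
static int old_api(int x) {
    return x * 2;
}

/* Calls old_api() with -Wdeprecated-declarations silenced for one statement. */
int call_old_api_quietly(int x) {
#pragma GCC diagnostic push                                   /* save current diagnostic state */
#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
    int result = old_api(x);                                  /* would warn without the pragma */
#pragma GCC diagnostic pop                                    /* restore the saved state */
    return result;
}
```

Compiled with -Wall, the call inside the push/pop region should produce no deprecation warning, while a call to old_api placed after the pop would still warn.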
Here's a nice reference on suppressing gcc warnings:
http://www.dbp-consulting.com/tutorials/SuppressingGCCWarnings.html
This page also describes how to use -fdiagnostics-show-option to find out what option controls a particular warning.
Of course, as others mention, it's generally far preferable to fix the root cause of all warnings than to suppress them! However, sometimes that is not possible.

Related

How to use GCC LTO with differently optimized object files?

I'm compiling an executable with arm-none-eabi-gcc for a Cortex-M4 based microcontroller. Non-performance-critical code is compiled with -Os (optimized for executable code size) and performance-critical parts with other optimization flags, e.g. -Og, -O2, etc.
Is it safe to use -flto in such a build? If so, which optimization flag should be passed to the linker?
According to the GCC documentation regarding optimise options:
It is recommended that you compile all the files participating in the same link with the same options
Such a statement is rather vague. Nevertheless, when digging into the release notes of GCC 5, there are some additional details:
Command-line optimization and target options are now streamed on a per-function basis and honored by the link-time optimizer. This change makes link-time optimization a more transparent replacement of per-file optimizations. It is now possible to build projects that require different optimization settings for different translation units (such as -ffast-math, -mavx, or -finline).
And also information about which flags are affected by such limitations and which aren't:
Note that this applies only to those command-line options that can be passed to optimize and target attributes. Command-line options affecting global code generation (such as -fpic), warnings (such as -Wodr), optimizations affecting the way static variables are optimized (such as -fcommon), debug output (such as -g), and --param parameters can be applied only to the whole link-time optimization unit. In these cases, it is recommended to consistently use the same options at both compile time and link time.
In your scenario, the optimisation flags -Og, -O2 and -Os can be passed as optimise attributes and do not fall into the cases where the compile time and link time flags ought to be the same. So yes, it should be safe to use -flto in such a build.
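The release notes quoted above tie per-function streaming to options that can be expressed through the optimize and target attributes. As a rough illustration of what such an attribute looks like in source, the sketch below tags two functions with different levels. The macro and function names (OPT_SIZE, OPT_SPEED, sum_small, sum_fast) are invented for this example, and the macros expand to nothing on compilers other than GCC.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical convenience macros; only GCC understands the optimize attribute. */
#if defined(__GNUC__) && !defined(__clang__)
#define OPT_SIZE  __attribute__((optimize("Os")))
#define OPT_SPEED __attribute__((optimize("O2")))
#else
#define OPT_SIZE
#define OPT_SPEED
#endif

/* Non-critical path: ask GCC to optimize this function for size. */
OPT_SIZE
uint32_t sum_small(const uint8_t *data, size_t len) {
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum += data[i];
    }
    return sum;
}

/* Hot path: ask GCC to optimize this function as if built with -O2. */
OPT_SPEED
uint32_t sum_fast(const uint8_t *data, size_t len) {
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum += data[i];
    }
    return sum;
}
```

This only shows the mechanism. GCC's manual cautions that the optimize attribute is meant for debugging rather than production code, so for a build like the one in the question, the per-file -Os/-O2 flags remain the more natural way to get the same effect.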
Regarding the optimisation flags passed at link time, as stated in the release notes:
Contrary to earlier GCC releases, the optimization and target options
passed on the link command line are ignored.
GCC automatically determines which optimisation level to use, which is the highest level used when compiling the object files. You therefore don't need to pass any of your -O optimisation options to the linker.

How does gcc's linktime optimisation (-flto flag) work

I understand more or less the idea: When compiling separate modules and producing assembly code, functions calling each other have to respect strictly the calling convention, which kills the opportunity for many optimisations when compiling separate modules.
For instance if I have function A which calls function B which calls function C, all 3 in their own separate source files, it becomes possible to allocate registers evenly within the functions so that no register saving on the stack is necessary at all during those calls. With traditional compile-assembly-linking this is not possible, as the caller-saved and callee-saved registers are imposed by the calling convention.
Another optimisation is to inline functions which are called only once. This previously was possible only if a function is local, but thanks to linktime optimisation it's now possible even if the function is in another source file.
Now, if I compile with both -flto and -S flags, I see that instead of normal assembly instructions, gcc generates an encoded representation of the program, such as this:
.section .gnu.lto_.inline.c3c5e6ef8ec983c,"dr0"
.ascii "x\234mQ;N\303#\20}\273\353\17\370C\234\20\242`\"!Q\20\11Ah\322&\25\242\314\231|\4\32\220\220(,$.#\205D\343\3P Z.\341Tn\231\35\274\31L\342\342\355\314\274\371<\317\30\354\376\356\365\357\333\7\262"
.ascii "1\240G\325\273\202\7\216\232\204\36\205"
.ascii "8\242\370\240|\222"
.ascii "8\374\21\205ty\352\"*r\340!:!n\357n%]\224\345\10|\304\23\342\274z\346"
.ascii "8\35\23\370\7\4\1\366s\362\203j\271]\27bb{\316\353\27\343\310\4\371\374\237*n#\220\342rA\31"
.ascii "7\365\263\327\231\26\364\10"
.ascii "2\\-\311\277\255^w\220}|\340\233\306\352\263\362Qo+e+\314\354\277\246\354\252\277\20\364\224%T\233'eR\301{\32\340\372\313\362\263\242\331\314\340\24\6\21s\210\243!\371\347\325\333&m\210\305\203\355\277*\326\236\34\300-\213\327\306\2Td\317\27\231\26tl,\301\26\21cd\27\335#\262L\223"
.ascii "8\353\30\351\264{I\26\316\11\14"
.ascii "9\326h\254\220B}6a\247\13\353\27M\274\231"
.ascii "0\23M\332\272\272%d[\274\36Q\200\37\321\1&\35"
Since the data is in its own particular section, the linker sees this, and does the code generation. If the module was written in either assembly or with no -flto flag, then the linker would see data in the .text section instead, so there is no confusion possible for the linker.
The problem is: How can the linker generate code? Normally only gcc can generate code, the linker's role is just here to change a few offsets and adapt the binary format. In order to generate code, the linker would need to contain a second copy of the entire gcc backend (half of the compiler which generates assembly code from intermediate representation), as well as the entire assembler (since no assembly code was produced). How is such a thing possible, especially considering that binutils is a completely separate entity from gcc, developed by different teams?
GCC's -flto emits a serialized form of GCC's internal representation, as you discovered.
Then, at link time, the linker reinvokes GCC and passes it the objects that need final compilation. GCC reads the internal representation and does the work.
I think the actual work is done in collect2, which is part of GCC that is used when invoking the linker (I'm a little fuzzy on the details). There is also a "linker plugin" system that enables this to work a little better (like letting the linker decide how to split the compilation). This is implemented at least by the binutils ld and by gold; but as far as I recall this is just an optimization and isn't needed to get the basic -flto feature to work. You can see a bit more information on the original LTO project page; and maybe links from there would explain more.
There is more overlap between the GCC and binutils teams than you might think. The two projects share some code and have a long history of working together. Some people work on both projects.
From https://gcc.gnu.org/wiki/LinkTimeOptimization:
Despite the "link time" name, LTO does not need to use any special
linker features. The basic mechanism needed is the detection of GIMPLE
sections inside object files. This is currently implemented in
collect2 [which is called by gcc; -ps]. Therefore, LTO will work on any linker already supported by
GCC.
I assume this means you must link calling the compiler driver gcc. Simply linking with the system's vanilla linker wouldn't optimize the whole program, as you already concluded.
Update:
https://gcc.gnu.org/onlinedocs/gccint/Collect2.html says
The program collect2 is installed as ld in the directory where the
passes of the compiler are installed. When collect2 needs to find the
real ld it tries the following file names: [...]
(The page goes on detailing how collect2 looks for configuration-dependent executables and ones with well-known names like real-ld, finally even ld; but will not call itself recursively.)

gcc; Aarch64; Armv8; enable crypto; -mcpu=cortex-a53+crypto

I am trying to optimize code for an Arm processor (Cortex-A53) with an Armv8 architecture for crypto purposes.
The problem is that although the compiler accepts -mcpu=cortex-a53+crypto etc., it doesn't change the output (I checked the assembly output).
Changing -mfpu or -mcpu to add features like crypto or simd makes no difference; it is completely ignored.
To enable Neon code, -ftree-vectorize is needed. How do I make use of crypto?
(I checked the -O(1,2,3) flags, it won't help).
Edit: I realized I made a mistake by thinking the crypto flag works like an optimization flag solved by the compiler. My bad.
You had two questions...
Why does -mcpu=cortex-a53+crypto not change code output?
The crypto extensions are an optional feature under the AArch64 state of ARMv8-A. The +crypto feature flag indicates to the compiler that these instructions are available for use. From a practical perspective, in GCC 4.8/4.9/5.1, this defines the macro __ARM_FEATURE_CRYPTO, and controls whether or not you can use the crypto intrinsics defined in ACLE, for example:
uint8x16_t vaeseq_u8 (uint8x16_t data, uint8x16_t key)
There is no optimisation in current GCC which will automatically convert a sequence of C code to use the cryptography instructions. If you want to make this transformation, you have to do it by hand (and guard it by the appropriate feature macro).
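A minimal sketch of what "by hand, guarded by the feature macro" could look like is shown below. The helper name aes_single_step and its return convention are assumptions made for this example; the fallback branch is only a placeholder, where a real program would call a portable software AES instead.

```c
#include <stdint.h>
#include <string.h>

#if defined(__ARM_FEATURE_CRYPTO)
#include <arm_neon.h>   /* ACLE intrinsics, including vaeseq_u8 */
#endif

/*
 * Hypothetical helper: performs one AESE step on a 16-byte block.
 * Returns 1 if the crypto intrinsic was compiled in, 0 otherwise.
 */
int aes_single_step(const uint8_t data[16], const uint8_t key[16],
                    uint8_t out[16]) {
#if defined(__ARM_FEATURE_CRYPTO)
    uint8x16_t d = vld1q_u8(data);
    uint8x16_t k = vld1q_u8(key);
    /* AESE: XOR with the round key, then SubBytes and ShiftRows. */
    vst1q_u8(out, vaeseq_u8(d, k));
    return 1;
#else
    /* Placeholder only: no crypto support in this build, so the input
       is copied through unchanged and the caller is told so. */
    (void)key;
    memcpy(out, data, 16);
    return 0;
#endif
}
```

Built with -mcpu=cortex-a53+crypto the first branch is selected; without +crypto (or on another architecture) the file still compiles and takes the fallback.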
Why do the +fp and +simd flags not change code output?
For -mcpu=cortex-a53 the +fp and +simd flags are implied by default (for some configurations of GCC +crypto may also be implied by default). Adding these feature flags will therefore not change code generation.

iar ewarm linking to gcc eabi build library

I have been able to build code in IAR EWARM (7.40) (for the ST STM32F407IG ARM Cortex-m4) which links to a library built under Ubuntu via gcc (4.9.3). This mostly works but some build environment adjustments on either or both the IAR or gcc side still remain. I would appreciate whatever help you can point me to.
There are no build errors evident, but EWARM and arm-none-eabi-gcc disagree on the locations of parameters being passed to the gcc-built library. The EWARM debugger and the code generated by EWARM agree with each other, but from investigations so far it appears that the locations expected by the gcc-generated code are offset from those expected by EWARM by eight bytes. I've only investigated a single call, so this may not be constant...
IAR's compiler flags include: --aeabi and --guard_calls as per section: "AEABI compliance" in the EWARM help section.
arm-none-eabi-gcc compiler flags include: -gdwarf-3 -mabi=aapcs -march=armv7e-m -mthumb.
I believe this tells both EWARM and gcc to play nice together with ARM AAPCS standard procedure calls and dwarf v3 formats.
EWARM does seem to be happy with either -gdwarf-2 or -gdwarf-3 (but not -4). This selection does not appear to affect the issue discussed above.
What else is required?
The answer to "What else is required?" appears to be nothing. Just be darn sure that all of the macros evaluated by #ifdef statements match in both environments, so you don't end up with differently sized data structures in the two builds! #ifdef code in header files should be carefully evaluated...
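To make the failure mode concrete, here is a small sketch of a shared header whose struct layout depends on a macro, together with a compile-time check that would catch a mismatch. Everything named here (WITH_TRACE, struct request, the REQUEST_EXPECTED_* constants) is hypothetical, and _Static_assert assumes a C11-capable compiler on both sides.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature macro. Imagine one toolchain's project settings
   define it and the other's do not. */
/* #define WITH_TRACE */

struct request {
    uint32_t id;
#ifdef WITH_TRACE
    uint32_t trace[2];   /* 8 extra bytes that shift every later member */
#endif
    uint32_t length;
};

/* Layout the library is assumed to have been built with (WITH_TRACE undefined). */
#define REQUEST_EXPECTED_SIZE           8u
#define REQUEST_EXPECTED_LENGTH_OFFSET  4u

/* Fail the build, rather than misbehave at run time, if the layouts diverge. */
_Static_assert(sizeof(struct request) == REQUEST_EXPECTED_SIZE,
               "struct request size differs from the library build");
_Static_assert(offsetof(struct request, length) == REQUEST_EXPECTED_LENGTH_OFFSET,
               "struct request layout differs from the library build");

/* Small accessor so the offset can also be inspected in a debugger or test. */
size_t request_length_offset(void) {
    return offsetof(struct request, length);
}
```

Uncommenting the WITH_TRACE definition makes both assertions fire at compile time, which is the kind of early signal that helps when two toolchains consume the same header.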

gcc 4.4.4 optimization error only with O1 OR O2+no-strict-aliasing

Post-Answer-Acceptance Summary: The problem was the use of a pointer to a stack variable that had gone out of scope. It had nothing to do with optimization. It is a pity that valgrind can't find stack errors...
I have a segfault that appears only when enabling -O1 level optimization in gcc 4.4.4 (CentOS 5.5). All other optimization levels (0,2,3,s) are fine. I haven't managed to create a reduced test case for it yet, but it appears to be related to an array offset calculation causing the stack to be overwritten.
If I enable -O1 and disable all optimizations that have a flag (http://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html) the bug still occurs.
If I use -O2 (or any other level) there is no problem. If I use O2 and disable strict-aliasing with -fno-strict-aliasing then the segfault returns.
Edit: If I add -fstack-protector-all to the build flags (either O1 or O2 -fno-strict-aliasing) the segfault disappears.
So it appears to be caused by an optimization that happens by default in O1 that is disabled by strict-aliasing.
I suspect that this is a compiler bug (but without a reduced testcase I can't prove it). This is a production server that needs a quick turnaround. The normal optimization level is O1 and I'm loath to just change it to O2, as it seems that the fix might be more dangerous than the original problem.
I would really appreciate some suggestions. Currently I'm thinking to try compiling gcc 4.4.6 and seeing if that fixes it. However not knowing for sure what is causing the problem is a little worrying.
Edit: the server is compiled with -Wall -Werror (and a few others). It runs without error in valgrind (valgrind checks heap accesses and this appears to be a stack related error).
Often, compiler optimizations can expose invalid or undefined behavior in source code, that you are lucky to get to work otherwise. A few things I would try:
compiling with -Wall -Wextra
running under valgrind to see if you can get more of a hint of where the error is
finding that minimum testcase!
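The root cause named in the post-answer summary, a pointer to a stack variable that had gone out of scope, tends to look something like the commented-out function below. This is a generic illustration, not the original server code; format_label is a made-up name, and the fix shown (caller-owned storage) is just one of several possible ones.

```c
#include <stdio.h>

/*
 * BUG, kept in a comment so the example itself stays well defined:
 *
 *   const char *format_label(int id) {
 *       char buf[32];
 *       snprintf(buf, sizeof buf, "item-%d", id);
 *       return buf;    // buf's lifetime ends here: dangling pointer
 *   }
 *
 * Whether the caller then sees the expected text or a crash depends on
 * how the compiler laid out and reused the stack, which is why such a
 * bug can appear at one optimization level and vanish at another.
 */

/* One possible fix: the caller provides storage that outlives the call. */
int format_label(char *out, size_t out_size, int id) {
    return snprintf(out, out_size, "item-%d", id);
}
```

A caller would declare its own buffer, for example char label[32], and pass it along with sizeof label, so the pointer stays valid for as long as the caller needs it.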

Resources