I am trying to understand how to turn off specific optimisation flags when compiling with GCC. I understand that some flags have a -fno variant, but most flags don't (from what I have seen). I am trying to compile a program with the -O1 flags, but remove one of the flags enabled by -O1 on each compile.
For instance, -fauto-inc-dec does not have an equivalent -fno-auto-inc-dec flag that I could pass in the arguments like: -O1 -fno-auto-inc-dec.
I want to compile with the -O1 options but turn off specific options enabled by -O1, to see the difference each one makes.
Any help would be appreciated; unfortunately I'm new to this, so I'm very much a beginner.
As stated in man gcc:
Most optimizations are only enabled if an -O level is set on
the command line. Otherwise they are disabled,
even if individual optimization flags are specified.
So basically, if you don't pass any -O flag, you aren't enabling these configurable optimizations at all.
Also, -O1 is not the default; -O0 is.
You could also go at it from the opposite direction: disable all optimizations and enable "batches" by hand, i.e. have a look at gcc -Q --help=optimizers, see which optimizations are enabled at which level, and strip the ones you don't want.
To address your concern that -O* options enable flags that aren't listed: I'd say that's a man-page limitation. Actively querying the compiler on a particular architecture should give you an exhaustive list of the optimizations that will be enabled by a particular -O flag, so using -O0 in combination with that list of flags should produce exactly the same result.
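For example, to see exactly which optimizations a given level enables on your machine (the output file names here are my own):
gcc -O0 -Q --help=optimizers > O0-opts.txt
gcc -O1 -Q --help=optimizers > O1-opts.txt
diff O0-opts.txt O1-opts.txt    # shows exactly what -O1 adds over the -O0 baseline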
Why not go the other way round? Turn off all optimizations with -O0 and enable them selectively.
Or, if you prefer disabling them one by one, start with:
CFLAGS=-O0 \
-fauto-inc-dec \
-fcompare-elim -fcprop-registers \
-fdce -fdefer-pop -fdelayed-branch -fdse \
-fguess-branch-probability \
-fif-conversion2 -fif-conversion \
-fipa-pure-const -fipa-profile -fipa-reference \
-fmerge-constants \
-fsplit-wide-types \
-ftree-bit-ccp -ftree-builtin-call-dce -ftree-ccp -ftree-ch \
-ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse \
-ftree-forwprop -ftree-fre -ftree-phiprop -ftree-slsr -ftree-sra \
-ftree-pta -ftree-ter \
-funit-at-a-time
(btw, all of this information is distilled from man gcc)
Related
I installed OCaml via OPAM, and by default it uses gcc as the command to compile .c files. For instance, if I run ocamlopt -verbose file.c, I obtain:
+ gcc -Wall -D_FILE_OFFSET_BITS=64 -D_REENTRANT -g
-fno-omit-frame-pointer -c -I'/home/user/.opam/4.02.1+fp/lib/ocaml' 'test.c'
I'd like to change the GCC binary that is used by OCaml, for instance to replace it with gcc-5.1 or /opt/my-gcc/bin/gcc.
Is it possible to do so without reconfiguring and recompiling OCaml? I suppose I could add a gcc alias to a directory in the PATH, but I'd prefer a cleaner solution if there is one.
To check if gcc was not chosen based on a textual configuration file (that I could easily change), I searched for occurrences of gcc in my /home/user/.opam/4.02.1+fp directory, but the only occurrence in a non-binary file that I found was in lib/ocaml/Makefile.config, and changing it does nothing for the already-compiled binary.
ocamlopt uses gcc for three things. First, for compiling .c files that appear on the command line of ocamlopt. Second, for assembling the .s files that it generates internally when compiling an OCaml source file. Third, for linking the object files together at the end.
For the first and third, you can supply a different compiler with the -cc flag.
For the second, you need to rebuild the OCaml compiler.
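For example (gcc-5.1 and /opt/my-gcc/bin/gcc are taken from your question; stubs.c and main.ml are hypothetical file names, so treat this as a sketch):
ocamlopt -verbose -cc gcc-5.1 -o prog stubs.c main.ml
ocamlopt -verbose -cc /opt/my-gcc/bin/gcc -o prog stubs.c main.ml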
Update
Here's what I see on OS X when compiling a C file and an OCaml module with the -verbose flag:
$ ocamlopt -verbose -cc gcc -o m m.ml c.c 2>&1 | grep -v warning
+ clang -arch x86_64 -c -o 'm.o' \
'/var/folders/w4/1tgxn_s936b148fdgb8l9xv80000gn/T/camlasm461f1b.s'
+ gcc -c -I'/usr/local/lib/ocaml' 'c.c'
+ clang -arch x86_64 -c -o \
'/var/folders/w4/1tgxn_s936b148fdgb8l9xv80000gn/T/camlstartup695941.o' \
'/var/folders/w4/1tgxn_s936b148fdgb8l9xv80000gn/T/camlstartupb6b001.s'
+ gcc -o 'm' '-L/usr/local/lib/ocaml' \
'/var/folders/w4/1tgxn_s936b148fdgb8l9xv80000gn/T/camlstartup695941.o' \
'/usr/local/lib/ocaml/std_exit.o' 'm.o' \
'/usr/local/lib/ocaml/stdlib.a' 'c.o' \
'/usr/local/lib/ocaml/libasmrun.a'
So, the compiler given by the -cc option is used to do the compilation of the .c file and the final linking. To change the handling of the .s files you need to rebuild the compiler. I'm going to update my answer above.
I implemented a simple quicksort algorithm in C, in a file named test.c.
I'm trying to maximize the optimization, so I use the -O3 option as below:
gcc -S -O3 -o test.s test.c
gcc -S -O3 -o test1.s test.s
gcc -S -O3 -o test2.s test1.s
gcc -S -O3 -o test3.s test2.s
.
.
.
But a strange thing happens: the more times I repeat the above procedure, the more lines of assembly I get.
I don't understand why this happens, because I expected that repeating the procedure would give me a more highly optimized assembly file with fewer lines.
If this is not the right way, is using -O3 only once the best way to optimize?
Thanks
Most of GCC's optimizations operate on an intermediate representation of the C source code. I'm not aware of any optimization that operates specifically at the assembler-instruction level other than peephole optimization, and that is already included in -O3.
So yes, -O3 is supposed to be used only once, when turning C source into object files.
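In other words, a single optimizing invocation per source file is enough; for example (same file name as in the question):
gcc -O3 -S -o test.s test.c    # optimized assembly, for inspection
gcc -O3 -c -o test.o test.c    # optimized object file, for linking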
I am working on an ARM Cortex-A15 and using GCC to compile (actually integrating it with TI's SYS/BIOS using XDC tools...).
After I enable the -flto flag, I see a performance loss of about 30%, which is significant. I am running simple benchmarks such as pi and prime-number calculations, as well as system-dependent procedural tests.
Below are my compile and link flags. Is this amount of degradation possible without anything actually being wrong? What could cause it? From what I found searching the internet, there are benchmarks where -flto does not improve performance, but I haven't seen such a performance loss reported...
# Compile options.
C_OPTS = -w\
-mcpu=cortex-a15 \
-mtune=cortex-a15 \
-mabi=aapcs \
-mapcs \
-mfpu=neon \
-mfloat-abi=hard \
-O3 \
-flto \
-fno-strict-aliasing \
-fno-delete-null-pointer-checks \
-fno-strict-overflow

# Linker options.
L_OPTS = -nostartfiles \
-static \
-Wl,--gc-sections \
-Wl,-Map,$(BUILD_DIR)/$(NAME).map \
-mfloat-abi=hard \
-e wbcd_ep \
-flto \
-fuse-linker-plugin
I have a small program that performs much better when compiled with -O1 than with no optimisation. I am interested in knowing which optimisation(s) performed by the compiler lead to this speedup.
What I thought I would do is take the list of optimisation flags that -O1 is equivalent to (obtained both from the man page and from gcc -Q -v) and then pick away at the list to see how the performance changes.
What I have found is that even including the whole list of optimisations still does not give me a program that performs as well as an -O1 optimised one.
In other words
gcc -O0 -fcprop-registers -fdefer-pop -fforward-propagate -fguess-branch-probability \
-fif-conversion -fif-conversion2 -finline -fipa-pure-const -fipa-reference \
-fmerge-constants -fsplit-wide-types -ftoplevel-reorder -ftree-ccp -ftree-ch \
-ftree-copy-prop -ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse \
-ftree-fre -ftree-sink -ftree-sra -ftree-ter myprogram.c
is not the same as
gcc -O1 myprogram.c
I am using gcc version 4.5.3
Is there something else that -O1 does that isn't included in the list of optimisation flags associated with -O1 in the manual?
How about using the -S option to check the produced assembly?
From two experiments, also with my_program.c, it seems that the -O0 option disables all optimizations, regardless of the long list of individual flags passed.
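A quick way to see this for yourself (the output file names are my own, and the flag subset is just an illustration):
gcc -S -O0 -o prog_O0.s myprogram.c
gcc -S -O0 -fdefer-pop -ftree-ccp -ftree-dce -o prog_O0_flags.s myprogram.c
gcc -S -O1 -o prog_O1.s myprogram.c
diff prog_O0.s prog_O0_flags.s    # little or no difference
diff prog_O0.s prog_O1.s          # substantial differences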
This is expected, not a bug:
https://gcc.gnu.org/wiki/FAQ#optimization-options
Is there something else that -O1 does that isn't included in the list of optimisation flags associated with -O1 in the manual?
Yes, it turns on optimization. Specifying individual -fxxx flags doesn't do that.
If you don't use one of the -O1, -O2, -O3, -Ofast, or -Og optimization options (-O0 doesn't count), then no optimization happens at all, so adjusting which optimization passes are active doesn't do anything.
To find which optimization pass makes the difference you can turn on -O1 and then disable individual optimization passes until you find the one that makes a difference.
i.e. instead of:
gcc -fxxx -fyyy -fzzz ...
Use:
gcc -O1 -fno-xxx -fno-yyy -fno-zzz ...
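For example, to test whether the speedup comes from branch-probability guessing or from if-conversion (the choice of flags here is only an illustration):
gcc -O1 -fno-guess-branch-probability myprogram.c
gcc -O1 -fno-if-conversion myprogram.c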
I use a custom build tool to compile Go projects and I need a way to use cgo in my project.
The problem is that the cgo documentation only tells you how to use it with make.
What I really need to know is which generated files to process with which tools, and in what order it needs to be done. I tried to read make.pkg in the Go source directory, but my best efforts have failed.
My test DLL is very simple: a single function that returns 1 every time it is called. The Go code that uses this function is similarly simple.
The output from the console produced by a successful run of make on a cgo project would be very helpful.
Output of running make on 32-bit Linux in directory misc/cgo/life:
# gomake _obj/life.a
CGOPKGPATH= cgo -- life.go
touch _obj/_cgo_run
8g -o _go_.8 _obj/life.cgo1.go _obj/_cgo_gotypes.go
8c -FVw -I ${GOROOT}/pkg/linux_386 -I . -o "_cgo_defun.8" _obj/_cgo_defun.c
gcc -m32 -I . -g -fPIC -O2 -o _cgo_main.o -c _obj/_cgo_main.c
gcc -m32 -g -fPIC -O2 -o c-life.o -c c-life.c
gcc -m32 -I . -g -fPIC -O2 -o life.cgo2.o -c _obj/life.cgo2.c
gcc -m32 -I . -g -fPIC -O2 -o _cgo_export.o -c _obj/_cgo_export.c
gcc -m32 -g -fPIC -O2 -o _cgo1_.o _cgo_main.o c-life.o life.cgo2.o _cgo_export.o
cgo -dynimport _cgo1_.o >_obj/_cgo_import.c_ && mv -f _obj/_cgo_import.c_ _obj/_cgo_import.c
8c -FVw -I . -o "_cgo_import.8" _obj/_cgo_import.c
rm -f _obj/life.a
gopack grc _obj/life.a _go_.8 _cgo_defun.8 _cgo_import.8 c-life.o life.cgo2.o _cgo_export.o
The line cgo -- life.go creates the following files:
_obj/_cgo_.o
_obj/life.cgo1.go
_obj/life.cgo2.c
_obj/_cgo_gotypes.go
_obj/_cgo_defun.c
_obj/_cgo_main.c
_obj/_cgo_flags
_obj/_cgo_export.c
_cgo_export.h
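Condensing the make output above into the order of operations (flags trimmed; see the full output for the exact options used):
cgo -- life.go                                              # 1. generate the _obj/ files and _cgo_export.h listed above
8g -o _go_.8 _obj/life.cgo1.go _obj/_cgo_gotypes.go         # 2. compile the generated Go code
8c -I $GOROOT/pkg/linux_386 -I . -o _cgo_defun.8 _obj/_cgo_defun.c    # 3. compile the runtime C stubs
gcc -c _obj/_cgo_main.c c-life.c _obj/life.cgo2.c _obj/_cgo_export.c  # 4. compile the C halves
gcc -o _cgo1_.o _cgo_main.o c-life.o life.cgo2.o _cgo_export.o        # 5. link a probe binary
cgo -dynimport _cgo1_.o > _obj/_cgo_import.c                # 6. extract dynamic-import information
8c -I . -o _cgo_import.8 _obj/_cgo_import.c                 # 7. compile the import table
gopack grc _obj/life.a _go_.8 _cgo_defun.8 _cgo_import.8 c-life.o life.cgo2.o _cgo_export.o   # 8. pack the package archive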
"I use a custom build tool to compile go projects and I need a way to use cgo in my project."
... and this approach leads to problems. Using the standard way with a Makefile is simple, easy, proven, documented, etc.
I realize I'm not (directly) answering your question. Instead my "answer" is: I strongly suggest to use the standard way. Don't create problems for your self by choosing other, not directly supported options.
That said, I think there is a way to avoid the Makefiles, I just never been there, sorry. I'm usually lazy/short of time, so I use the simplest/fastest way to get things done. You might want to try the same ;-)