I just discovered the nice "-Minfo=" flag in pgcc, which outputs all the optimizations that the compiler is making.
For example:
pgcc -c -pg -O3 -Minfo=all -Minline -c -o example.o example.c
run:
55, Memory zero idiom, loop replaced by call to __c_mzero8
91, Memory zero idiom, loop replaced by call to __c_mzero8
pgcc -c -pg -O3 -Minfo=all -Minline -c -o controller.o controller.c
main:
82, second inlined, size=4, file controller.c (113)
84, second inlined, size=4, file controller.c (113)
Is there an equivalent compiler flag for GCC?
Yes there is. -fopt-info is what you are looking for.
gcc -O3 -fopt-info example.c -o example
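With no further options, this reports only the optimizations that were applied and prints them to stderr. To dump everything (applied, missed, and notes) to a file instead, you can do
gcc -O3 -fopt-info-all=all.dat example.c -o example
This writes all the optimization information to the file all.dat. You can also be specific about which optimization information you want by appending options to -fopt-info, like so: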
-fopt-info-loop # info about all loop optimizations
-fopt-info-vec # info about auto-vectorization
-fopt-info-inline # info about function inlining
-fopt-info-ipa # info about all interprocedural optimizations
You can get more specific if you want by telling gcc to dump information only about loops/inlinings/vectorizations that were optimized or were missed
-fopt-info-inline-optimized # info only about functions that were inlined
-fopt-info-vec-missed # info only about vectorizations that were missed
-fopt-info-loop-note # verbose info about loop optimization
For more details look at the online documentation.
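To try these flags on something small, here is a minimal sketch of a C file with a zeroing loop, an element-wise loop, and a tiny helper that is a natural inlining candidate. The file contents and the names clear, square and square_all are made up for illustration; which messages GCC prints, and their wording, depend on the GCC version and target.

```c
#include <stddef.h>

/* Tiny helper; at -O3 GCC will usually inline it into its caller. */
static double square(double x)
{
    return x * x;
}

/* Zeroing loop, similar in spirit to the "memory zero idiom"
   reported in the pgcc output from the question. */
void clear(double *buf, size_t n)
{
    for (size_t i = 0; i < n; i++)
        buf[i] = 0.0;
}

/* Element-wise loop with no cross-iteration dependency,
   a typical candidate for auto-vectorization. */
void square_all(double *restrict dst, const double *restrict src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = square(src[i]);
}
```

Since this file has no main, compile it with -c, for example gcc -O3 -fopt-info -c example.c, and compare what is printed against the loops above.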
I wrote a simple quicksort in C, in a file named test.c.
I'm trying to maximize optimization, so I use the -O3 option as shown below.
gcc -S -O3 -o test.s test.c
gcc -S -O3 -o test1.s test.s
gcc -S -O3 -o test2.s test1.s
gcc -S -O3 -o test3.s test2.s
.
.
.
But something strange happens: the more times I repeat the procedure above, the more lines the assembly file has.
I don't understand why, because I expected each pass to produce a more optimized assembly file with fewer lines.
If this is not the right approach, is using -O3 just once the way to get the best optimization?
Thanks
Most of the gcc optimizations operate on the representation of C source code in an intermediate language. I'm not aware of any optimization specifically operating at the assembler instruction level other than peephole. But that would also be included in -O3.
So yes, -O3 is supposed to be used only once, when turning C source into object files.
In a particular project, I saw the following compiler options used all at once:
gcc foo.c -o foo.o -Icomponent1/subcomponent1 -Icomponent2/subcomponent1 -Wall -fPIC -s
Are the -fPIC and -s used together contradictory here? If not, why?
-s and -fPIC are two flags used for different purposes. They are not contradictory.
From the gcc manual
-s
Remove all symbol table and relocation information from the executable.
-fPIC
If supported for the target machine, emit position-independent code, suitable for dynamic linking and avoiding any limit on the size of the global offset table. This option makes a difference on the m68k, PowerPC and SPARC.
I use the following LLVM tools to convert a cpp project which is written in multiple files into "ONE" single assembly file.
clang *.cpp -S -emit-llvm
llvm-link *.s -S -o all.s
llc all.s -march=mips
Is there any way of doing this in GCC? In particular, is there any way of linking GCC generated assembly files into one assembly file? i.e., what is the equivalent of LLVM-LINK?
Perhaps LTO (Link Time Optimization) is what you want.
If so, compile each compilation unit with -flto, e.g.
gcc -flto -O -Wall -c src1.c
g++ -flto -O -Wall -c src2.cc
and also use -flto (with the same optimization flags) when linking them:
g++ -flto -O src1.o src2.o -lsomething
LTO works in GCC by putting, in each generated assembly file and object file, a serialized form of GCC's internal representations (such as Gimple). See the GCC documentation on LTO for details.
You might want to use MELT to customize GCC (or simply use its probe to understand the Gimple, or try just gcc -fdump-tree-all).
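As a rough illustration of what -flto makes possible, here is a sketch of two tiny translation units shown in one listing. The file names a.c and b.c and the functions scale_and_add and compute are hypothetical, and whether GCC actually inlines across the two files depends on the version and optimization level.

```c
/* ---- contents of a hypothetical a.c ---- */
int scale_and_add(int a, int b, int scale)
{
    return a + b * scale;
}

/* ---- contents of a hypothetical b.c ---- */
int scale_and_add(int a, int b, int scale);   /* normally comes from a shared header */

int compute(int x)
{
    /* Compiled separately without LTO, the compiler sees only the
       declaration above and has to emit a real call. With -flto on
       both the compile and link steps, the body from a.c is visible
       at link time, so the call may be inlined. */
    return scale_and_add(x, x, 2);
}
```

Building each file with -flto -c and then linking with -flto, as in the commands above, lets the optimizer treat both files as one unit.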
From the GCC manual, there is the following overall option:
-wrapper
Invoke all subcommands under a wrapper program.
The name of the wrapper program and its parameters
are passed as a comma separated list.
gcc -c t.c -wrapper gdb,--args
This will invoke all subprograms of gcc under 'gdb --args', thus
the invocation of cc1 will be 'gdb --args cc1 ...'.
I'm having trouble understanding the example and the purpose of the flag.
gcc -c t.c will create a t.o.
And then what? Is the object file sent to gdb?
Or is gdb given the responsibility of creating the object file (presumably adding debugging information)?
Yes, for debugging the compiler itself. Or otherwise "trace" what is going on in the compiler - you could for example print the arguments passed to cc1 itself by adding a program that does that and then runs cc1.
gdb is not in charge of generating anything; it just wraps cc1, which is the "compiler proper". When you run gcc -c t.c, the compiler first runs cpp -o t.i t.c to preprocess the t.c file. Then it runs cc1 -o t.s t.i and finally as -o t.o t.s (or something along those lines). With the wrapper, it would run those commands as, for example, gdb --args cc1 -o t.s t.i.
Edit: This is of course much simplified compared to a "real" compile - there's a whole bunch of arguments passed to cc1, etc.
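The "print the arguments, then run cc1" idea mentioned above can be sketched in C roughly as follows. The names format_command and trace_and_exec are made up for this example, and it assumes (as the manual excerpt describes) that gcc appends the subcommand and its arguments after the wrapper's own arguments.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Join a NULL-terminated argv-style array into buf as space-separated
   words. Returns 0 on success, -1 if buf is too small. */
int format_command(char *const cmd[], char *buf, size_t size)
{
    size_t used = 0;

    if (size == 0)
        return -1;
    buf[0] = '\0';
    for (size_t i = 0; cmd[i] != NULL; i++) {
        int n = snprintf(buf + used, size - used, "%s%s",
                         i ? " " : "", cmd[i]);
        if (n < 0 || (size_t)n >= size - used)
            return -1;              /* output would have been truncated */
        used += (size_t)n;
    }
    return 0;
}

/* Print the subcommand to stderr, then replace this process with it. */
int trace_and_exec(char *const cmd[])
{
    char line[4096];

    if (cmd[0] == NULL)
        return 2;                   /* nothing to run */
    if (format_command(cmd, line, sizeof line) == 0)
        fprintf(stderr, "[trace] %s\n", line);
    execvp(cmd[0], cmd);
    perror(cmd[0]);                 /* reached only if execvp failed */
    return 127;
}

/* The wrapper's main() would simply be:
   int main(int argc, char *argv[]) { (void)argc; return trace_and_exec(argv + 1); } */
```

Built into a binary called, say, trace (a hypothetical name), it could be used as gcc -c t.c -wrapper ./trace. Unlike the gdb case, the subcommands still run normally; they are just logged first.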
During compilation gcc invokes some other programs (actual assembler, linker etc), and with -wrapper flag they are invoked within said wrapper. In your example, all subcommands are executed within gdb, which is useful for debugging gcc.
Another example: to get a list of all invoked subcommands, one can wrap them in echo (of course, they are not actually executed this way):
$ gcc 1.c -wrapper echo
/usr/lib/gcc/x86_64-linux-gnu/4.6/cc1 -quiet -imultilib . -imultiarch x86_64-linux-gnu 1.c -quiet -dumpbase 1.c -mtune=generic -march=x86-64 -auxbase 1 -fstack-protector -o /tmp/cc7cQrsT.s
as --64 -o /tmp/ccaLYkv9.o /tmp/cc7cQrsT.s
/usr/lib/gcc/x86_64-linux-gnu/4.6/collect2 --sysroot=/ --build-id --no-add-needed --as-needed --eh-frame-hdr -m elf_x86_64 --hash-style=gnu -dynamic-linker /lib64/ld-linux-x86-64.so.2 -z relro /usr/lib/gcc/x86_64-linux-gnu/4.6/../../../x86_64-linux-gnu/crt1.o /usr/lib/gcc/x86_64-linux-gnu/4.6/../../../x86_64-linux-gnu/crti.o /usr/lib/gcc/x86_64-linux-gnu/4.6/crtbegin.o -L/usr/lib/gcc/x86_64-linux-gnu/4.6 -L/usr/lib/gcc/x86_64-linux-gnu/4.6/../../../x86_64-linux-gnu -L/usr/lib/gcc/x86_64-linux-gnu/4.6/../../../../lib -L/lib/x86_64-linux-gnu -L/lib/../lib -L/usr/lib/x86_64-linux-gnu -L/usr/lib/../lib -L/usr/lib/gcc/x86_64-linux-gnu/4.6/../../.. /tmp/ccaLYkv9.o -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s --no-as-needed /usr/lib/gcc/x86_64-linux-gnu/4.6/crtend.o /usr/lib/gcc/x86_64-linux-gnu/4.6/../../../x86_64-linux-gnu/crtn.o
You could have tried it on a simple hello world.
gcc will call different subcommands. Each of these subcommands will be prefixed with the wrapper. Giving gdb as a wrapper means that you want to debug the compiler.
I am reading a Makefile from someone else as follows.
LDFLAGS=-lm -ljpeg -lpng
ifeq ($(DEBUG),yes)
	OPTIMIZE_FLAG = -ggdb3 -DDEBUG -fno-omit-frame-pointer
else
	OPTIMIZE_FLAG = -ggdb3 -O3
endif
CXXFLAGS = -Wall $(OPTIMIZE_FLAG)
all: test
test: test.o source1.o source2.o
	$(CXX) $(CXXFLAGS) -o $@ $^ $(LDFLAGS)
Makefile.depend: *.h *.cc Makefile
	$(CC) -M *.cc > Makefile.depend
clean:
	\rm -f test *.o Makefile.depend
-include Makefile.depend
Here are my questions:
Although it is not shown explicitly, is $(CXXFLAGS) used by the implicit rule (absent from this Makefile) that compiles the sources into object files?
I am also wondering why $(CXXFLAGS) appears in the link stage. I thought it was only for the compilation stage. Can I remove $(CXXFLAGS) from "$(CXX) $(CXXFLAGS) -o $@ $^ $(LDFLAGS)"? If I am wrong, does that mean g++ also generates debugging info and optimizes at link time?
Why use -ggdb3 -O3 together in the non-debugging build? What would the purpose be? If the goal is merely speed, isn't using -O3 alone better?
For debugging, how does using -ggdb3 -fno-omit-frame-pointer together do better than -ggdb3 alone? I tried to understand the purpose of -fno-omit-frame-pointer by reading the gcc documentation but am still confused.
Can I move "-include Makefile.depend" to just under "$(CC) -M *.cc > Makefile.depend" and above "clean"? Does its position in the Makefile matter?
CXXFLAGS is used by the implicit rule. Run make -n -p to see the complete list of variables and rules that Make generates from its internal rule sets and the makefile.
-g debugging options are only used at the compile stage. -O options are used at the compile and link stages, potentially. You can use both together. From my gcc man page:
GCC allows you to use -g with -O. The shortcuts taken by optimized code may occasionally produce surprising results: some variables you declared may not exist at all; flow of control may briefly move where you did not expect it; some statements may not be executed because they compute constant results or their values were already at hand; some statements may execute in different places because they were moved out of loops.
Nevertheless it proves possible to debug optimized output. This makes it reasonable to use the optimizer for programs that might have bugs.
-fno-omit-frame-pointer actually reduces the optimization. The optimization can interfere with debugging and the person who wrote this makefile clearly intended to run the debugger on optimized code. If it's unclear to you from the gcc manual how it works, you might want to sit down some Saturday with the Intel or AMD architecture reference manual and learn about function calls and argument passing at the instruction level. (Then again, maybe not. :) )
Position matters in Makefiles. I would leave that include file at the end. To do otherwise risks breaking the include file dependency checking.