Performance loss with gcc linker optimization flag -flto

I am working on an ARM Cortex-A15 and compiling with GCC (actually integrating it with TI's SYS/BIOS using XDC tools...)
After I enable the -flto flag, I see a performance loss of about 30%, which is significant. I am running simple benchmarks such as pi and prime number calculation, as well as system-dependent procedural tests.
Below are my compile and link flags. Is a slowdown of this size possible without anything being wrong? Is there a likely cause? From what I have found online, there are benchmarks showing that -flto may not improve performance, but I have not seen a loss like this...
# Compile options.
C_OPTS = -w\
-mcpu=cortex-a15 \
-mtune=cortex-a15 \
-mabi=aapcs \
-mapcs \
-mfpu=neon \
-mfloat-abi=hard \
-O3 \
-flto \
-fno-strict-aliasing \
-fno-delete-null-pointer-checks \
-fno-strict-overflow \
# Linker options.
L_OPTS = -nostartfiles \
-static \
-Wl,--gc-sections \
-Wl,-Map,$(BUILD_DIR)/$(NAME).map \
-mfloat-abi=hard \
-e wbcd_ep \
-flto \
-fuse-linker-plugin \

Related

How is this gcc instruction being interpreted?

I downloaded libopencm3 (https://github.com/libopencm3/libopencm3) library and compiled it. It worked. I found a small project that uses this library and copied the instructions from its makefile.
all:
arm-none-eabi-gcc \
-Os \
-ggdb3 \
-mthumb \
-mcpu=cortex-m0 \
-msoft-float \
-Wall \
-Wextra \
-Wundef \
-Wshadow \
-Wredundant-decls \
-fno-common \
-ffunction-sections \
-fdata-sections \
-std=c11 \
-MD \
-DSTM32F0 \
-I./libopencm3/include \
-o main.o \
-c main.c
arm-none-eabi-gcc \
--static \
-nostartfiles \
-Tstm32f0.ld \
-mthumb \
-mcpu=cortex-m0 \
-msoft-float \
-ggdb3 \
-Wl,-Map=main.map \
-Wl,--cref \
-Wl,--gc-sections \
-L./libopencm3/lib \
main.o \
-lopencm3_stm32f0 \
-Wl,--start-group -lc -lgcc -lnosys -Wl,--end-group \
-o main.elf
I created a folder for the project, pasted libopencm3 folder inside it and compiled. It's working, but I don't understand how this part works:
-L./libopencm3/lib main.o -lopencm3_stm32f0
If I am right, it tells the linker to find the opencm3_stm32f0 library inside ./libopencm3/lib, but inside that folder I found libopencm3_stm32f0.a instead.
I want to know why the name was changed and the extension omitted, and why it still worked.
This isn't related to makefiles or to any specific library, or to embedded systems, so those tags on your question are not needed.
If you look up the -l option in the documentation of the linker, you'll understand how it works.
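As a rough illustration of what the linker documentation describes: for -lNAME, the linker walks the -L directories in order and looks for a file called libNAME.so, then libNAME.a, so the "lib" prefix and the extension are added by the linker itself. The sketch below mimics that lookup in plain shell. resolve_lib is a hypothetical helper written for this example, not a real tool, and it ignores details such as -static (which restricts the search to .a files) and the toolchain's default search directories.

```shell
#!/bin/sh
# Hypothetical helper (illustration only): emulate how "-lNAME" is resolved.
# For each directory given (standing in for the -L paths, in order), look for
# libNAME.so first and then libNAME.a, and print the first match.
resolve_lib() {
    name=$1
    shift
    for dir in "$@"; do
        for ext in so a; do
            candidate="$dir/lib$name.$ext"
            if [ -f "$candidate" ]; then
                printf '%s\n' "$candidate"
                return 0
            fi
        done
    done
    return 1
}

# Demo: a throwaway directory stands in for ./libopencm3/lib.
demo_dir=$(mktemp -d)
: > "$demo_dir/libopencm3_stm32f0.a"
resolve_lib opencm3_stm32f0 "$demo_dir"   # prints the path ending in libopencm3_stm32f0.a
rm -rf "$demo_dir"
```

So -L./libopencm3/lib -lopencm3_stm32f0 ends up at ./libopencm3/lib/libopencm3_stm32f0.a, which is the file found in that folder.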

linking error interfacing ocaml for windows with external c library

I am developing a project in ocaml that requires me to interface it with the OGDF external c++ library. It is all up and running on my mac, but now I am trying to create a windows version using Ocaml for Windows (https://fdopen.github.io/opam-repository-mingw/), the MinGW Cygwin port of Ocaml. In this version, I can interface ocaml with c code and it works fine, but as soon as I try to include an external library in that c code I get errors from the linker, which is flexdll (https://github.com/alainfrisch/flexdll) in this case. The linker says it cannot resolve symbols for _Unwind_Resume and __emutls_get_address throughout the library.
Here is a toy example:
My .ml file t.ml:
external print : unit -> unit = "print"
let () =
Printf.printf "platform: %s\n" (Sys.os_type);
print ()
My .cpp file tc.cpp:
#include <stdio.h>
#include "caml/mlvalues.h"
#define CAML_NAME_SPACE
//#include <ogdf/basic/Graph.h>
extern "C" value print(value unused) {
printf("hello from C\n");
return Val_unit;
}
My makefile:
t.exe: t.ml tc.o
ocamlopt -verbose -ccopt -pthread \
-cclib -lstdc++ -w s \
-ccopt -L../cdeg/ogdf/_release \
-cclib -lOGDF \
tc.o t.ml \
-o t.exe
tc.o: tc.cpp
x86_64-w64-mingw32-gcc -c \
-march=x86-64 -mtune=generic -O2 -mms-bitfields -Wall -Wno-unused \
tc.cpp \
-I../cdeg/ogdf -L../cdeg/ogdf/_release -lOGDF \
-I ~/.opam/4.04.0+mingw64c/lib/ocaml \
-lstdc++ -pthread -o tc.o
Like this, it all compiles happily, but if I uncomment the ogdf include line in tc.cpp, I get the following output:
$ make
x86_64-w64-mingw32-gcc -c \
-march=x86-64 -mtune=generic -O2 -mms-bitfields -Wall -Wno-unused \
tc.cpp \
-I../cdeg/ogdf -L../cdeg/ogdf/_release -lOGDF \
-I ~/.opam/4.04.0+mingw64c/lib/ocaml \
-lstdc++ -pthread -o tc.o
ocamlopt -verbose -ccopt -pthread \
-cclib -lstdc++ -w s \
-ccopt -L../cdeg/ogdf/_release \
-cclib -lOGDF \
tc.o t.ml \
-o t.exe
+ x86_64-w64-mingw32-as -o "t.o" "C:\OCaml64\tmp\camlasme5f9bd.s"
+ x86_64-w64-mingw32-as -o "C:\OCaml64\tmp\camlstartupf2b3f1.o" "C:\OCaml64\tmp\camlstartup101e51.s"
+ flexlink -chain mingw64 -stack 33554432 -exe -o "t.exe" "-LC:/OCaml64/home/Nathaniel.Miller/.opam/4.04.0+mingw64c/lib/ocaml" -pthread -L../cdeg/ogdf/_release "C:\OCaml64\tmp\camlstartupf2b3f1.o" "C:/OCaml64/home/Nathaniel.Miller/.opam/4.04.0+mingw64c/lib/ocaml\std_exit.o" "t.o" "C:/OCaml64/home/Nathaniel.Miller/.opam/4.04.0+mingw64c/lib/ocaml\stdlib.a" "-lstdc++" "-lOGDF" "tc.o" "C:/OCaml64/home/Nathaniel.Miller/.opam/4.04.0+mingw64c/lib/ocaml\libasmrun.a" -lws2_32
** Cannot resolve symbols for ../cdeg/ogdf/_release\libOGDF.a(PoolMemoryAllocator.o/
PreprocessorLayout.o/
extended_graph_alg.o/
graph_generators.o/
random_hierarchy.o/
simple_graph_alg.o/
CPlanarEdgeInserter.o/
... [a bunch of other .o files from the library]...
UpwardPlanarModule.o/
UpwardPlanarSubgraphModule.o/
UpwardPlanarSubgraphSimple.o/
VisibilityLayout.o/
):
_Unwind_Resume
__emutls_get_address
** Cannot resolve symbols for ../cdeg/ogdf/_release\libOGDF.a(basic.o):
_Unwind_Resume
File "caml_startup", line 1:
Error: Error during linking
make: *** [makefile:20: t.exe] Error 2
If I don't connect it to ocaml, but instead add a main() function to tc.cpp, it compiles just fine under x86_64-w64-mingw32-gcc with the external library included. I've tried including a few other small external libraries and they didn't cause this problem.
My first thought was that maybe the problem had to do with the linked files not all being compiled the same way, but I compiled the library and the .cpp file with the compiler and options given by ocamlopt -configure. And if they weren't all compiled the same way, I wouldn't expect to be able to get tc.cpp to work individually with ocamlopt and with the external library, but I only get errors when I try to use both. So is this an issue with Ocaml for windows, or flexdll, or with my installation of one of these? I'm at a loss for what to try next, and any ideas, suggestions, and/or explanations of what is going on here would be very much appreciated.
I have a partial answer. The issue is coming somehow from flexdll. I switched to using the Cygwin version of ocaml with gcc, and still had the same problem. Then I recompiled ocaml configured with the -no-shared-libs flag, which makes ocamlopt link with gcc instead of flexdll, and now everything compiles.

make/gcc/clang looking for file with a blank filename, gives error

I have OSX 10.11.4, Xcode 7.3.1.
Running make, which calls gcc, which in turn calls clang, I get an error in which clang looks for a file whose name is a single space:
Make error 1: clang: error: no such file or directory: ' '.
I have no idea how to fix this. The makefile formatting looks correct to me.
Here is the end of the output from make:
gcc -g -v -Wall -I/usr/local/include -I/opt/local/include -I/Users/m/BioPrep \
-o mod \
../mshell/runit0.o \
../mshell/tline.o \
../mshell/getshm.o \
../mshell/callLSODA.o \
../mshell/extras.o \
../mshell/nrutil.o \
../mshell/exten.o \
../choosedisp/choosedisp_main.o \
../choosedisp/choosedisp_cb.o \
../choosedisp/choosedisp_fm.o \
../connectdisps/connectdisps.o \
../connectdisps/opwsock.o \
Apple LLVM version 7.3.0 (clang-703.0.31)
Target: x86_64-apple-darwin15.4.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
clang: error: no such file or directory: ' '
make: *** [mod] Error 1
======================= Here is the contents of the Makefile:
# This is Makefile with all graphics code removed so that a compilation of mod can proceed
# "MakefileA" has the graphics files present but commented (#) out
CC = gcc
CFLAGS = -g -v -Wall
INCL = -I/usr/local/include -I/opt/local/include -I/Users/prowat/BioPrep
LIBS = -L/usr/local/lib -L/opt/local/lib -lX11 -lforms
mod: model.o \
../mshell/runit0.o \
../mshell/tline.o \
../mshell/getshm.o \
../mshell/callLSODA.o \
../mshell/extras.o \
../mshell/nrutil.o \
../mshell/exten.o \
../choosedisp/choosedisp_main.o \
../choosedisp/choosedisp_cb.o \
../choosedisp/choosedisp_fm.o \
../connectdisps/connectdisps.o \
../connectdisps/opwsock.o \
../lsoda/liblsoda.a
cd ../mshell; make objs
cd ../choosedisp; make objs
cd ../connectdisps; make objs
$(CC) $(CFLAGS) $(INCL) \
-o mod \
../mshell/runit0.o \
../mshell/tline.o \
../mshell/getshm.o \
../mshell/callLSODA.o \
../mshell/extras.o \
../mshell/nrutil.o \
../mshell/exten.o \
../choosedisp/choosedisp_main.o \
../choosedisp/choosedisp_cb.o \
../choosedisp/choosedisp_fm.o \
../connectdisps/connectdisps.o \
../connectdisps/opwsock.o \
model.o \
-L../lsoda -llsoda \
$(LIBS) \
-lm
=====================
Please use proper formatting: for code blocks, indent by 4 spaces. The backticks are only used for fixed-width fonts inside normal text.
You may have looked for spaces, but you need to look again. Based on the output make has shown you versus your makefile, it's pretty clear that there is at least one space after the backslash at the end of this line:
../connectdisps/opwsock.o \
You can tell this because if there were no spaces after this backslash, then the command would continue on the next line and make would show the rest of the compile line:
model.o \
-L../lsoda -llsoda \
$(LIBS) \
-lm
Since those lines are missing from the output make provided, you can be sure that there's something about the opwsock.o line which is preventing make from recognizing the backslash/newline at the end.
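Trailing whitespace like this is hard to see in an editor, so it can help to search for it mechanically. The following is a minimal sketch, assuming a POSIX grep. find_broken_continuations is a hypothetical helper name, and the demo file stands in for the real Makefile.

```shell
#!/bin/sh
# Hypothetical helper (illustration only): print, with line numbers, every
# line in which a backslash is followed by trailing spaces, tabs or a stray
# carriage return, i.e. a continuation that make will not treat as one.
find_broken_continuations() {
    grep -n '\\[[:space:]][[:space:]]*$' "$1"
}

# Demo on a throwaway file: only the second line has a space after the backslash.
demo=$(mktemp)
printf '%s\n' 'a.o \' 'b.o \ ' 'c.o' > "$demo"
find_broken_continuations "$demo"
rm -f "$demo"
```

Running find_broken_continuations Makefile in the project directory should list the opwsock.o line if the diagnosis above is right. Deleting the characters after the backslash restores the continuation.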

How do I compile VASP 5.3.5 on Cray XC?

I need to compile VASP 5.3.5 on a Cray XC MPP machine. The system has the GNU, Intel and Cray compiling environments available. There is also a choice of Intel MKL or Cray LibSci for BLAS, LAPACK, ScaLAPACK.
Which is the best compiler to use, the best numerical library to use and the Makefile options to use?
Tests on ARCHER (http://www.archer.ac.uk) have shown that using the Intel compiler with Intel MKL and FFTW produces the best performance and the most stable build of VASP 5.3.5 on the Cray XC30 system.
Full compilation instructions can be found at:
http://www.archer.ac.uk/documentation/software/vasp/compiling_5-3-5-phase2.php
Briefly, the procedure is:
module swap PrgEnv-cray PrgEnv-intel
module load fftw
module load cray-pe-hugepages2M
Modify the library makefile to have the following options:
CPP = gcc -E -P -C $*.F >$*.f
FC=ftn
CFLAGS = -O3
FFLAGS = -O3 -unroll -ip -no-prec-div -xAVX
FREE = -free
Build the library (assuming makefile is called "makefile.cray_xc_intel.lib"):
cd vasp.5.lib
make -f makefile.cray_xc_intel.lib
Move to the main source code directory:
cd ../vasp.5.3
Set up the preprocessor options in the Makefile (this is for the multiple K-points version):
CPP = $(CPP_) -DMPI -DHOST=\"CrayXC-Intel\" \
-DNGZhalf \
-DLONGCHAR \
-Dkind8 \
-DCACHE_SIZE=2000 \
-Davoidalloc \
-DRPROMU_DGEMV \
-DMPI_BLOCK=100000 \
-Duse_collective \
-Drandom_array \
-DscaLAPACK
Set the makefile compilation options:
FC=ftn
FCL=$(FC)
CPP_ = ./preprocess <$*.F | cpp -P -C -traditional >$*$(SUFFIX)
FFLAGS = -free -march=corei7-avx -assume byterecl -m64
OFLAG = -O3 -ip -fno-alias -unroll-aggressive -opt-prefetch -use-intel-optimized-headers -no-prec-div
OFLAG_LOW = -O1 -g -ftz
OBJ_LOW = broyden.o
Set the makefile linear algebra library options for Intel MKL:
MKL_PATH = $(MKLROOT)/lib/intel64
BLAS=
LAPACK=
BLACS=
SCA=
LIB = ../vasp.5.lib/linpack_double.o -L../vasp.5.lib -ldmy \
${MKL_PATH}/libmkl_blas95_lp64.a ${MKL_PATH}/libmkl_lapack95_lp64.a \
${MKL_PATH}/libmkl_scalapack_lp64.a \
-Wl,--start-group ${MKL_PATH}/libmkl_intel_lp64.a \
${MKL_PATH}/libmkl_sequential.a ${MKL_PATH}/libmkl_core.a \
${MKL_PATH}/libmkl_blacs_intelmpi_lp64.a -Wl,--end-group -lpthread -lm
Finally, set the makefile options for linking FFTW:
FFT3D = fftmpiw.o fftmpi_map.o fftw3d.o fft3dlib.o
Now build the code (assuming makefile is called "makefile.cray_xc_intel"):
make -f makefile.cray_xc_intel

How to turn off ANY optimisation flag in GCC

I am trying to understand how to turn off specific optimisation flags when compiling with GCC. I understand that some flags have a -fno option, but most flags don't (from what I have seen). I am trying to compile a program with the -O1 flags but remove one of the flags in -O1 for each compile.
For instance, -fauto-inc-dec does not appear to have an equivalent -fno-auto-inc-dec flag that I could pass as: -O1 -fno-auto-inc-dec.
I want to compile with -O1 but turn off specific options enabled by -O1, to see what difference each one makes.
Any help will be appreciated. I'm new to this, so I'm very much a beginner.
As stated in man gcc:
Most optimizations are only enabled if an -O level is set on
the command line. Otherwise they are disabled,
even if individual optimization flags are specified.
So basically by not passing any -O flags you aren't using configurable optimizations.
Also, -O1 is not the default, -O0 is.
You could also approach it from the opposite direction: disable all optimizations and enable "batches" by hand, i.e. have a look at gcc -Q --help=optimizers, see which optimizations are enabled at which level, and strip those.
To address your concern that -O* options enable flags that aren't listed, I'd say that it's a man-page thing. Actively querying the compiler on a particular architecture should give you an exhaustive list of the optimizations that will be enabled with a particular -O flag, so using -O0 in combination with the list of those flags should produce exactly the same result.
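The query mentioned above can be scripted. This is a minimal sketch, assuming a real GCC is on the PATH (on macOS, gcc is often clang, which does not support this query) and that the output marks active flags with [enabled], as current GCC versions do. enabled_flags is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper (illustration only): read "gcc -Q --help=optimizers"
# output on stdin and print the flags marked [enabled], sorted.
enabled_flags() {
    awk '$2 == "[enabled]" { print $1 }' | sort
}

# List the flags that -O1 enables on top of -O0 for the local gcc.
# Skipped silently when no gcc is installed.
if command -v gcc >/dev/null 2>&1; then
    o0=$(mktemp)
    o1=$(mktemp)
    gcc -Q -O0 --help=optimizers 2>/dev/null | enabled_flags > "$o0"
    gcc -Q -O1 --help=optimizers 2>/dev/null | enabled_flags > "$o1"
    comm -13 "$o0" "$o1"   # lines present only in the -O1 list
    rm -f "$o0" "$o1"
fi
```

GCC's documentation states that most -f options have a negative -fno- form, so each flag printed here can usually be switched off on top of -O1, for example gcc -O1 -fno-auto-inc-dec. Given the man page excerpt quoted above, subtracting flags from -O1 this way is likely a safer experiment than adding them to -O0.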
Why not go the other way round? Turn off all optimization with -O0 and enable them selectively.
Or, if you prefer disabling them one by one, start with:
CFLAGS=-O0 \
-fauto-inc-dec \
-fcompare-elim -fcprop-registers \
-fdce -fdefer-pop -fdelayed-branch -fdse \
-fguess-branch-probability \
-fif-conversion2 -fif-conversion \
-fipa-pure-const -fipa-profile -fipa-reference \
-fmerge-constants \
-fsplit-wide-types \
-ftree-bit-ccp -ftree-builtin-call-dce -ftree-ccp -ftree-ch \
-ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse \
-ftree-forwprop -ftree-fre -ftree-phiprop -ftree-slsr -ftree-sra \
-ftree-pta -ftree-ter \
-funit-at-a-time
(btw, all of this information is distilled from man gcc)
