I need to compile VASP 5.3.5 on a Cray XC MPP machine. The system has the GNU, Intel and Cray compiling environments available. There is also a choice of Intel MKL or Cray LibSci for BLAS, LAPACK and ScaLAPACK.
Which compiler, numerical library and Makefile options should I use?
Tests on ARCHER (http://www.archer.ac.uk) have shown that using the Intel compiler with Intel MKL and FFTW produces the best performance and the most stable build of VASP 5.3.5 on the Cray XC30 system.
Full compilation instructions can be found at:
http://www.archer.ac.uk/documentation/software/vasp/compiling_5-3-5-phase2.php
Briefly, the procedure is:
module swap PrgEnv-cray PrgEnv-intel
module load fftw
module load cray-pe-hugepages2M
Modify the library makefile to have the following options:
CPP = gcc -E -P -C $*.F >$*.f
FC=ftn
CFLAGS = -O3
FFLAGS = -O3 -unroll -ip -no-prec-div -xAVX
FREE = -free
Build the library (assuming makefile is called "makefile.cray_xc_intel.lib"):
cd vasp.5.lib
make -f makefile.cray_xc_intel.lib
Move to the main source code directory:
cd ../vasp.5.3
Set up the preprocessor options in the Makefile (this is for the multiple K-points version):
CPP = $(CPP_) -DMPI -DHOST=\"CrayXC-Intel\" \
-DNGZhalf \
-DLONGCHAR \
-Dkind8 \
-DCACHE_SIZE=2000 \
-Davoidalloc \
-DRPROMU_DGEMV \
-DMPI_BLOCK=100000 \
-Duse_collective \
-Drandom_array \
-DscaLAPACK
Set the makefile compilation options:
FC=ftn
FCL=$(FC)
CPP_ = ./preprocess <$*.F | cpp -P -C -traditional >$*$(SUFFIX)
FFLAGS = -free -march=corei7-avx -assume byterecl -m64
OFLAG = -O3 -ip -fno-alias -unroll-aggressive -opt-prefetch -use-intel-optimized-headers -no-prec-div
OFLAG_LOW = -O1 -g -ftz
OBJ_LOW = broyden.o
Set the makefile linear algebra library options for Intel MKL:
MKL_PATH = $(MKLROOT)/lib/intel64
BLAS=
LAPACK=
BLACS=
SCA=
LIB = ../vasp.5.lib/linpack_double.o -L../vasp.5.lib -ldmy \
${MKL_PATH}/libmkl_blas95_lp64.a ${MKL_PATH}/libmkl_lapack95_lp64.a \
${MKL_PATH}/libmkl_scalapack_lp64.a \
-Wl,--start-group ${MKL_PATH}/libmkl_intel_lp64.a \
${MKL_PATH}/libmkl_sequential.a ${MKL_PATH}/libmkl_core.a \
${MKL_PATH}/libmkl_blacs_intelmpi_lp64.a -Wl,--end-group -lpthread -lm
Finally, set the makefile options for linking FFTW:
FFT3D = fftmpiw.o fftmpi_map.o fftw3d.o fft3dlib.o
Now build the code (assuming makefile is called "makefile.cray_xc_intel"):
make -f makefile.cray_xc_intel
In short
What are cmake's web of makefiles doing differently from a simple compile and link that is making a difference in the final executable?
I'm trying to use the bullet physics library (bullet3-2.83.7) https://github.com/bulletphysics/bullet3.
I compiled the library okay with few warnings in MSYS2 with Mingw-w64.
Afterwards I can run the example programs without problems, specifically ExampleBrowser and HelloWorld.
I've been trying to incorporate the HelloWorld source into a test project using just a Makefile, but I get SIGSEGV errors whenever there is a call on dynamicsWorld in the executable. The segfault occurs at the line dynamicsWorld->addRigidBody(body); or, if those lines are commented out, at dynamicsWorld->stepSimulation.
This occurs with the exact example source file compiled with the makefile (source not modified).
gdb tells me this
main (argc=1, argv=0x5f4eb0) at main.cpp:78
78 dynamicsWorld->addRigidBody(body);
(gdb) step
0x0000000000002000 in ?? ()
(gdb) step
Cannot find bounds of current function
(gdb) bt full
#0 0x0000000000002000 in ?? ()
No symbol table info available.
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
I don't know what to do with this info. I assume/hope I'm just missing a compiler or linker flag?
My original Makefile is a big mess based off http://make.mad-scientist.net/papers/advanced-auto-dependency-generation. I assumed it would be enough to just use the existing makefile on the example code by adding the libraries and include directory -lBulletDynamics_Debug -lBulletCollision_Debug -lLinearMath_Debug
I've also tried a simplified Makefile with commands and flags I found grepping the CMake directories from bullet3/examples/HelloWorld.
My PATH environment variable is clean, nothing in LD_LIBRARY_PATH (In MSYS: echo $PATH)
MSYS2 Mingw-w64
gcc 10.1.0
MSYS 20180531msys64 ? pacman updated a lot of things
CMake 3.17.3
GNU Make 4.3
Makefile
CXX_DEFINES = -DUSE_GRAPHICAL_BENCHMARK -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_SCL_SECURE_NO_WARNINGS
CXX_INCLUDES = -I"C:\lib64\include\bullet3"
CXX_FLAGS = -g -fpermissive -D_DEBUG
.PHONY: all
all:
g++.exe $(CXX_DEFINES) $(CXX_INCLUDES) $(CXX_FLAGS) -o main.obj -c main.cpp
ar cr main.a main.obj
g++.exe $(CXX_FLAGS) -Wl,--whole-archive main.a -Wl,--no-whole-archive -o bulletTest.exe -Wl,--major-image-version,0,--minor-image-version,0 libBulletDynamics_Debug.a libBulletCollision_Debug.a libLinearMath_Debug.a -lkernel32 -luser32 -lgdi32 -lwinspool -lshell32 -lole32 -loleaut32 -luuid -lcomdlg32 -ladvapi32
# -L"C:\lib64\lib"
.PHONY: run
run:
gdb -ex run bulletTest.exe -ex "bt full" -ex quit --batch
.PHONY: clean
clean:
rm -f ./bulletTest.exe ./main.obj ./main.a
Building the bullet physics library in MSYS2
In the bullet3-2.83.7 directory (tar.gz from https://github.com/bulletphysics/bullet3/releases)
mkdir build-mingw64
cd build-mingw64
cmake -G "MSYS Makefiles" \
-DBUILD_SHARED_LIBS=0 \
-DBUILD_EXTRAS=1 \
-DINSTALL_LIBS=0 \
-DUSE_GLUT=1 \
-DCMAKE_CXX_FLAGS_DEBUG="-fpermissive -g" \
-DINSTALL_EXTRA_LIBS=0 \
-DCMAKE_BUILD_TYPE=Debug ..
make -j
I ran into the same issue. In my case, it was because Bullet was compiled with USE_DOUBLE_PRECISION, so adding the following to the CMakeLists.txt for my executable fixed the issue for me:
target_compile_options(<target_name> BEFORE PUBLIC -DBT_USE_DOUBLE_PRECISION)
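For a plain Makefile like the one in the question, the equivalent would be adding -DBT_USE_DOUBLE_PRECISION to the compile flags, but only if the library really was built in double precision. Here is a minimal sketch for checking that, assuming the Bullet build directory still contains its CMakeCache.txt and that the option is recorded there as a BOOL entry named USE_DOUBLE_PRECISION. The function name bullet_precision_define is made up for this example.

```shell
#!/bin/sh
# Print the define an application needs so that its btScalar type matches
# the one the Bullet libraries were compiled with.
# Assumption: the cache stores the option as USE_DOUBLE_PRECISION:BOOL=<value>.
bullet_precision_define() {
    cache_file="$1"
    if [ ! -f "$cache_file" ]; then
        echo "no CMakeCache.txt at $cache_file" >&2
        return 1
    fi
    # Accept the common spellings CMake treats as true for a BOOL option.
    if grep -Eq '^USE_DOUBLE_PRECISION:BOOL=(ON|1|TRUE|YES)$' "$cache_file"; then
        echo "-DBT_USE_DOUBLE_PRECISION"
    fi
    # Prints nothing when the library is a single-precision build.
}

# Example (not run here):
#   bullet_precision_define build-mingw64/CMakeCache.txt
```

Whatever it prints can be appended to CXX_DEFINES. If it prints nothing, the precision setting is probably not the cause and the mismatch is somewhere else.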
I am developing a project in OCaml that requires me to interface it with the external C++ library OGDF. It is all up and running on my Mac, but now I am trying to create a Windows version using OCaml for Windows (https://fdopen.github.io/opam-repository-mingw/), the MinGW Cygwin port of OCaml. In this version, I can interface OCaml with C code and it works fine, but as soon as I try to include an external library in that C code I get errors from the linker, which is flexdll (https://github.com/alainfrisch/flexdll) in this case. The linker says it cannot resolve the symbols _Unwind_Resume and __emutls_get_address throughout the library.
Here is a toy example:
My .ml file t.ml:
external print : unit -> unit = "print"
let () =
Printf.printf "platform: %s\n" (Sys.os_type);
print ()
My .cpp file tc.cpp:
#include <stdio.h>
#include "caml/mlvalues.h"
#define CAML_NAME_SPACE
//#include <ogdf/basic/Graph.h>
extern "C" value print(value unused) {
printf("hello from C\n");
return Val_unit;
}
My makefile:
t.exe: t.ml tc.o
ocamlopt -verbose -ccopt -pthread \
-cclib -lstdc++ -w s \
-ccopt -L../cdeg/ogdf/_release \
-cclib -lOGDF \
tc.o t.ml \
-o t.exe
tc.o: tc.cpp
x86_64-w64-mingw32-gcc -c \
-march=x86-64 -mtune=generic -O2 -mms-bitfields -Wall -Wno-unused \
tc.cpp \
-I../cdeg/ogdf -L../cdeg/ogdf/_release -lOGDF \
-I ~/.opam/4.04.0+mingw64c/lib/ocaml \
-lstdc++ -pthread -o tc.o
Like this, it all compiles happily, but if I uncomment the ogdf include line in tc.cpp, I get the following output:
$ make
x86_64-w64-mingw32-gcc -c \
-march=x86-64 -mtune=generic -O2 -mms-bitfields -Wall -Wno-unused \
tc.cpp \
-I../cdeg/ogdf -L../cdeg/ogdf/_release -lOGDF \
-I ~/.opam/4.04.0+mingw64c/lib/ocaml \
-lstdc++ -pthread -o tc.o
ocamlopt -verbose -ccopt -pthread \
-cclib -lstdc++ -w s \
-ccopt -L../cdeg/ogdf/_release \
-cclib -lOGDF \
tc.o t.ml \
-o t.exe
+ x86_64-w64-mingw32-as -o "t.o" "C:\OCaml64\tmp\camlasme5f9bd.s"
+ x86_64-w64-mingw32-as -o "C:\OCaml64\tmp\camlstartupf2b3f1.o" "C:\OCaml64\tmp\camlstartup101e51.s"
+ flexlink -chain mingw64 -stack 33554432 -exe -o "t.exe" "-LC:/OCaml64/home/Nathaniel.Miller/.opam/4.04.0+mingw64c/lib/ocaml" -pthread -L../cdeg/ogdf/_release "C:\OCaml64\tmp\camlstartupf2b3f1.o" "C:/OCaml64/home/Nathaniel.Miller/.opam/4.04.0+mingw64c/lib/ocaml\std_exit.o" "t.o" "C:/OCaml64/home/Nathaniel.Miller/.opam/4.04.0+mingw64c/lib/ocaml\stdlib.a" "-lstdc++" "-lOGDF" "tc.o" "C:/OCaml64/home/Nathaniel.Miller/.opam/4.04.0+mingw64c/lib/ocaml\libasmrun.a" -lws2_32
** Cannot resolve symbols for ../cdeg/ogdf/_release\libOGDF.a(PoolMemoryAllocator.o/
PreprocessorLayout.o/
extended_graph_alg.o/
graph_generators.o/
random_hierarchy.o/
simple_graph_alg.o/
CPlanarEdgeInserter.o/
... [a bunch of other .o files from the library]...
UpwardPlanarModule.o/
UpwardPlanarSubgraphModule.o/
UpwardPlanarSubgraphSimple.o/
VisibilityLayout.o/
):
_Unwind_Resume
__emutls_get_address
** Cannot resolve symbols for ../cdeg/ogdf/_release\libOGDF.a(basic.o):
_Unwind_Resume
File "caml_startup", line 1:
Error: Error during linking
make: *** [makefile:20: t.exe] Error 2
If I don't connect it to OCaml, but instead add a main() function to tc.cpp, it compiles just fine under x86_64-w64-mingw32-gcc with the external library included. I've tried including a few other small external libraries and they didn't cause this problem.
My first thought was that maybe the problem had to do with the linked files not all being compiled the same way, but I compiled the library and the .cpp file with the compiler and options given by ocamlopt -config. And if they weren't all compiled the same way, I wouldn't expect to be able to get tc.cpp to work individually with ocamlopt and with the external library, but I only get errors when I try to use both. So is this an issue with OCaml for Windows, or flexdll, or with my installation of one of these? I'm at a loss for what to try next, and any ideas, suggestions, and/or explanations of what is going on here would be very much appreciated.
I have a partial answer. The issue is coming somehow from flexdll. I switched to using the Cygwin version of ocaml with gcc, and still had the same problem. Then I recompiled ocaml configured with the -no-shared-libs flag, which makes ocamlopt link with gcc instead of flexdll, and now everything compiles.
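To see which of the two situations a given install is in, something like the sketch below might help. It assumes the link command is visible somewhere in the output of ocamlopt -config, which I have not checked for every OCaml version. The function name uses_flexlink is made up for this example.

```shell
#!/bin/sh
# Read the output of `ocamlopt -config` on stdin and report whether any
# line mentions flexlink. Assumption: a flexdll-based install shows the
# flexlink command somewhere in that output.
uses_flexlink() {
    if grep -q 'flexlink'; then
        echo "flexlink"
    else
        echo "no flexlink"
    fi
}

# Example (not run here):
#   ocamlopt -config | uses_flexlink
```

If it reports "no flexlink" after rebuilding with -no-shared-libs, that is at least consistent with ocamlopt handing the final link to gcc.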
I installed OCaml via OPAM, and by default it uses gcc as the command to compile .c files. For instance, if I run ocamlopt -verbose file.c, I obtain:
+ gcc -Wall -D_FILE_OFFSET_BITS=64 -D_REENTRANT -g
-fno-omit-frame-pointer -c -I'/home/user/.opam/4.02.1+fp/lib/ocaml' 'test.c'
I'd like to change the GCC binary that is used by OCaml, for instance to replace it with gcc-5.1 or /opt/my-gcc/bin/gcc.
Is it possible to do so without reconfiguring and recompiling OCaml? I suppose I could add a gcc alias to a directory in the PATH, but I'd prefer a cleaner solution if there is one.
To check if gcc was not chosen based on a textual configuration file (that I could easily change), I searched for occurrences of gcc in my /home/user/.opam/4.02.1+fp directory, but the only occurrence in a non-binary file that I found was in lib/ocaml/Makefile.config, and changing it does nothing for the already-compiled binary.
ocamlopt uses gcc for three things. First, for compiling .c files that appear on the command line of ocamlopt. Second, for assembling the .s files that it generates internally when compiling an OCaml source file. Third, for linking the object files together at the end.
For the first and third, you can supply a different compiler with the -cc flag.
For the second, you need to rebuild the OCaml compiler.
Update
Here's what I see on OS X when compiling a C and an OCaml module with the -verbose flag:
$ ocamlopt -verbose -cc gcc -o m m.ml c.c 2>&1 | grep -v warning
+ clang -arch x86_64 -c -o 'm.o' \
'/var/folders/w4/1tgxn_s936b148fdgb8l9xv80000gn/T/camlasm461f1b.s' \
+ gcc -c -I'/usr/local/lib/ocaml' 'c.c'
+ clang -arch x86_64 -c -o \
'/var/folders/w4/1tgxn_s936b148fdgb8l9xv80000gn/T/camlstartup695941.o' \
'/var/folders/w4/1tgxn_s936b148fdgb8l9xv80000gn/T/camlstartupb6b001.s'
+ gcc -o 'm' '-L/usr/local/lib/ocaml' \
'/var/folders/w4/1tgxn_s936b148fdgb8l9xv80000gn/T/camlstartup695941.o' \
'/usr/local/lib/ocaml/std_exit.o' 'm.o' \
'/usr/local/lib/ocaml/stdlib.a' 'c.o' \
'/usr/local/lib/ocaml/libasmrun.a'
So, the compiler given by the -cc option is used to do the compilation of the .c file and the final linking. To change the handling of the .s files you need to rebuild the compiler. I'm going to update my answer above.
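Applied to the question, that could look like the sketch below. The wrapper name ocamlopt_cc is made up for this example and the compiler path is the one from the question. The only ocamlopt options it relies on are -cc and -verbose.

```shell
#!/bin/sh
# Run ocamlopt with a chosen C compiler for .c files and the final link.
# The assembler used for the generated .s files is not affected by -cc.
ocamlopt_cc() {
    cc_path="$1"
    shift
    # -verbose echoes the external commands, so the substitution can be checked.
    ocamlopt -cc "$cc_path" -verbose "$@"
}

# Example (not run here):
#   ocamlopt_cc /opt/my-gcc/bin/gcc -o m m.ml c.c
```

In the -verbose output, the lines for the .c file and the final link should start with the chosen compiler, while the lines that assemble the .s files stay as they were.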
I am compiling Fortran code on both my MacBook Pro (10.9.5, 2.3GHz, 8 GB RAM) and my Dell Latitude E6400 (Windows 7, #.26GHz, 4GB RAM). The code runs in about 12.2 minutes on the Mac and 32.6 minutes on the PC. I am using gfortran (4.9.0 on the Mac, 4.10.0 on the PC). My makefile is:
{
# Makefile for NRMMII
F95 = gfortran
#FINCLUD=
#OBJPATH=./objG95/
# .F90 invokes the fpp preprocessor whereas .f90 does not
# the preprocessor expands source code "macros" and symbolic definitions prior to compilation
# it makes no difference with nrmm30
# compiler flags
#FLAGSC = -ffree-form -fall-intrinsics -fcheck=all -m64 -c
FLAGSC = -std=f95 -v -ffree-form -O0 -fall-intrinsics -fcheck=all -march=native -flto \
-funroll-loops -fno-protect-parens -fstack-arrays -c
# link flags
FLAGSL = -g
#-static Only for windows machines
OBJS = module_nrmmdefs.f90 module_nrmmcntc.f90 module_nrmmdevc.f90 \
module_nrmmdrvc.f90 module_nrmmploc.f90 module_nrmmprdc.f90 \
module_nrmmscnc.f90 module_nrmmstsc.f90 module_nrmmsumc.f90 \
module_nrmmterc.f90 module_nrmmtppc.f90 module_nrmmvehc.f90 \
module_nrmmvers.f90 module_nrmmvppc.f90 module_nrmm_global.f90 \
module_nrmmpbuffs.f90 module_nrmmxmlc.f90 \
main.f90 areal.f90 auxout.f90 crlsno.f90 diag.f90 histogram.f90\
input.f90 miscg95.f90 plow.f90 road.f90 summary.f90 terrain.f90\
vpp.f90 spclxml2.f90 nrmmxml.f90
# special files in nrmmii_misc
# module_spclapfc.f90 spclapf.f90 module_spclresc.f90 spclres.f90 \
# spcljody.f90 spclnode.f90 spclnode1.f90 spclnull.f90 spcltrfs.f90\
# module_nrmmxmlc.f90 nrmmxml.f90 spclxml2.f90 spcltrav.f90
# This defines the compile rules
.SUFFIXES: .o .f90
.f90.o:
$(F95) $(FLAGSC) $*.f90
nrmm30:
$(F95) $(FLAGSL) -o $@ $(OBJS)
}
Does anyone have any suggestions as to why the PC version is running so much slower?
I am working on an ARM Cortex-A15 and compiling with GCC (actually integrating it with TI's SYS/BIOS using XDC tools...)
After I enable the -flto flag, I see a performance loss of about 30%, which is significant. I am running simple benchmarks such as pi and prime number calculations, as well as system-dependent procedural tests.
Below are my compile and link flags. Is a slowdown of this size possible without anything being wrong? Is there a likely cause? From what I found on the internet, there are benchmarks where -flto does not improve performance, but I didn't see a loss this large...
# Compile options.
C_OPTS = -w\
-mcpu=cortex-a15 \
-mtune=cortex-a15 \
-mabi=aapcs \
-mapcs \
-mfpu=neon \
-mfloat-abi=hard \
-O3 \
-flto \
-fno-strict-aliasing \
-fno-delete-null-pointer-checks \
-fno-strict-overflow \
# Linker options.
L_OPTS = -nostartfiles \
-static \
-Wl,--gc-sections \
-Wl,-Map,$(BUILD_DIR)/$(NAME).map \
-mfloat-abi=hard \
-e wbcd_ep \
-flto \
-fuse-linker-plugin \