I have what I imagine will be an easy question: how do I create a 64-bit build using ifort? I'm using "ifort -Ofast -o program.exe .f". I've set compilervars to intel64 and am working on Win7 with a Xeon processor. I've looked through the menu of compiler flags but haven't been able to identify what I need. I see there's a -m64 option for Mac users, but that won't help me.
A second question: would there be a big performance difference between a gfortran -m64 build and the same build with ifort?
Thanks!
With ifort, you need to invoke the ifortvars script (ifortvars.bat on Windows; ifortvars.sh or .csh on Linux) with the "intel64" argument to get the x64 compiler. In fact, you are required to specify that argument (either intel64 or ia32), so look at how the script is invoked in your environment and fix the reference. This is not selected with an option on the ifort command line.
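As a rough sketch (the install path below is an assumption, so adjust it to your installation, and myprog.f is a placeholder source file), setting up the 64-bit environment and building on Windows might look like this in a command prompt:
> "C:\Program Files (x86)\Intel\Composer XE\bin\ifortvars.bat" intel64
> ifort -Ofast -o program.exe myprog.f
On Linux the equivalent is to source bin/ifortvars.sh intel64 (or the .csh variant) before invoking ifort.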
As for performance comparisons, I would point you at Polyhedron, an independent software reseller in the UK. They do multi-compiler comparisons on fixed hardware. Click on "Compiler Comparisons" in the left column. In their tests, gfortran is in 5th place (ifort is 1st).
Related
I have a 64-bit system, but gcc is 32-bit, and when I do
>./gcc -c foobar.c
it makes foobar.o, which is 64-bit. OK, but how does it know to do that? Based on what environment setting does it know to produce a 64-bit object, and where is that documented?
Come to think of it, it is strange that it does that, is it not? But the file utility clearly says gcc is 32-bit and foobar.o is 64-bit. (I moved everything to the same directory so it would not be confused.)
I also checked the three dynamically linked libraries that it loads: libc, libm, and libz; they are all 32-bit as well.
To clarify, I don't want to know how to make it produce 32-bit output. I want to know what it is looking at right now that makes it produce 64-bit output. That is my question, not how to force it the other way around.
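For concreteness, the file output being described looks roughly like this (the paths and exact descriptor strings are illustrative):
$ file ./gcc foobar.o
./gcc:    ELF 32-bit LSB executable, Intel 80386, dynamically linked, ...
foobar.o: ELF 64-bit LSB relocatable, x86-64, not stripped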
When GCC is configured, three different systems can be specified:
build: the system where GCC is going to be built (probably not relevant to your question)
host: the system where GCC is going to be executed (32-bit in your case)
target: the system where binaries produced by GCC are going to be executed (64-bit in your case)
You can see how your GCC was configured by running:
gcc -v
and looking for the --build, --host, and --target options in its output.
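As an illustration (the exact output is hypothetical and varies by distribution and version), a 32-bit compiler configured to produce 64-bit objects might report something like:
$ gcc -v
Using built-in specs.
Target: x86_64-linux-gnu
Configured with: ../src/configure --build=i686-linux-gnu --host=i686-linux-gnu --target=x86_64-linux-gnu ...
It is the --target value that determines the architecture of the objects it produces.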
I'm interested in answers, approaches, and out-of-the-box ideas. At a high level, the man page is pretty sparse: it mainly lists -g, with a single level, and suggests that -O0 is also either very helpful or essential.
But I'm wondering what other clang flags can be given to get maximum debugging. Is there an equivalent to gcc's -ggdb3, which includes some of the source or annotations directly in the object output? Or could there be? Is it possible and helpful to recompile the OS and its libraries with debugging enabled (and if so, on Debian, can I have the debugging data written into the main .deb package instead of into a separate .deb package that stores it under /usr/lib/debug)? Will a static build of a binary affect the ability to see a good stack trace? Is there anything that needs to be done to ensure that addr2line works well? Is it necessary to compile all libraries (even glibc) with clang to get the maximum debugging benefit? I note that there is a project to recompile Debian with clang, and I am otherwise open to a distribution that does so or that places an emphasis on debugging.
On Linux there are also options like setting LD_PRELOAD to /lib/libSegFault.so, or a set of LD_LIBRARY_PATH reassignments to /usr/lib/debug instead of the usual /usr/lib location (including redirecting libc itself to the debug version). Is there a central place, or are there external sources, for answers to this question of how to enhance the debuggability of a binary? The bigger mystery is clang: I see in the long gcc man page that there are various options which can increase debugging (or reduce optimisations), but the documentation for clang shows only a smaller set. It's possible that clang will accept more options than it documents, including gcc flags (which may translate either to a no-op or to more debugging - hard to tell without a canonical source of information).
Also, from a package-build perspective, since an external package may not respect CFLAGS, I've redirected /usr/bin/strip to a no-op command that always succeeds; other ideas on ensuring compliance are welcome (I believe pkgsrc does a good job of wrapping gcc and the linker in shell scripts, which is useful for inserting mandatory flags). There may also be various ld options that can be passed to increase the debuggability of the output. It's also quite possible that BSD (including FreeBSD 10, which is based on clang) has a different linking architecture that could make it easier to request and find debug symbols in the generated libraries and executables.
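For reference, the no-op strip mentioned above can be as small as a two-line shell script dropped in place of /usr/bin/strip (with the real binary moved aside); this is only a sketch of that approach:
#!/bin/sh
exit 0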
To take debugging more broadly, I've set LD_WARN=yes, LD_DEBUG=unused, SEGFAULT_SIGNALS="all", LD_PRELOAD=.../libSegFault.so (as mentioned above), and LD_BIND_NOW=yes. I also believe I can make gcc search for libraries in /usr/lib/debug ahead of the standard search paths by using strategic -B options. Using --whole-archive for a static build might also ensure that more objects are included in the linked output. There's also ulimit -c unlimited, and on Linux a nice way to differentiate core files like:
sysctl -w kernel.core_pattern="core.%t.SIG-%s.PID-%p.ID-%g-%u.%h.%E"
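Putting the run-time pieces above together, a single debug-friendly invocation might look like this (./myprog is a placeholder for the binary under test):
$ LD_PRELOAD=/lib/libSegFault.so SEGFAULT_SIGNALS=all LD_BIND_NOW=yes ./myprog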
For gcc I've used and seen flags like: -O0 -fno-omit-frame-pointer -fverbose-asm -ggdb3 -mno-omit-leaf-frame-pointer -mtune=generic -fvar-tracking -D_GLIBCXX_DEBUG=1 -frecord-gcc-switches -femit-class-debug-always -fmath-errno -fno-eliminate-unused-debug-symbols -fno-eliminate-unused-debug-types -fno-merge-debug-strings -mieee-fp -static-libgcc -fexceptions -fbounds-check -rdynamic -UNDEBUG -DDEBUG=1 (-ffreestanding -static-libgcc -pass-exit-codes) and -fno-stack-check (since I believe I've read that stack checking can interfere with debugging).
Other flags are there for other reasons, but the emphasis is meant to be on maximum debugging. With all or most of the above, it's unclear to what extent clang would support or make use of them, or whether there are other options.
Clang does not support the -ggdb3 flag, only -g, as you have noticed. If you try to use it, you'll get the message:
clang: warning: argument unused during compilation: '-ggdb3'
So you can run your entire command line through Clang, and it will tell you which of those GCC flags it supports and which it does not: some will print warnings, others may error out, but Clang will not silently ignore them. Here are the ones that Clang rejected when I tried your long command: -static-libgcc and -pass-exit-codes.
As pointed out in another SO answer, clang -cc1 --help can be used to list the supported compilation flags; among them, the following may be of interest to you:
-disable-llvm-optzns: Don't run LLVM optimization passes
-fno-elide-constructors: Disable C++ copy constructor elision
-mdisable-fp-elim: Disable frame pointer elimination optimization
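Note that -cc1 options are not accepted by the regular clang driver directly; as an illustration (test.c is a placeholder, and availability of the specific flag depends on your Clang version), they can be forwarded with -Xclang:
$ clang -g -O0 -Xclang -disable-llvm-optzns -c test.c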
I was just trying to understand something about cross-compilers, which made me ask this question.
gcc is a cross-compiler.
By default, if no target is specified, the target architecture for gcc compilation is the native machine on which I am compiling the source. Correct?
If the above is correct, then how does it manage to generate code for several different architectures when one is explicitly specified?
Shouldn't it have to know all the ISAs? How is this managed? Does it have the information for all existing ISAs?
A given gcc binary is built for one particular target. Use gcc -v to find out which.
Often, cross-compilers are installed as different commands, e.g. avr-gcc on Debian for the Atmel AVR processor (with specific options ...)
On some architectures and systems (typically x86 & Linux) you may compile for a different variant; see this. In particular, you may want to use -mtune=native, -march=haswell, or -m32 ...
If you build gcc yourself from its source tarball, you'll give it specific options at configure time (e.g. --program-suffix=-avr and --target=avr for avr-gcc, etc.).
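As a rough sketch (the paths, prefix, and exact option set are assumptions, and a real cross build also needs matching binutils and target libraries), configuring such a cross gcc might look like:
$ ../gcc-src/configure --target=avr --program-suffix=-avr --prefix=/opt/cross --enable-languages=c,c++
$ make && make install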
How does one override the default compile flags for Cython when building with distutils?
My question is similar to this one, but the response involved manually running the cython steps. Given the progress from 0.12 to 0.19, is it possible for me to simply switch from -O to -O3?
Also, have users seen a significant difference in speed depending on this switch?
I am on a Windows machine.
If you use a setup.py script, you can set the "extra_compile_args" option (see https://stackoverflow.com/a/16402557/2355197). Depending on your code, you can see significant differences. For example, on GCC, -O3 enables the "-finline-functions" option, which considers all functions for inlining.
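A minimal sketch of what that can look like (the module and file names are placeholders; on Windows with MSVC the equivalent flag would be /O2 rather than -O3):

from distutils.core import setup
from distutils.extension import Extension
from Cython.Build import cythonize

extensions = [
    Extension(
        "mymodule",                  # placeholder module name
        ["mymodule.pyx"],            # placeholder Cython source
        extra_compile_args=["-O3"],  # appended after the default flags; with GCC the last -O option wins
    )
]

setup(ext_modules=cythonize(extensions))

It is then built as usual with "python setup.py build_ext --inplace".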
Davide
Is it (easily) possible to use software floating point on i386 Linux without incurring the expense of trapping into the kernel on each call? I've tried -msoft-float, but it seems the normal (Ubuntu) C libraries don't include an FP library:
$ gcc -m32 -msoft-float -lm -o test test.c
/tmp/cc8RXn8F.o: In function `main':
test.c:(.text+0x39): undefined reference to `__muldf3'
collect2: ld returned 1 exit status
It is surprising that gcc doesn't support this natively as the code is clearly available in the source within a directory called soft-fp. It's possible to compile that library manually:
$ svn co svn://gcc.gnu.org/svn/gcc/trunk/libgcc/ libgcc
$ cd libgcc/soft-fp/
$ gcc -c -O2 -msoft-float -m32 -I../config/arm/ -I.. *.c
$ ar -crv libsoft-fp.a *.o
There are a few C files which don't compile due to errors, but the majority do compile. After copying libsoft-fp.a into the directory with our source files, they now compile fine with -msoft-float:
$ gcc -g -m32 -msoft-float test.c -lsoft-fp -L.
A quick inspection using
$ objdump -D --disassembler-options=intel a.out | less
shows that, as expected, no x87 floating-point instructions are used, and the code runs considerably slower as well: by a factor of 8 in my example, which uses lots of division.
Note: I would've preferred to compile the soft-float library with
$ gcc -c -O2 -msoft-float -m32 -I../config/i386/ -I.. *.c
but that results in loads of error messages like
adddf3.c: In function '__adddf3':
adddf3.c:46: error: unknown register name 'st(1)' in 'asm'
It seems the i386 version is not well maintained, as st(1) refers to one of the x87 registers, which are obviously not available when using -msoft-float.
Strangely, or luckily, the ARM version compiles fine on i386 and seems to work just fine.
Unless you want to bootstrap your entire toolchain by hand, you could start with the uClibc toolchain (the i386 version, I imagine) -- soft float is (AFAIK) not directly supported for "native" compilation on Debian and derivatives, but it can be used via the "embedded" approach of the uClibc toolchain.
GCC does not support this without some extra libraries. From the GCC i386 documentation:
-msoft-float
Generate output containing library calls for floating point. Warning: the requisite libraries are not part of GCC. Normally the facilities of the machine's usual C compiler are used, but this can't be done directly in cross-compilation. You must make your own arrangements to provide suitable library functions for cross-compilation.
On machines where a function returns floating point results in the 80387 register stack, some floating point opcodes may be emitted even if -msoft-float is used.
Also, you cannot set -mfpmath to "none"; it has to be sse, 387, or both.
However, according to this GNU wiki page, there are fp-soft and ieee. There is also SoftFloat.
(For ARM there is -mfloat-abi=softfp, but nothing similar seems to be available for the 386 SX.)
It does not seem like tcc supports software floating point numbers either.
Good luck finding a library that works for you.
G'day,
Unless you're targeting a platform that doesn't have inbuilt FP support, I can't think of a reason why you'd want to emulate FP support.
Doesn't your x386 platform have external FPU support? Pity it's not an x486 with the FPU built in!
In my experience, any soft emulation is bound to be much slower than its hardware equivalent.
That's why I finished up writing a package in Ada to target the onboard 68k FPU instead of using the soft emulation provided by the compiler manufacturer at the time. They finished up bundling it with their compiler, as a matter of fact.
Edit: Just seen your comment below. Hmmm, if you don't need a full suite of FP support, is it possible to roll your own for the few math functions you do need? That's how the Ada package I mentioned got started.
HTH
cheers,