GNU Fortran architecture dependent compiler option - performance

Is there a GNU Fortran compiler (v5.3.0) option to tune the code for a particular architecture? I'm especially interested in Intel Core i7. I could not find anything related to code tuning in the official option summary at GNU Fortran 5.3.0 Option Summary. I remember in the past there used to be an option -march=.... Thank you.
Edit:
I have found out the processor architecture with cat /proc/cpuinfo and visited the Intel CPU Specifications website to find out that I have Sandy Bridge CPUs. In my case the correct GNU option would be -march=sandybridge.

i7 is not an architecture. Sandy Bridge, Ivy Bridge, Haswell and similar are the (micro)architectures of Intel CPUs, and each of them is sold in i3, i5, i7 and Xeon variants.
You can have two i7 CPUs, one older and one more recent, and they can have different architectures.
GCC (the whole suite for C, C++, Fortran...) has the options -march and -mtune (see https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html#x86-Options ). With -march the compiled code may use every instruction of the specified architecture, so it is only guaranteed to run on that architecture and on newer CPUs that support the same instructions; -march also implies the matching -mtune. With -mtune alone the code still runs on older CPUs, but instruction selection and scheduling are optimized for the specified one.
You can use -march=native and the compiler will use the architecture of the CPU it is running on. Or you can specify an architecture manually, like -march=haswell, -march=ivybridge or -march=core-avx-i.
Be aware that you need a recent enough version of the compiler to optimize for new CPU architectures.
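As a rough illustration of the difference between the two options, here is a small shell sketch. The helper name tune_flags and the file name prog.f90 are made up for the example; only -march, -mtune and -Q --help=target are actual GCC options.

```shell
# tune_flags MODE [ARCH]: hypothetical helper, not part of GCC.
#   exclusive: code may use every instruction of ARCH (may not run on older CPUs)
#   tuned:     code is optimized for ARCH but keeps the default instruction set
tune_flags() {
    mode=$1
    arch=${2:-native}
    case "$mode" in
        exclusive) printf '%s\n' "-march=$arch" ;;
        tuned)     printf '%s\n' "-mtune=$arch" ;;
        *)         echo "usage: tune_flags exclusive|tuned [arch]" >&2; return 1 ;;
    esac
}

tune_flags exclusive sandybridge   # prints -march=sandybridge
tune_flags tuned                   # prints -mtune=native

# Typical use (prog.f90 is a placeholder file name):
#   gfortran -O2 $(tune_flags exclusive sandybridge) prog.f90 -o prog
# To see which architecture -march=native selects on the build machine:
#   gcc -march=native -Q --help=target | grep -- '-march='
```

Whether the extra flags actually speed up a given program is something to measure rather than assume.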

All the information you are looking for is in the man page of gcc, not in the man page of gfortran:
man gcc

I assume -march=native does not work?
Edit: I tried a hello world with GCC 5.3 and it does compile with the option, though I don't know whether it improves anything.

Related

What would the optimal march settings be for modern CPUs?

I have seen a similar question on this, but it was specific to P4s and Core2s. What I am looking for is a good setting for most modern CPUs, both AMD and Intel. It seems to me that i686 is a little out of date. I am leaning towards Pentium 4 for the extra instruction sets (SSE etc.). What is the best target that would be compatible with modern CPUs, both Intel and AMD, for either just -march or both -march and -mtune?
I'm currently using GCC 5.3.0 32 bit on Windows 7.

what is target architecture in computer science?

I am a beginner in programming and wanted to download a good C compiler to practice coding. So I thought of GCC and started a little research on it. I read a Wikipedia article on it. The article mentioned something about target architecture, which I do not know. Can anyone tell me what it means, and point me to a source I can refer to for more information? Thanks in advance.
The target architecture is the architecture which the compiler creates binary files for.
Common architectures are: i386 (32-bit x86), x86_64 (64-bit x86), armv7, arm64, etc...
GCC compiles C code (after the preprocessing stage) to assembly code,
and the assembly code varies depending on the CPU architecture.
The assembly code is then "assembled" to a binary file.
Something to keep in mind:
Two binary files are not guaranteed to be compatible across different operating systems even if they share the same architecture.
A program compiled on Ubuntu Linux (let's say for x86_64) won't work on Windows (on the same x86_64 architecture).
GCC identifies architectures by "triplets", like:
x86_64-apple-darwin14.0.0
i386-pc-mingw32
i686-pc-linux-gnu
Format is:
machine-vendor-operatingsystem (not always followed, though)
They contain information on both the architecture and the operating system.
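To make the format concrete, here is a minimal shell sketch that splits a triplet into its three parts. The function name split_triplet is made up for this example, and it assumes the full machine-vendor-operatingsystem form; a triplet that omits the vendor would be split incorrectly.

```shell
# split_triplet TRIPLET: hypothetical helper; assumes machine-vendor-os form.
split_triplet() {
    triplet=$1
    machine=${triplet%%-*}   # text before the first dash
    rest=${triplet#*-}       # text after the first dash
    vendor=${rest%%-*}       # text before the next dash
    os=${rest#*-}            # everything that remains
    printf 'machine=%s vendor=%s os=%s\n' "$machine" "$vendor" "$os"
}

split_triplet i686-pc-linux-gnu
# prints: machine=i686 vendor=pc os=linux-gnu
```

On a machine with GCC installed, gcc -dumpmachine prints the triplet that compiler targets.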

Can I run C program compiled on different ARM processor?

Let's say I compiled C-program on RaspberryPi, can I run this binary on let's say Cubietruck?
How to know for sure that 2 ARM processors are compatible? Are they all compatible between each other?
It should be some easy answer referring instruction set supported by processors, but I can't find any good materials on that.
There are several conditions for that:
Your executable should use the "least common denominator" of all the ARM microarchitectures you wish to support. See gcc's -march=... option for that. Assuming you're running Linux, grep '^model' /proc/cpuinfo should give you that information for each platform.
(related) Some features may not be supported by all your target ARM cores (FPU, NEON, etc...), so be very careful with that.
You should, of course, run the same OS on all supported platforms.
You need to make sure that all supported platforms use the same ABI; ARM has a history of ABI changes, so you must take this into consideration.
If you're lucky enough to target only reasonably modern ARM platforms, you should be able to find some common ground (EABI or Hard Float ABI). Otherwise you probably have no choice but to maintain several versions of your executable.
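One way to check the optional features mentioned above is to look at the Features line that ARM Linux kernels print in /proc/cpuinfo. The sketch below is only an illustration: the function name has_feature and the sample line are made up, and the exact feature names depend on the kernel.

```shell
# has_feature NAME: hypothetical helper; reads cpuinfo-style text on stdin and
# succeeds if NAME is listed as a whole word on a line starting with "Features".
has_feature() {
    grep -i '^Features' | tr ' \t:' '\n\n\n' | grep -qx "$1"
}

# Sample line in the style of an ARM /proc/cpuinfo (made up for this example)
sample='Features : half thumb fastmult vfp edsp neon vfpv3'

printf '%s\n' "$sample" | has_feature neon  && echo "neon: yes"
printf '%s\n' "$sample" | has_feature vfpv4 || echo "vfpv4: no"

# On a real board:
#   has_feature neon < /proc/cpuinfo && echo "NEON available"
```

Running the same check on every target board shows which features are safe to enable in the common build.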

What clang optimization flags should I apply to my OS X 10.5+ universal (i86/x64) binary?

Currently, I am compiling:
clang -Oz -g
But I would like to apply an -mtune and if possible an -march flag, and anything else that will be valid on all Intel architectures that OS X Leopard supports.
Specifically I am asking: which -mtune and -march flag should I specify so that my binary is optimized for 10.5, and will run on all supported Intel processors for 10.5?
In addition, I would like to apply different tunings to the 32 bit and 64 bit portions, is this possible? If so what should I tune the 64 bit portion to?
For bonus points, I am interested in the same for PowerPC, for future reference, though currently I do not support that.
You can build separate binaries using different -march and other flags, though be aware that -march can use instructions not available on earlier processors. -mtune and -mcpu can choose instructions, alignment, etc., that will favour a particular processor, yet run on all processors in that family.
To tune to different architectures (i386, x86-64, ppc, ppc64) you would have to consult the manual pages for clang / gcc. After separate builds it should be possible to use lipo to create a universal binary. There's a simple example here.
For Apple's compilers, you should use the -arch specification, and -mmacosx-version-min=10.5, provided you still have the SDKs for 10.5.
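The separate-builds-then-lipo approach could look roughly like the sketch below. It only prints the commands instead of running them. The helper name universal_cmds, the file names, and the TUNE32 / TUNE64 variables are placeholders invented for this example; which -mtune values suit the oldest supported Macs is left to the clang / gcc manual.

```shell
# universal_cmds SRC OUT: hypothetical helper that prints (does not run) the
# commands for two per-architecture builds plus the lipo merge step.
# TUNE32 and TUNE64 are placeholder variables; "generic" is only a fallback.
universal_cmds() {
    src=$1
    out=$2
    common="-Oz -g -mmacosx-version-min=10.5"
    echo "clang $common -arch i386 -mtune=${TUNE32:-generic} $src -o $out.i386"
    echo "clang $common -arch x86_64 -mtune=${TUNE64:-generic} $src -o $out.x86_64"
    echo "lipo -create $out.i386 $out.x86_64 -output $out"
}

universal_cmds main.c myapp
```

Once the printed commands look right, the output can be piped to sh on the build machine.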

GNU ARM toolchain with hardware floating point support

I have started working on STM32F4 Discovery board and have compiled and run a few basic programs using the latest Yagarto toolchain containing the GCC 4.6.2. Lately though on several forums I have read that many toolchains including the latest Yagarto have problems when it comes to using the on-board hardware FPU. I have also read that the latest CodeSourcery toolchain does support hardware floating point, but not in the lite edition!!
While digging deep into the topic I found this toolchain which is specifically for ARM Cortex M/R controllers, and claims to have no problems when it comes to hardware FPU.
https://launchpad.net/gcc-arm-embedded
I would like to know from users' experience whether the hardware FPU problems really exist in Yagarto. I am interested in using Yagarto because I also work on ARM7 and Yagarto supports that as well. So instead of having different toolchains for different architectures, it is convenient to have one for both ARM7 and Cortex M/R.
If the FPU problems really do exist, could anyone suggest a good, tried and tested toolchain for both ARM7 and Cortex M/R?
P.S.: I use CodeSourcery's latest GNU Linux toolchain for the BeagleBoard (Cortex-A8) and haven't faced any issues with it yet.
I just wrote an article about using ARM's free GCC toolchain (GNU Tools for ARM Embedded Processors) and STLINK on Linux/Ubuntu to write/program/debug code for an STM32F4 Discovery Board (the F4 is a Cortex M4) - that may help you, the compiler does have hardware floating point support and I'm using it in my examples...
http://www.wolinlabs.com/blog/linux.stm32.discovery.gcc.html
