I have seen a similar question on this, but it was specific to P4s and Core 2s. What I am looking for is a good setting for most modern CPUs, both AMD and Intel. It seems to me that i686 is a little out of date. I am leaning towards Pentium 4 for the extra instruction sets (SSE and so on). What is the best target that would be compatible with modern CPUs from both Intel and AMD, either for -march alone or for both -march and -mtune?
I'm currently using 32-bit GCC 5.3.0 on Windows 7.
I was reading a book which gives a historical perspective:
Pentium 4E (2004, 125 M transistors). Added hyperthreading, a method to run two programs simultaneously on a single processor, as well as EM64T, Intel’s implementation of a 64-bit extension to IA32 developed by Advanced Micro Devices (AMD), which we refer to as x86-64
I'm a little confused here. I have two questions:
Q1. Does this mean that x86-64 is just another name for EM64T?
Q2. Was IA32 developed by AMD? Wasn't IA32 designed by Intel and first implemented in the 80386 microprocessor in 1985? https://en.wikipedia.org/wiki/IA-32
AMD first named its (original) 64-bit ISA version x86-64. Intel later named its (mostly compatible) version EM64T (spelled EMT64 in the text quoted below). See here at Intel:
x64 is a generic name for the 64-bit extensions to Intel's and AMD's 32-bit x86 instruction set architecture (ISA). AMD introduced the first version of x64, initially called x86-64 and later renamed AMD64. Intel named their implementation IA-32e and then EMT64. There are some slight incompatibilities between the two versions, but most code works fine on both versions; details can be found in the Intel® 64 and IA-32 Architectures Software Developer's Manuals and the AMD64 Architecture Tech Docs. We call this intersection flavor x64. Neither is to be confused with the 64-bit Intel® Itanium® architecture, which is called IA-64.
So x64 can be considered standard nowadays.
Relating to your second question: Your assumptions are correct. Intel developed the IA32 ISA and AMD then licensed it with complicated contracts.
Q1. x86-64 is a general name for both Intel's and AMD's implementations. AMD's implementation is also called AMD64; Intel's implementation is also called EM64T.
Q2. Yes, IA32 was designed by Intel. But AMD was the first to make a 64-bit extension of it. Intel's IA64 was something different: it was not a 64-bit extension of IA32.
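The naming overlap also shows up in compilers. As a minimal sketch, assuming GCC or Clang (whose predefined macros `__x86_64__`, `__amd64__`, `__i386__` and `__ia64__` are used here), the following reports which ISA family a build targets; the function names are made up for illustration:

```c
/* Hypothetical helpers; only the predefined macro names come from GCC/Clang. */

/* Name of the ISA family this translation unit is compiled for. */
const char *isa_family(void)
{
#if defined(__x86_64__) || defined(__amd64__)
    return "x86-64";   /* AMD64 and Intel 64 (EM64T) share this target */
#elif defined(__i386__)
    return "IA-32";    /* 32-bit x86 */
#elif defined(__ia64__)
    return "IA-64";    /* Itanium, unrelated to x86-64 */
#else
    return "other";
#endif
}

/* Returns 1 when the "x86_64" and "amd64" spellings are either both
   defined or both undefined, i.e. the compiler treats them as aliases. */
int amd64_and_x86_64_agree(void)
{
#if defined(__x86_64__) == defined(__amd64__)
    return 1;
#else
    return 0;
#endif
}
```

Calling `isa_family()` from a 64-bit x86 build should give "x86-64" regardless of whether the CPU is from AMD or Intel.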
Is there a GNU Fortran compiler (v5.3.0) option to tune the code for a particular architecture? I'm especially interested in Intel Core i7. I could not find anything related to code tuning in the official option summary at GNU Fortran 5.3.0 Option Summary. I remember in the past there used to be an option -march=.... Thank you.
Edit:
I have found out the processor architecture with cat /proc/cpuinfo and visited the Intel CPU Specifications website to find out that I have Sandy Bridge CPUs. In my case the correct GNU option would be -march=sandybridge.
i7 is not an architecture. Sandy Bridge, Ivy Bridge, Haswell and similar are architectures of Intel CPUs, and each of these architectures can be sold in i3, i5, i7 or Xeon variants.
You can have two i7 CPUs, one older and one more recent and they can have different architectures.
GCC (the whole suite for C, C++, Fortran...) has the options -march and -mtune (see https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html#x86-Options ). With -march, the compiled code may use every instruction set of the specified architecture, so it is only guaranteed to run on that architecture and on CPUs that support the same instructions (typically newer ones). With -mtune alone, the code still runs on older CPUs, but instruction selection and scheduling are tuned for the specified one.
You can use -march=native and the compiler will target the architecture of the CPU it is running on. Or you can specify an architecture manually, like -march=haswell, -march=ivybridge or -march=core-avx-i.
Be aware that you need a recent version of the compiler to optimize for new CPU architectures.
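One way to see what a given -march setting actually enabled is to look at the feature macros the compiler predefines. This is a minimal sketch, assuming GCC or Clang (which define `__SSE2__`, `__AVX__` and `__AVX2__` when those instruction sets are enabled); the enum and function names are hypothetical:

```c
/* Hypothetical bit flags for the report below. */
enum { HAS_SSE2 = 1, HAS_AVX = 2, HAS_AVX2 = 4 };

/* Returns a bitmask of the SIMD extensions the compiler was allowed
   to use for this translation unit, based on predefined macros. */
unsigned compiled_simd_mask(void)
{
    unsigned mask = 0;
#ifdef __SSE2__
    mask |= HAS_SSE2;
#endif
#ifdef __AVX__
    mask |= HAS_AVX;
#endif
#ifdef __AVX2__
    mask |= HAS_AVX2;
#endif
    return mask;
}
```

Built with `-march=sandybridge`, the mask should include `HAS_AVX`. Built with `-mtune=sandybridge` alone, it stays at the compiler's default baseline, because -mtune does not enable any extra instruction sets.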
All the information you are looking for is in the man page of gcc, not in the man page of gfortran:
man gcc
I assume -march=native does not work?
Edit: I tried a hello world with gcc 5.3 and it does compile with that option. I don't know, though, whether it improves anything.
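To get a feel for what -march=native has to work with on a particular machine, the CPU can be queried at run time. This is only a sketch, assuming GCC (4.8 or later) or Clang on an x86 target, where the builtins `__builtin_cpu_init` and `__builtin_cpu_supports` are available; the wrapper names are hypothetical, and other compilers fall through to the "unknown" branch:

```c
/* Each function returns 1 if the CPU running the program reports the
   feature, 0 if it does not, and -1 if the check is unavailable
   (assumption: only GCC/Clang on x86 provide these builtins). */

int host_supports_sse2(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();                      /* populate CPU feature data */
    return __builtin_cpu_supports("sse2") ? 1 : 0;
#else
    return -1;
#endif
}

int host_supports_avx2(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx2") ? 1 : 0;
#else
    return -1;
#endif
}
```

If `host_supports_avx2()` returns 0 on the build machine, -march=native cannot enable AVX2 there, whatever the compiler version.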
I'm thinking of purchasing a Xeon Phi Knights Corner (KNC) coprocessor card. But I don't own an Intel Compiler and I have no interest in purchasing one (and the non-commercial version no longer seems to be an option).
It appears that GCC is getting OpenMP support for the Xeon Phi. Is there some version of GCC or an extension to GCC that supports the KNC intrinsics?
Note that the 512-bit SIMD of the KNC is not compatible with AVX-512 (though the next version, Knights Landing, will be).
You will have to use inline assembly rather than intrinsics to use the MIC vector instructions with GCC.
The Intel non-commercial software program was recently rebooted. See https://software.intel.com/en-us/qualify-for-free-software for details.
What is the minimum target processor architecture (indicated with _M_IX86 predefined macros) supported by every version of Visual Studio 2008, 2010 and 2012?
For example, MSVS 2012 supports only Pentium Pro and higher.
The classic switch for this was /G. Your available options differed for different versions of the compiler (with newer versions dropping older options, albeit continuing to accept them for compatibility reasons). Here's what you got:
/G3 built code that was optimized for 386 processors (_M_IX86 was set to 300)
/G4 for the 486 processor (_M_IX86 was set to 400)
/G5 built code that was optimized for the Pentium (_M_IX86 was set to 500)
/G6 built code that was optimized for the Pentium Pro, II, and III (_M_IX86 was set to 600)
/G7 built code that was optimized for the Pentium 4 or AMD Athlon (_M_IX86 was set to 700)
/GB specified either "blend" mode or the lowest common denominator that was reasonable when that version of the compiler was released. This was the default option if no other was specified.
And of course, it bears explicit mention that setting this option to optimize for a newer processor architecture did not prevent your code from running on an older processor architecture. It just wasn't optimized for that architecture and might run more slowly.
However, if you look up this compiler option in a current version of the documentation, you'll see no mention of any of this. All you see is something about Itanium processors (which we'll put aside). That's because the compiler shipping with VC++ 2005 dropped the /G3–/G7 compiler options altogether:
[The] /G3, /G4, /G5, /G6, /G7, and /GB compiler options have been removed. The compiler now uses a "blended model" that attempts to create the best output file for all architectures.
So, although many of us remember it well from VC++ 6, this code generation setting was a historical curiosity only even as far back as VC++ 2008. Therefore I'm not sure where you get the impression that VS 2012 supports only the Pentium Pro. I can't find mention of that anywhere in the official documentation or elsewhere online. The limiting factor for version 2012 of the compiler is not the processor architecture but the OS version. If you've patched the compiler, libraries, and all the other accoutrements to support targeting Windows XP, then you will be able to run your application on an original Pentium-233, onto which you've masochistically shoe-horned Windows XP.
The purpose of the _M_IX86 macro is really just an indicator that you're targeting the Intel IA-32 processor family—more commonly known as good old 32-bit x86—in contrast to one of the other supported target architectures, like _M_AMD64 for 64-bit x86. You should just treat it as a defined/undefined value now.
Yes, the old table of values for _M_IX86 still appears in the latest version of the preprocessor documentation, but it is utterly obsolete. You'll note that other obsolete symbols appear there as well, such as _M_PPC: what was the last version of MSVC++ that shipped with a PowerPC compiler? 4.2?
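In code, that presence-only use of the macros might look like the sketch below. The function name is hypothetical; `_M_IX86`, `_M_AMD64`, `_M_X64` and `_MSC_VER` are the MSVC predefined macros, and other compilers are assumed to end up in the last branch:

```c
/* Hypothetical helper: reports the x86 flavor an MSVC build targets. */
const char *msvc_target(void)
{
#if defined(_M_IX86)
    /* Test for presence only; the numeric value is no longer meaningful. */
    return "32-bit x86";
#elif defined(_M_AMD64) || defined(_M_X64)
    return "64-bit x86";
#elif defined(_MSC_VER)
    return "another MSVC target";
#else
    return "no MSVC architecture macro defined";
#endif
}
```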
But that is only part of the story. There are still other compiler options that govern code generation with respect to target architectures.
For example, the /arch switch. From the latest version of the documentation, you have the following options:
/arch:IA32 which essentially sets the lowest common denominator, using x87 for floating point
/arch:SSE which turns on SSE instructions
/arch:SSE2 which turns on SSE2 instructions (and is the default for x86)
/arch:AVX which turns on Intel Advanced Vector Extensions
/arch:AVX2 which turns on Intel Advanced Vector Extensions 2
If you read the Remarks section, you'll also see that these options can imply more than just the specified instruction set. For example, since all processors that support SSE instructions also support the CMOV instruction, the CMOV instruction will be generated when /arch:SSE or higher is specified. The CMOV instruction has nothing to do with SSE; in fact, SSE was introduced with the Pentium III while CMOV was introduced way back with the Pentium Pro. But it's guaranteed to be supported on any architectures that support SSE.
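The effect of /arch can be checked from inside the code. A minimal sketch, assuming MSVC's documented `_M_IX86_FP` macro (defined for 32-bit x86 targets only) and the `__AVX__` macro that /arch:AVX and higher define; the function names are hypothetical:

```c
/* Level selected by /arch on a 32-bit x86 MSVC build:
   0 = IA32 (x87), 1 = SSE, 2 = SSE2 or higher.
   Returns -1 when _M_IX86_FP is not defined (x64 builds, other compilers). */
int msvc_arch_fp_level(void)
{
#if defined(_M_IX86_FP)
    return _M_IX86_FP;
#else
    return -1;
#endif
}

/* 1 if the build was allowed to use AVX instructions, 0 otherwise. */
int built_with_avx(void)
{
#if defined(__AVX__)
    return 1;
#else
    return 0;
#endif
}
```

With the x86 default of /arch:SSE2, `msvc_arch_fp_level()` should return 2 and `built_with_avx()` should return 0.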
The other relevant option is controlled by the /favor switch. This was new starting with VC++ 2008, and was presumably the replacement for the old /G3–/G7 options. As the documentation says:
/favor:blend is the default and produces code with no unique optimizations
/favor:INTEL64 generates code specific for Intel's implementation of x86-64
/favor:AMD64 generates code specific for AMD's implementation of x86-64
/favor:ATOM generates code specific for Intel's Atom processor
I have been searching for a precompiled LAPACK library for Windows, and I have found this
but my question is:
Is there a precompiled version of LAPACK for a quad-core machine with a 32-bit Intel processor?
I want to get the most efficient computations on this machine. Or is the only way to go compiling the libraries on the quad-core computer itself?
My company has used Intel MKL for several years, and we are very satisfied with its performance. It is a commercial product developed by Intel; a single-user license costs $399 ($129 if you are a student).
Another option is AMD ACML. It is available for free, but when we profiled it (five years ago) we found that Intel MKL had better performance.
Both Intel MKL and AMD ACML work with both Intel and AMD processors. If the price is OK use Intel MKL, otherwise go with AMD ACML.