Compiler output on different OS Versions - gcc

As far as I understand(Correct me if i'm wrong), the output of a compiler depends on the Architecture version used, Compiler, and operating system.
Lets say Im using ubuntu release 16.04 x84-64 and compiling a c file with gcc version 5.4(or any other mix of OS,arch,compiler for the example) .
As I understood it until now, if I were to compile the same c file but with a different ubuntu release, with the same arch and compiler version it should have produced the same assembly code.
After a few tries I have got the impression that this is incorrect, how is this possible?
Does the output of a compiler depend on the release of the specific OS?
One of the examples is compiling https://github.com/tbuktu/libntru on 2 different ubuntu versions and receiving different assembly.

The different OS's may have different versions of the default libraries installed (which get linked into your final application). Thus the end result may be slightly different.

If you are just doing a few ubuntu versions the odds of differences goes down as the overall architecture differences may not be reflected either in your test or may not change on the same os family with the same compiler family for long periods of time. Where you are more likely to see differences in a test like that is as you get older versions of the same distro, newer/newest versions of the compiler are not ported/supported directly as an apt-get. maybe you can get them to work by hand building but gcc in particular is really bad about that their code only builds with relatively recent prior or following versions get too far apart and gcc cant build gcc. What I would first expect to see is strictly due to gcc version differences you start to see differences in the compiler.
A better test is take a simple .c file and build for windows any version (using the same version of gcc built for that system) and ubuntu/linux any version. Should more quickly see differences.
Two different compilers should show differences for reasonably sized projects, or knowledge based targeted small code samples, llvm/clang vs gcc for example. Different versions of the same compiler or compiler family will somewhat by definition show differences over time, does 6.x vs 6.x+1 gcc show differences well yes if you know where to look but often not, but gcc 3.x vs gcc 7.x should and then depending on the test you can narrow in from there.
You have compiler to compiler differences on the same os and system that are expected to show differences.
You have various reasons why system to system differences with the same compiler will show differences.
And then combinations of the above would naturally also show differences.
The bigger question is why do you care, the educational information is that you shouldn't expect the same C source code to build the same way if you change the compiler, compiler settings, or operating system. It can have anywhere from no differences to huge differences based on any of the above. Starting quite simply with optimization and other tuning settings and going from there.

Related

why must gnu binutils be configured for a spefic target. What's going on underneath

I am messing around with creating my own custom gcc toolchain for an arm Cortex-A5 cpu, and I am trying to dive as deeply as possible into each step. I have deliberately avoided using crosstool-ng or other tools to assist, in order to get a better understanding of what is going on in the process of creating a toolchain.
One thing stumples me though. During configuration and building of binutils, I need to specify a target (--target). This target can be either the classic host-tuple (ex: arm-none-linux-gnuabi) or a specific type, something like i686-elf for example.
Why is this target needed? and what does it specifically do with the generated "as" and "ld" programs built by binutils?
For example, if I build it with arm-none-linux-gnueabi, it looks like the resulting "as" program supports every arm instruction set under the sun (armv4t, armv5, e.t.c.).
Is it simply for saving space in the resulting executable? Or is something more going on?
I would get it, if I configured the binutils for a specific instruction set for example. Build me an assembler that understands armv4t instructions.
Looking through the source of binutils and gas specifically, it looks like the host-tuple is selecting some header files located in gas/config/tc*, gas/config/te*. Again, this seems arbitrary, as it is broad categories of systems.
Sorry for the rambling :) I guess my questing can be stated as: Why is binutils not an all-in-one package?
Why is this target needed?
Because there are (many) different target architectures. ARM assembler / code is different from PowerPC is different from x86 etc. In principle it would have been possible to design one tool for all targets, but that was not the approach taken at te time.
Focus was mainly on speed / performance. The executables are small as of today's 'standards' but combining all > 40 architectures and all tools like as, ld, nm etc. would be / have been quite clunky.
Moreover, not only are modern host machines much more powerful, that also applies to the compiler / assembled programs, sometime zillions of lines of (preprocessed) C++ to compile. This means the overall build times shifted much more in the direction of compilation that in the old days.
Only different core families are usually switchable / selectable per options.

Distro provided cross compiler vs custom built gcc

I intend to cross compile for Raspberry Pi, basically a small ARM computer. The host will be an i686 box running Arch Linux.
My first instinct is to use cross compiler provided by Arch Linux, arm-elf-gcc-base and arm-elf-binutils. However, every wiki and post I read seems to use some version of custom gcc build. They seem to spend significant time on cooking their own gcc. Problem is that they never say WHY it is important to use their gcc over another.
Can stock distro provided cross compilers be used for building Raspberry Pi or ARM in general kernels and apps?
Is it necessary to have multiple compilers for ARM architecture? If so, why, since single gcc can support all x86 variants?
If 2), then how can I deduce what target subset is supported by a particular version of gcc?
More general question, what general use cases call for custom gcc build?
Please be as technical as you can, I'd like to know WHY as well as how.
When developers talk about building software (cross compiling) for a different machine (target) compared to their own (host) they use the term toolchain to describe the set of tools necessary to build binary files. That's because when you need to build an executable binary, you need more than a compiler.
You need routines (crt0.o) to initialize runtime according to requirements of operating system and standard libraries. You need standard set of libraries and those libraries need to be aware of the kernel on target because of the system calls API and several os level configurations (f.e. page size) and data structures (f.e. time structures).
On the hardware side, there are different set of ARM architectures. Architectures can be backward compatible but a toolchain by nature is binary and targeted for a specific architecture. You can have the most widespread architecture by default but then that won't be too fruitful for an already constraint environment (embedded device). If you have the latest architecture, then it won't be useful for older architecture based targets.
When you build a binary on your host for your host, compiler can look up all the necessary bits from its own environment or use what's on the host - so most of the above details are invisible to developer. However when you build for a different target than your host type, toolchain must know about hardware, os and standard library details. The way you tell these to toolchain is... by building it according to those details which might require some level of bootstrapping. (or you can do this via extensive set of parameters if toolchain supports / built for it.)
So when there is a generic (stock) cross compile toolchain, it has already some target specifics set and that might not meet your requirements. Please see this recent question about the situation on Ubuntu for an example.

GCC: disguising between GCC versions

This question was emerged from this question.
The problem is that there is a NVidia driver for Linux, compiled wth GCC 4.5. The kernel is compiled with GCC 4.6. Well, the stuff doesn't work because of the version number difference between GCCs. (the installer says the driver won't work - for details please visit the link above)
Could one disguise a binary compiled with GCC 4.5 to a binary compiled with GCC 4.6? If it is possible, under what circumstances would it work well?
Your problem is called ABI: Application Binary Interface. This is a set of rules (among others) how functions in a piece of code get their arguments (ordering, padding of types on the stack), naming of the function so the linker can resolve symbols and padding/alignment of fields in structures.
GCC tries to keep the ABI stable between compiler versions but that's not always possible.
For example, GCC 4.4 fixed a bug in packed bit-fields which means that old/new code can't read structures using this feature properly anymore. If you would mix versions before and after 4.4, data corruption would occur without any crashes.
There is no indication in the 4.6 release notes that the ABI was changed but that's something which the Linux kernel can't know - it just reads the compiler version used to compile the code and if the first two numbers change, it assumes that running the code isn't safe.
There are two solutions:
You can compile the Nvidia driver with the same compiler as the kernel. This is strongly recommended
You can patch the version string in the binary. This will trick the kernel into loading the module but at the risk of causing data corruption to internal data structures.

GCC target specificity and binary compatibility

Initial note: The question mentions AIX because it is the initial context but the question really pertains to gcc itself, most probably regardless of the platform.
AIX is supposed to be backwards binary compatible: a C program compiled on AIX 5.1 will run as is on 5.2, 5.3, 6.1 and 7.1.
In my understanding gcc should be built to target a specific system (whether the current one or another one in the case of cross-compiling). So, gcc built on AIX 6.1 targets AIX 6.1, and produces binaries usable on both 6.1 and 7.1 thanks to binary compatibility.
Yet gcc itself built on AIX 6.1 is a 6.1 program, so it should execute on 7.1 as is. Of course if I compile a program with it on 7.1, this program might get linked or use headers specific to 7.1, thus making the resulting binary requiring 7.1. So as far as I understand it, I should be able to run gcc built on AIX 6.1 onto a 7.1 machine, and produce maybe non-optimal yet perfectly valid binaries, although they would require 7.1 as a side effect of linking.
This looks too much like rainbows and unicorns dancing in glittery skies. I smell something fishy but lack any knowledge of gcc innards. Please mighty crowd, enlighten me.
tl;dr: Can gcc built on and targeting a version N of an OS/platform be run and used on version N+1 by virtue of platform binary compatibility to produce binaries running on version N+1? If not, what mechanism would prevent it?
Here's enlightenment: your question is way too general. In order to answer it, someone would have to have knowledge of
the operating systems you care about
the OS versions you care about
the gcc versions you care about
and then research the binary compatibility in this three dimensional matrix.
Mechanisms preventing binary compatibility are too numerous and directly correlate to your OS and compiler vendor's ingenuity at breaking it. One of the more common and documented ways being official deprecation of API calls, removal of compatibility libraries shipped, and bridges being burnt, like going from a.out to ELF.

How can I compile object code for the wrong system and cross compiling question?

Reference this question about compiling. I don't understand how my program for Mac can use the right -arch, compile with those -arch flags, the -arch flags be for the system I am on (a ppc64 g5), and still produce the wrong object code.
Also, if I used a cross compiler and was on Linux, produced 10.5 code for mac, how would this be any different than what I described above?
Background is that I have tried to compile various apache modules. They compile with the -arch ppc, ppc64, etc. I get no errors and I get my mod_whatever.so. But, apache will always complain that some symbol isn't found. Apparently, it has to do with what the compiler produces, even though the file type says it is for ppc, ppc64, i386, x_64 (universal binary) and seems to match all the other .so mods I have.
I guess I don't understand how it could compile for my system with no problem and then say my system can't use it. Maybe I do not understand what a compiler is actually giving me.
EDIT: All error messages and the complete process can be seen here.
Thank you.
Looking at the other thread and elsewhere and without a G5 or OSX Server installation, I can only make a few comments and suggestions but perhaps they will help.
It's generally not a good idea to be modifying the o/s vendor's installed software. Installing a new Apache module is less problematic than, say, overwriting an existing library but you're still at the mercy of the vendor in that a Software Update could delete your modifications and, beyond that you have to figure out how the vendor's version was built in the first place. A common practice in the OS X world is to avoid this by making a completely separate installation of an open source product, like Apache, using, for instance, MacPorts. That has its cons, too: to achieve a high-level of independence, MacPorts will often download and build a lot of dependent packages for things which are already in OS X but there's no harm in that other than some extra build cycles and disk space.
That said, it should be possible to build and install apache modules to supplement those supplied by Apple. Apple does publish the changes it makes to open source products here; you can drill down in the various versions there to find the apache directory which contains the source, Makefile and applied patches. That might be of help.
Make sure that the mod_*.so you build are truly 64-bit and don't depend on any non-64 bit libraries. Use otool -L mod_*.so to see the dynamic libraries that each references and then use file on those libraries to ensure they all have ppc64 variants.
Make sure you are using up-to-date developer tools (Xcode 3.1.3 is current).
While the developer tool chain uses many open source components, Apple has enhanced many of them and there are big differences in OS X's ABIs, universal binary support, dynamic libraries, etc. The bottom line is that cross-compilation of OS X-targeted object code on Linux (or any other non-OS X platform) is neither supported nor practical.

Resources