Why does making/building C apps take so long? - gcc

Why does building a C/C++ app take so long compared to apps in other languages (Java, for example)?
I am trying to build Ubuntu Unity, and it takes about 4 minutes on my local machine.
I think generating the object files is the step that takes the most time.
Any advice?

If you want to speed up the build you can use ccache. Also take a look at your gcc version, as older versions are known to lag behind; Clang also outperforms them considerably.
I'm not going to go into compilation speed in depth because it is a HUGE topic, starting with the fact that C/C++ are fully compiled languages, while in Java you never compile to machine code: you just generate bytecode and leave everything else to the VM.
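To give a rough sense of why generating object files dominates the build time, here is a minimal sketch (the file name is hypothetical): every C/C++ translation unit re-expands and re-parses all the headers it includes before any optimization or code generation even starts.

    /* hello.c - a trivial translation unit (hypothetical example).
       Build a single object file with: gcc -c hello.c */
    #include <stdio.h>   /* after preprocessing, this one include alone adds
                            hundreds of lines of declarations to the unit */

    int main(void)
    {
        printf("hello\n");
        return 0;
    }

Running gcc -E hello.c | wc -l shows how large the preprocessed input actually is; a C++ file that includes something like <iostream> is far larger still. Multiply that by hundreds of translation units, each of which is then optimised and turned into machine code, and the minutes add up. This is also why a cache like ccache helps: it skips all of that work when the preprocessed input hasn't changed.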

Related

How to build or get an original, latest version of GCC for Windows?

I want the latest version of GCC for Windows.
Right now the latest version is 9.2, but for Windows via MinGW it is only 8.1...
I have tried to build it from source for Windows 10, including under WSL, but have not found out exactly how to do it. I do not want to use it via Cygwin or another emulation layer, just natively on Windows, like Clang and MSVC.
Note: I have the latest version of Windows 10 with WSL.
The latest GCC compiler (9.2.0) combined with the latest MinGW-w64 headers and libraries (7.0.0) can be found in the standalone build at http://winlibs.com/
Oh the pain, getting a working GCC for Windows.
build your own?
Building is a fun experience, or a no-fun experience, depending on how you look at it.
I've spent literally weeks building GCC, successfully and unsuccessfully (native and cross). Follow the instructions to the letter, and it works. And then, another day, it doesn't (with a slightly different sub-sub-release or revision, or the tiniest little change that is entirely "harmless", or to the best of your knowledge no change at all, and you never get it to build again).
Save everything you've done (copy the console output), keep the build tree, and repeat the build (paste the text) 6 months later after first doing an svn update. It compiles fine for 15-20 minutes, then fails. Start from scratch, and spend a day or two until it works, and you cannot tell why it works now.
Use a build script by someone who offers binary builds (so the assumption is that it must work, otherwise where do the binaries come from). The build script more or less does exactly what you've done by hand anyway, and it works, or maybe doesn't work. If you are only interested in actually having a compiler that works for compiling under Windows, and not in spending your life fiddling around, that's not a lot of fun.
use a pre-built binary?
There exist several binary distros from a variety of sources.
Although downloading binaries is of course always a tad risky (even if you scan everything before running it, malware scanners are nowhere near perfect, or even good or halfway reliable), compilers are particularly high-risk. That's because compilers are a very attractive target for malware distributors, as they get free redistribution with everything you build.
I've actually seen GCC builds with malware built in on apparently harmless sites (I forget the name, but one such example was a site offering GCC builds for several architectures, which looked very nice).
Now... there has been, for some time, a distro that ships GCC 9.2, built by someone under the pseudonym "nuwen".
It turns out that "nuwen" is actually Stephan T. Lavavej, so... chances are this is a distro that you actually want to use (I'm using it anyway). It's unlikely that you will be able to build one yourself that's substantially better (that one also comes with a lot of useful support libraries), and it's unlikely that it is harmful.
https://nuwen.net/mingw.html
Note that MSYS2 will also let you install a very recent GCC (9.1 or 9.2, I'm not sure) via pacman, very fast and very trouble-free. MSYS2 is nice insofar as you get a 95%-working Unix-like environment with 95% of the tools.
And 95% of the time, it works fine in every practical respect. Until, one day, it doesn't, usually because some configure script messes up path names, or something with environment variables, or something else very subtle. For example, it is very much possible to successfully build GCC with MSYS2 (I've done it), and it works "perfectly fine" until, some weeks later, you discover that something doesn't work in your custom build, so an old project of yours suddenly no longer builds even though it did with the old stock compiler.
These are probably issues that one could fix, given enough determination (I'm too lazy, however; for me a compiler is something that simply must work).
There are two well known distributions of the GCC bundle for Windows. The first one is by equation.com
http://www.equation.com/servlet/equation.cmd?fa=fortran
and the second one is by winlibs.com
http://winlibs.com/

What makes DosBox on web.archive.org so slow despite JIT?

I wanted to see if I could play Exile 2 from web.archive.org and found that I need to install it first, which takes ages. Given that I'm basically emulating an x86 machine on an x86 computer, that DOSBox supports dynarec (dynamic recompilation), and that contemporary browsers support JITing JavaScript code (and Emscripten generates asm.js, which should be rather easy to JIT), what makes it all so slow? In other words, what could be the bottleneck?
This DOSBox build is compiled using the Emterpreter, which makes it slower than a pure asm.js version:
The Emterpreter is an option that compiles asm.js output from Emscripten into a binary bytecode. It also generates an interpreter ("Emscripten interpreter", hence Emterpreter) capable of executing that bytecode. This lets you compile your project, or parts of your project, into bytecode that will be interpreted, as opposed to asm.js that will be executed directly by the JavaScript engine.
The second reason is that dynamic recompilation is not yet available in the Emscripten port of DOSBox. It would be a lot of work to make it possible to generate asm.js code on the fly.

How do compilation and linking at runtime happen?

In a tutorial I've encountered a concept that was new to me and that I never thought was possible. Actually, I thought that compilation was an entirely pre-run-time process. This is the passage from the tutorial: "Compile time occurs before link time (when the output of one or more compiled files are joined together) and runtime (when a program is executed). In some programming languages it may be necessary for some compilation and linking to occur at runtime".
My questions are:
Are pre-run-time compilation and linking absolutely different from run-time compilation and linking? If so, please explain the main differences.
How are code sections that need to be compiled (linked) at run time marked, and where is that information kept? (This may differ from language to language; if possible, please give a specific example.)
Thank you very much for your time!
Runtime compilation
The best (most well-known) example I'm personally aware of is the just-in-time compilation used by Java. As you might know, Java code is compiled into bytecode, which can be interpreted by the Java Virtual Machine. It's therefore different from, say, C++, which is fully preprocessed, compiled and linked into an executable that can be run directly by the OS without any virtual machine.
The Java bytecode is instead interpreted by the VM, which maps it to processor-specific instructions. That said, the JVM does JIT compilation: it takes that bytecode and compiles it, at runtime, into machine code. Here we arrive at your second question.
Even in Java it depends on which JVM you are using, but basically there are pieces of code called hotspots: the pieces of code that are run frequently and which may be compiled so that the application's performance improves. This is done at runtime because an ahead-of-time compiler does not (or might not) have all the data necessary to judge which pieces of code are actually run frequently. Therefore JIT compilation requires some kind of runtime statistics gathering, which the JVM does in parallel with program execution. Which statistics are gathered and what can be optimised (compiled at runtime) depend on the implementation; you obviously cannot do everything a normal compiler would do, due to memory and time constraints, and usually only a limited set of optimisations is supported at runtime (I guess this partly answers your first question: you don't compile everything). You can try looking for such information, but in my experience it is usually badly documented and hard to find (at least when it comes to official sources, as opposed to presentations and blogs).
Runtime linking
The linker is a different pair of shoes. We cannot use the Java example any more, since Java doesn't really have a linker like C or C++ (instead it has a class loader, which takes care of loading class files and putting everything together).
Usually linking is performed by a linker after the compilation step (static linking). This has pros (no external dependencies) and cons (a higher memory footprint, since we cannot use a shared library, and when the library version changes you need to rebuild your program).
Runtime linking (dynamic/late linking) is performed by the OS: it is the job of the OS's dynamic loader to load shared libraries and attach them to a running process. There are also different kinds of dynamic linking: explicit and implicit. The benefits are that you don't have to rebuild the program when the library version changes and that libraries can be shared, but there are drawbacks too: what if different programs use the same library but require different versions (look up DLL hell)? So yes, those two concepts are quite different.
Again, how it's all done, and how it is decided what should be linked and how, is OS-specific; for instance, Microsoft has the dynamic-link library (DLL) concept.
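As a concrete illustration of explicit runtime linking on a POSIX system, here is a minimal sketch (error handling kept deliberately simple) that uses dlopen/dlsym to load the math library and resolve a symbol while the program is already running.

    /* runtime_link.c - explicit dynamic linking sketch (POSIX).
       Build with: gcc runtime_link.c -ldl -o runtime_link */
    #include <stdio.h>
    #include <dlfcn.h>

    int main(void)
    {
        /* Load the shared library at run time instead of at link time. */
        void *handle = dlopen("libm.so.6", RTLD_LAZY);
        if (!handle) {
            fprintf(stderr, "dlopen failed: %s\n", dlerror());
            return 1;
        }

        /* Look up the address of cos() inside the loaded library. */
        double (*cosine)(double) = (double (*)(double))dlsym(handle, "cos");
        if (!cosine) {
            fprintf(stderr, "dlsym failed: %s\n", dlerror());
            dlclose(handle);
            return 1;
        }

        printf("cos(0.0) = %f\n", cosine(0.0));
        dlclose(handle);
        return 0;
    }

Implicit dynamic linking, by contrast, is what you get when you simply link against the shared library (gcc main.c -lm) and let the loader resolve it at process start-up; LoadLibrary/GetProcAddress play the role of dlopen/dlsym on Windows.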

What is the difference between "binary install" and "compile and install from source"? Which is better?

I want to install a driver for ROS (Robot Operating System), and I have two options: the binary install, or compiling and installing from source. I would like to know which installation is better, and what the advantages and disadvantages of each one are.
Source: AKA source code, usually in some sort of tarball or zip file. This is RAW programming-language code. You need some sort of compiler (javac for Java, gcc for C/C++, etc.) to create the executable that your computer then runs.
Advantages:
You can see what the source code is, which means...
You can edit the resulting program to behave differently
Depending on what you're doing, when you compile you can enable certain optimizations that will work on your machine and ONLY your machine (or one EXACTLY like it). For instance, for some sort of graphics-rendering software, you could compile it with GPU support enabled, which would increase the rendering speed (see the sketch after this list).
You can create a version of an application for a different OS/Chipset (see Binary below)
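To make the machine-specific optimization point above concrete, here is a small sketch (file name and numbers are made up): compiled from source with CPU-specific flags, the same loop can be auto-vectorized for the exact processor it will run on, which a generic prebuilt binary cannot assume.

    /* sum.c - hypothetical example of a machine-specific build.
       Generic build:  gcc -O2 sum.c -o sum
       Tuned build:    gcc -O3 -march=native sum.c -o sum
       With -march=native the compiler may use whatever SIMD extensions
       (SSE/AVX/...) this particular CPU supports, so the tuned binary can be
       faster here but may not run on older machines. */
    #include <stdio.h>
    #include <stddef.h>

    static float sum(const float *v, size_t n)
    {
        float total = 0.0f;
        for (size_t i = 0; i < n; ++i)
            total += v[i];   /* candidate loop for auto-vectorization */
        return total;
    }

    int main(void)
    {
        float data[1000];
        for (size_t i = 0; i < 1000; ++i)
            data[i] = (float)i;
        printf("%f\n", sum(data, 1000));
        return 0;
    }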
Disadvantages:
You have to have your compiler installed
You need to manually install all required libraries, which frequently also need to be compiled (and THEIR libraries need to be installed, etc.). This can easily turn a quick 30-second command into a multi-hour project.
There are any number of things that could go wrong, and if you're not familiar with what the various errors mean, finding support online could be quite difficult.
Binary: This is the actual program that runs, i.e. the executable that gets created when you compile from source. Binaries typically have all necessary libraries built into them, or install/deploy them as necessary (depending on how the application was written).
Advantages:
It's ready-to-run. If you have a binary designed for your processor and operating system, then chances are you can run the program and everything will work the first time.
Less configuration. You don't have to set up a whole bunch of configuration options to use the program; it just uses a generic default configuration.
If something goes wrong, it should be a little easier to find help online, since the binary is pre-compiled... other people may be using it, which means you are using the EXACT same program as them, not one optimized for your system.
Disadvantages:
You can't see/edit the source code, so you can't apply optimizations or tweak it for your specific application. Additionally, you don't really know what the program is going to do, so there could be nasty surprises waiting for you (this is why antivirus software is useful... although LESS necessary on a Linux system).
Your system must be compatible with the binary. For instance, you can't run a 64-bit application on a 32-bit operating system. You can't run an Intel binary for OS X on an older PowerPC-based G5 Mac.
In summary, which one is "better" is up to you. Only you can decide which one will be necessary for whatever it is you're trying to do. In most cases, using the binary is going to be just fine, and give you the least trouble. Sometimes, though, it is nice to have the source available, if only as documentation.

Why is bytecode JIT compiled at execution time and not at installation time?

Compiling a program to bytecode instead of native code enables a certain level of portability, as long as a fitting virtual machine exists.
But I'm kinda wondering, why delay the compilation? Why not simply compile the bytecode when installing an application?
And if that is done, why not adopt it for languages that compile directly to native code? Compile them to an intermediate format, distribute a "JIT" compiler with the installer, and compile it on the target machine.
The only thing I can think of is runtime optimization. That's about the only major thing that can't be done at installation time. Thoughts?
Often it is precompiled. Consider, for example, precompiling .NET code with NGEN.
One reason for not precompiling everything would be extensibility. Consider those languages which allow use of reflection to load additional code at runtime.
Some JIT compilers (Java HotSpot, for example) use type-feedback-based inlining. They track which types are actually used in the program, and inline function calls based on the assumption that what they saw earlier is what they will see later. For this to work, they need to run the program through a number of iterations of its "hot loop" in order to know which types are used.
This optimization is totally unavailable at install time.
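To see why, here is a C-flavoured sketch (the functions are made up) of the kind of indirect call a JIT can specialize: the interesting information, namely which target the call site actually hits, only exists while the program runs, so an install-time compiler has nothing to feed its inliner.

    /* dispatch.c - hypothetical hot loop with an indirect call. */
    #include <stdio.h>
    #include <stdlib.h>

    static long add_one(long x) { return x + 1; }
    static long add_two(long x) { return x + 2; }

    int main(int argc, char **argv)
    {
        /* The target of 'op' depends on runtime input, not on the source code. */
        long (*op)(long) = (argc > 1 && atoi(argv[1]) == 2) ? add_two : add_one;

        long total = 0;
        for (long i = 0; i < 100000000L; ++i)
            total = op(total);   /* a JIT watching this loop sees which function
                                    op really points to and can inline it behind
                                    a cheap guard; a compiler running at install
                                    time only sees that it could be either */
        printf("%ld\n", total);
        return 0;
    }

This is, roughly, what HotSpot does with virtual and interface calls: the receiver types it records at run time decide what is worth inlining.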
The bytecode has already been compiled, just as the C++ code has been compiled.
Also, the JIT compilers, i.e. the .NET and Java runtimes, are massive and dynamic; and you can't foresee in advance which parts of them an app will use, so you need the entire runtime.
Also one has to realize that a language targeted to a virtual machine has very different design goals than a language targeted to bare metal.
Take C++ vs. Java.
C++ wouldn't work on a VM; in particular, a lot of the C++ language design is geared towards RAII.
Java wouldn't work on bare metal for many reasons: primitive types, for one.
EDIT: As delnan correctly points out, JIT and similar technologies, though hugely beneficial to bytecode performance, would likely not be available at install time. Also, compiling for a VM is very different from compiling to native code.

Resources