How can a compiler be cross-platform (hardware)? - gcc

I just realized that binary compilers convert source code to the binary of the destination platform. Kind of obvious... but if a compiler works that way, then how can the same compiler be used for different systems like x86, ARM, MIPS, etc.?
Aren't they supposed to "know" the machine language of the hardware platform in order to know how to build the binary? Does a compiler (like gcc) know the machine language of every single platform that it supports?
How is that system possible, and how can a compiler be optimized for that many platforms at the same time?

Yes, they have to "know" the machine language for every single platform they support. This is a required to generate machine code. However, compilation is a multi-step process. Usually, the first steps of the compilation are common to most architectures.
Taken from Wikipedia:
Structure of a compiler
Compilers bridge source programs in high-level languages with the underlying hardware. A compiler requires determining the correctness of the syntax of programs, generating correct and efficient object code, run-time organization, and formatting output according to assembler and/or linker conventions. A compiler consists of three main parts: the frontend, the middle-end, and the backend.
The front end checks whether the program is correctly written in terms of the programming language syntax and semantics. Here legal and illegal programs are recognized. Errors are reported, if any, in a useful way. Type checking is also performed by collecting type information. The frontend then generates an intermediate representation or IR of the source code for processing by the middle-end.
The middle end is where optimization takes place. Typical transformations for optimization are removal of useless or unreachable code, discovery and propagation of constant values, relocation of computation to a less frequently executed place (e.g., out of a loop), or specialization of computation based on the context. The middle-end generates another IR for the following backend. Most optimization efforts are focused on this part.
The back end is responsible for translating the IR from the middle-end into assembly code. The target instruction(s) are chosen for each IR instruction. Register allocation assigns processor registers for the program variables where possible. The backend utilizes the hardware by figuring out how to keep parallel execution units busy, filling delay slots, and so on. Although most algorithms for optimization are in NP, heuristic techniques are well-developed.
More in this article, which describes the structure of a compiler, and in this one, which deals with cross compilers.
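As a concrete illustration of the shared front end / per-target back end split, you can feed the same C file to a native gcc and to a cross gcc and compare the generated assembly. This is only a sketch: the cross-compiler name arm-linux-gnueabihf-gcc is just one common packaging, so substitute whatever cross toolchain you actually have installed.

/* add4.c - compare back-end output across targets.
 *
 *   Native x86-64:  gcc -O2 -S add4.c -o add4.x86.s
 *   ARM cross:      arm-linux-gnueabihf-gcc -O2 -S add4.c -o add4.arm.s
 *                   (assumes an ARM cross toolchain installed under that name)
 *
 * Both runs share the front end and middle end; only the back end differs,
 * so the two .s files contain different instructions for the same program.
 */
int add4(int a, int b, int c, int d)
{
    return a + b + c + d;
}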

The http://llvm.org/ project will answer all of your questions in this regard :)
In a nutshell, cross-hardware compilers emit an "intermediate representation" of the code, which is hardware agnostic, and the native toolchain for each target then lowers it to that target's machine code.
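For example, clang (LLVM's C front end) lets you look at that intermediate representation directly. A minimal sketch, assuming a clang build with the relevant back ends:

/* square.c - dump LLVM IR, then lower it to different targets.
 *
 *   clang -O2 -S -emit-llvm square.c -o square.ll    (LLVM IR, textual form)
 *   clang -O2 -S square.c -o square.s                (native assembly)
 *   clang -O2 --target=aarch64-linux-gnu -S square.c (AArch64 assembly, if
 *                                                     that back end is built in)
 *
 * square.ll shows the IR form that the target-specific back end then
 * lowers to machine code for whichever architecture you select.
 */
int square(int x)
{
    return x * x;
}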

Yes, it is possible; that is called a cross compiler. A compiler usually first generates object code, which in the cross case is not executable on the current machine but can be moved to the destination machine. There, the object code is linked with the external libraries of the target machine to produce the final executable.
TL;DR: Yes, the compiler knows the target's machine code, but you can run the compilation on different hardware.
I recommend reading the linked articles for more information.

Every platform has its own toolchain; a toolchain includes gcc, gdb, ld, nm, etc.
Let's take the specific example of gcc. The GCC source code has many layers, including architecture-dependent and architecture-independent parts. The architecture-dependent part contains the code that handles architecture-specific things like the stack layout, function-calling conventions, and floating-point operations. To target a specific architecture such as ARM, you build GCC for that target (a cross compiler). You can see the steps here for reference: http://www.ailis.de/~k/archives/19-arm-cross-compiling-howto.html#toolchain.
This architecture-dependent part is responsible for handling machine-language operations.


Why does GCC compile itself 3 times?

I have compiled GCC from source but I can't seem to fully understand the utility of gcc compiling itself three times.
What benefit does this serve ?
This answer says:
Build new version of GCC with existing C compiler
re-build new version of GCC with the one you just built
(optional) repeat step 2 for verification purposes.
Now my question is: once the first step is complete and the compiler is built, why waste time rebuilding it?
Is it just for verification? If so, it seems pretty wasteful.
Things get more complicated over here,
The build for this is more complex than for prior packages, because you're sending more information into the configure script and the make targets aren't standard.
I mean, the whole compiler is written in C, right? So why not just do everything in one pass?
What is the use of the 3-phase bootstrap ?
Thanks in advance.
Stages 2 and 3 are a good test for the compiler itself: if it can compile itself (and usually also some libraries like libgcc and libstdc++-v3), then it can chew through non-trivial projects.
In stages 2 and 3, you can generate the compiler with different options, for example without optimization (-O0) or with optimization turned on (-O2). As the output / side effects of a program should not depend on the optimization level used, either version of the compiler must produce the same binary for the same source file, even though the two compilers themselves are very different binaries. This is yet another (run-time) test for the compiler.
If you prefer not to bootstrap for some reason, configure with --disable-bootstrap.
Considering the question from an information theory perspective, the first stage in a three stage compilation of a compiler does not produce a compiler. It produces a hypothesis that requires experimental verification. The sign of a good compiler distribution package is that it will produce, out of the box and without further work for the system administrator or compiler developer, a working compiler of the distribution's version and with the desired features of that version of that brand of compiler.
Making that happen is not simple. Consider the variables in the target environment.
Target operating system brand
Operating system version
Operating system settings
Shell environment variables
Availability of headers for inclusion
Availability of libraries for linking
Settings passed to the build process
Architecture of the target processing unit
Number of processing units
Bus architecture
Other characteristics of the execution model
Mistakes the developers of the compiler might make
Mistakes the person building the compiler might make
In the GNU compiler tool set, and in many tarball distributions, the program "configure" attempts to produce a build configuration that adapts to as many of the permutations of these as is reasonably possible. The completion without error or warning from configure is not a guarantee that the compiler will function. Furthermore, and more importantly for this question, the completion of the build is no guarantee either.
The newly built compiler may function for HelloWorld.c but not for a collection of a thousand source files in a multi-project, multi-repository collection of software called, "Intelligent Interplanetary Control and Acquisition System."
Stage two and three are reasonable attempts at checking at least some of the compiler capabilities, since the compiler source itself is handy and demands quite a bit out of the hypothetically working compiler just built.
It is important to understand that the result of stage one and the result of stage two will not match. Their executables and other built artifacts are results from two different compilers. The stage one result is compiled with whatever the build system found in one of the directories listed in the "PATH" variable to compile C and C++ source code. The stage two result is compiled with the hypothetically working new compiler. The interesting probabilistic consideration is this:
If the result of using stage one's result to compile the compiler again equals exactly the result of using stage two's result to compile the compiler a third time, then both are likely correct for at least the features that the compiler's source code requires.
That last sentence may need to be reread a dozen times. It's actually a simple idea, but the redundancy of the verb compile and the noun compiler can tie a knot that takes a few minutes to untie and be able to retie. The source, the target, and the action executed have the same linguistic root, not just once but three times.
The build instructions for the compiler, as of May 25th, 2020, state the converse, which is easier to understand but merely anecdotal, not getting at the crux of the reason three stages are important.
If the comparison of stage2 and stage3 fails, this normally indicates that the stage2 compiler has compiled GCC incorrectly, and is therefore a potentially serious bug which you should investigate and report.
If we consider C/C++ development from a reliability assessment, test-first, eXtreme Programming, 6-Sigma, or Total Quality Management perspective, what component in a C/C++ development environment has to be more reliable than the compiler? Not many. And even the three stage bootstrapping of a compiler that the GNU compiler package has been using since early days is a reasonable but not an exhaustive test. That's why there are additional tests in the package.
From a continuous integration point of view, the entire body of software under development by those that are about to use the new compiler should be tested before and after a new compiler is compiled and deployed. That's the most convenient way to ensure the new compiler didn't break the build.
With these three reliability checkpoints, most people are satisfied:
Ensuring the compiler compiles itself consistently
Other tests the compiler developers have put into their distribution
The developer or system administrators source code domain is not broken by the upgrade
On a mathematical side note, it is actually impossible to exhaustively test a compiler with the silicon and carbon available on planet Earth. The bounds of recursion in C++ language abstractions (among other things) are infinite, so the silicon or time required to test every permutation of source code cannot realistically exist. On the carbon side, no group of people can free up the requisite time to study the source sufficiently to guarantee that some finite limit is not imposed in some way by the compiler source.
The three levels of checks, only one of which is the three stage bootstrap process, will likely suffice for most of us.
A further benefit of the three-stage compile is that the new compiler is compiled with the new compiler, which is presumably better in terms of speed or resource consumption, and possibly both.

ARM softfp vs hardfp performance

I have an ARM based platform with a Linux OS. Even though its gcc-based toolchain supports both hardfp and softfp, the vendor recommends using softfp and the platform is shipped with a set of standard and platform-related libraries which have only softfp version.
I'm writing computation-intensive (NEON) AI code based on OpenCV and TensorFlow Lite. Following the vendor guide, I have built these with the softfp option. However, I have a feeling that my code underperforms compared to other broadly similar hardfp platforms.
Does the code performance depend on softfp/hardfp setting? Do I understand it right that all .o and .a files the compiler makes to build my program are also using softfp convention, which is less effective? If it does, are there any tricky ways to use hardfp calling convention internally but softfp for external libraries?
Normally, all objects that are linked together need to use the same float ABI. So if you need to use this softfp-only library, I'm afraid you have to compile your own software with softfp too.
I had the same question about mixing ABIs. See here
Regarding the performance: what you lose with softfp compared to hardfp is that floating-point function parameters are passed through the usual core registers instead of FPU registers. This requires some additional copying between registers. As old_timer said, it is impossible to evaluate the performance loss in general. If you have a single huge function with many float operations, the performance will be the same. If you have many small function calls with many floating-point arguments and few operations, the performance will be dramatically slower.
The softfp option only affects the parameter passing.
In other words, unless you are passing lots of float type arguments while calling functions, there won't be any measurable performance hit compared to hardfp.
And since well-designed projects heavily rely on passing pointers to structures instead of many single values, I would stick to softfp.
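To make the calling-convention difference concrete, here is a minimal sketch of a function with float parameters (register behaviour per the 32-bit ARM AAPCS; the compiler flags are real, but use whatever cross gcc your vendor ships):

/* scale.c - float argument passing under the two ARM float ABIs.
 *
 *   softfp build:  gcc -mfloat-abi=softfp -mfpu=neon -O2 -S scale.c
 *   hardfp build:  gcc -mfloat-abi=hard   -mfpu=neon -O2 -S scale.c
 *
 * softfp: x and y arrive in the core registers r0 and r1, so the callee
 *         first moves them into VFP/NEON registers before computing.
 * hardfp: x and y arrive directly in the VFP registers s0 and s1,
 *         so no extra register moves are needed at the call boundary.
 */
float scale(float x, float y)
{
    return x * y + 1.0f;
}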

Default GCC optimization options for a specific architecture

Our compilers course features exercises asking us to compare code built with the -O and -O3 gcc options. The code generated on my machine isn't the same as the code in the course. Is there a way to figure out the optimization options used in the course, in order to obtain the same code on my machine and make more meaningful observations?
I found how to list the optimization options on my machine:
$ gcc -O3 -Q --help=optimizers
But is there a way to deduce those on my professor's machine, other than trying them all and modifying them one by one (.ident "GCC: (Debian 4.3.2-1.1) 4.3.2")?
Thanks for your attention.
Edit:
I noticed that the code generated on my machine lacks the prologue and epilogue generated on my professor's. Is there an option to force prologue generation (google doesn't seem to bring much)?
Here's what you need to know about compiler optimizations: they are architecture dependent. They also differ from one version of the compiler to another (gcc-4.9 does more things by default than gcc-4.4).
By architecture, I mean CPU microarchitecture (Intel: Nehalem, Sandy Bridge, Ivy Bridge, Haswell, KNC, ...; AMD: Bobcat, Bulldozer, Jaguar, ...). Compilers usually convert input code (C, C++, Ada, ...) into a CPU-agnostic intermediate representation (GIMPLE for GCC) on which a large number of optimizations are performed. After that, the compiler generates a lower-level representation closer to assembly. On the latter, architecture-specific optimizations are applied. Such optimizations include choosing the instructions with the lowest latencies, determining loop unroll factors depending on the loop size, the instruction cache size, and so on.
Since your generated code is different from the one you got in class, I suppose the underlying architectures are different. In that case, even with the same compiler flags you won't be able to get the same assembly code (even with no optimizations you'll get different assembly).
Instead, you should concentrate on comparing the optimized and non-optimized code rather than trying to match exactly what you were given in class. I even think it's a great reverse-engineering exercise to compare your optimized code to the one you were given.
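For example, one way to do that comparison on a small, self-contained function (a sketch; any short loop will do):

/* sum.c - compare what -O and -O3 do to the same loop.
 *
 *   gcc -O  -S sum.c -o sum.O1.s
 *   gcc -O3 -S sum.c -o sum.O3.s
 *   diff sum.O1.s sum.O3.s
 *
 * At -O3 you will typically see the loop unrolled and/or vectorized;
 * the exact output still depends on your gcc version and target CPU,
 * which is why it won't match the course handout exactly.
 */
int sum(const int *a, int n)
{
    int s = 0;
    for (int i = 0; i < n; i++)
        s += a[i];
    return s;
}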
You can find one of my earlier posts about compiler optimizations in here.
Two great books on the subject are the Dragon Book (Compilers: Principles, Techniques, and Tools) by Aho, Sethi, and Ullman, and Engineering a Compiler by Keith Cooper and Linda Torczon.

Does a compiler always produce an assembly code?

From Thinking in C++ - Vol 1:
In the second pass, the code generator walks through the parse tree
and generates either assembly language code or machine code for the
nodes of the tree.
Well, at least in GCC, if we give the option of generating the assembly code, the compiler obeys by creating a file containing assembly code. But when we simply run the command gcc without any options, does it not produce the assembly code internally?
If yes, then why does it need to first produce an assembly code and then translate it to machine language?
TL;DR: different object file formats / easier portability to new Unix platforms (historically) is one of the main reasons for gcc keeping the assembler separate from the compiler, I think. Outside of gcc, the mainstream x86 C and C++ compilers (clang/LLVM, MSVC, ICC) go straight to machine code, with the option of printing asm text if you ask them to.
LLVM and MSVC are / come with complete toolchains, not just compilers (they also come with an assembler and linker). LLVM already has object-file handling as a library function, so it can use that instead of writing out asm text to feed to a separate program.
Smaller projects often choose to leave object-file format details to the assembler. e.g. FreePascal can go straight to an object file on a few of its target platforms, but otherwise only to asm. There are many claims (1, 2, 3, 4) that almost all compilers go through asm text, but that's not true for many of the biggest most-widely-used compilers (except GCC) that have lots of developers working on them.
C compilers tend to either target a single platform only (like a vendor's compiler for a microcontroller) and were written as "the/a C implementation for this platform", or be very large projects like LLVM where including machine code generation isn't a big fraction of the compiler's own code size. Compilers for less widely used languages are more usually portable, but without wanting to write their own machine-code / object-file handling. (Many compilers these days are front-ends for LLVM, so get .o output for free, like rustc, but older compilers didn't have that option.)
Out of all compilers ever, most do go to asm. But if you weight by how often each one is used every day, going straight to a relocatable object file (.o / .obj) accounts for a significant fraction of the total builds done on any given day worldwide. i.e. the compiler you care about if you're reading this might well work this way.
Also, compilers like javac that target a portable bytecode format have less reason to use asm; the same output file and bytecode format work across every platform they have to run on.
Related:
https://retrocomputing.stackexchange.com/questions/14927/when-and-why-did-high-level-language-compilers-start-targeting-assembly-language on retrocomputing has some other answers about advantages of keeping as separate.
What is the need to generate ASM code in gcc, g++
What do C and Assembler actually compile to? - even compilers that go straight to machine code don't produce linked executables directly, they produce relocatable object files (.o or .obj). Except for tcc, the Tiny C Compiler, intended for use on the fly for one-file C programs.
Semi-related: Why do we even need assembler when we have compiler? asm is useful for humans to look at machine code, not as a necessary part of C -> machine code.
Why GCC does what it does
Yes, as is a separate program that the gcc front-end actually runs separately from cc1 (the C preprocessor+compiler that produces text asm).
This makes gcc slightly more modular, making the compiler itself a text -> text program.
GCC internally uses some binary data structures for GIMPLE and RTL internal representations, but it doesn't write (text representations of) those IR formats to files unless you use a special option for debugging.
So why stop at assembly? This means GCC doesn't need to know about different object file formats for the same target. For example, different x86-64 OSes use ELF, PE/COFF, MachO64 object files, and historically a.out. as assembles the same text asm into the same machine code surrounded by different object file metadata on different targets. (There are minor differences gcc has to know about, like whether to prepend an _ to symbol names or not, and whether 32-bit absolute addresses can be used, and whether code has to be PIC.)
Any platform-specific quirks can be left to GNU binutils as (aka GAS), or gcc can use the vendor-supplied assembler that comes with a system.
Historically, there were many different Unix systems with different CPUs, or especially the same CPU but different quirks in their object file formats, yet with a fairly compatible set of assembler directives like .globl main, .asciiz "Hello World!\n", and similar. GAS syntax comes from Unix assemblers.
It really was possible in the past to port GCC to a new Unix platform without porting as, just using the assembler that comes with the OS.
Nobody has ever gotten around to integrating an assembler as a library into GCC's cc1 compiler. That's been done for the C preprocessor (which historically was also done in a separate process), but not the assembler.
Most other compilers do produce object files directly from the compiler, without a text asm temporary file / pipe. Often because the compiler was only designed for one or a couple targets, like MSVC or ICC or various compilers that started out as x86-only, or many vendor-supplied compilers for embedded chips.
clang/LLVM was designed much more recently than GCC. It was designed to work as an optimizing JIT back-end, so it needed a built-in assembler to make it fast to generate machine code. To work as an ahead-of-time compiler, adding support for different object-file formats was presumably a minor thing since the internal software architecture was there to go straight to binary machine code.
LLVM of course uses LLVM-IR internally for target-independent optimizations before looking for back-end-specific optimizations, but again it only writes out this format as text if you ask it to.
The assembler stage can be justified by two reasons:
it allows C/C++ code to be translated to a machine-independent abstract assembly, from which there exist easy conversions to a multitude of different instruction set architectures
it takes away the burden of validating correct opcode, prefix, r/m, etc. instruction encoding for CISC architectures, when one can utilize an existing software component.
The 1st edition of that book is from 2000, but it may as well be talking about the early 90's, when C++ itself was translated to C and when the GNU/free software idea (including source code for compilers) was not really widespread.
EDIT: One of several nonsensical abstract machine independent languages used by GCC is RTL -- Register Transfer Language.
It's a matter of compiler implementation. Assembly code is an intermediate step between higher-level language (the one being compiled) and the resulting binary output. In general it's easier first to convert to assembly and after that to binary code instead of directly creating the binary code.
Gcc does create the assembly code as a temporary file, calls the assembler, and maybe the linker depending on what you do or don't add on the command line. That makes an object file and then, if enabled, the binary; then all the temporary files are cleaned up. Use -save-temps to see what is really going on (there are a number of temporary files).
Running gcc without any options absolutely creates an asm file.
There is no "need" for this, it is simply how they happened to design it. I assume for multiple reasons, you will already want/need an assembler and linker before you start on a compiler (cart before the horse, asm on a processor before some other language). "The unix way" is to not re-invent tools or libraries, but just add a little on top, so that would imply going to asm then letting the assembler and linker do the rest. You dont have to re-invent so much of the assemblers job that way (multiple passes, resolving labels, etc). It is easier for a developer to debug ascii asm than bits. Folks have been doing it this way for generations of compilers. Just in time compilers are the primary exception to this habit, by definition they have to be able to go to machine code, so they do or can. Only recently though did llvm provide a way for the command line tools (llc) to go straight to object without stopping at asm (or at least it appears that way to the user).

How does Go compile so quickly?

I've Googled and poked around the Go website, but I can't find an explanation for Go's extraordinary build times. Are they products of the language features (or lack thereof), a highly optimized compiler, or something else? I'm not trying to promote Go; I'm just curious.
Dependency analysis.
The Go FAQ used to contain the following sentence:
Go provides a model for software construction that makes dependency analysis easy and avoids much of the overhead of C-style include files and libraries.
While the phrase is not in the FAQ anymore, this topic is elaborated upon in the talk Go at Google, which compares the dependency analysis approach of C/C++ and Go.
That is the main reason for fast compilation. And this is by design.
I think it's not that Go compilers are fast, it's that other compilers are slow.
C and C++ compilers have to parse enormous amounts of headers - for example, compiling a C++ "hello world" requires compiling 18k lines of code, which is almost half a megabyte of source!
$ cpp hello.cpp | wc
18364 40513 433334
Java and C# compilers run in a VM, which means that before they can compile anything, the operating system has to load the whole VM, then they have to be JIT-compiled from bytecode to native code, all of which takes some time.
Speed of compilation depends on several factors.
Some languages are designed to be compiled fast. For example, Pascal was designed to be compiled using a single-pass compiler.
Compilers themselves can be optimized too. For example, the Turbo Pascal compiler was written in hand-optimized assembly, which, combined with the language design, resulted in a really fast compiler running on 286-class hardware. I think that even now, modern Pascal compilers (e.g. FreePascal) are faster than Go compilers.
There are multiple reasons why the Go compiler is much faster than most C/C++ compilers:
Top reason: Most C/C++ compilers exhibit exceptionally bad designs (from compilation speed perspective). Also, from compilation speed perspective, some parts of the C/C++ ecosystem (such as editors in which programmers are writing their code) aren't designed with speed-of-compilation in mind.
Top reason: Fast compilation speed was a conscious choice in the Go compiler and also in the Go language
The Go compiler has a simpler optimizer than C/C++ compilers
Unlike C++, Go has no templates and no inline functions. This means that Go doesn't need to perform any template or function instantiation.
The Go compiler generates low-level assembly code sooner and the optimizer works on the assembly code, while in a typical C/C++ compiler the optimization passes work on an internal representation of the original source code. The extra overhead in the C/C++ compiler comes from the fact that the internal representation needs to be generated.
Final linking (5l/6l/8l) of a Go program can be slower than linking a C/C++ program, because the Go compiler is going through all of the used assembly code and maybe it is also doing other extra actions that C/C++ linkers aren't doing
Some C/C++ compilers (GCC) generate instructions in text form (to be passed to the assembler), while the Go compiler generates instructions in binary form. Extra work (but not much) needs to be done in order to transform the text into binary.
The Go compiler targets only a small number of CPU architectures, while the GCC compiler targets a large number of CPUs
Compilers which were designed with the goal of high compilation speed, such as Jikes, are fast. On a 2GHz CPU, Jikes can compile 20000+ lines of Java code per second (and the incremental mode of compilation is even more efficient).
Compilation efficiency was a major design goal:
Finally, it is intended to be fast: it should take at most a few seconds to build a large executable on a single computer. To meet these goals required addressing a number of linguistic issues: an expressive but lightweight type system; concurrency and garbage collection; rigid dependency specification; and so on. FAQ
The language FAQ is pretty interesting in regards to specific language features relating to parsing:
Second, the language has been designed to be easy to analyze and can be parsed without a symbol table.
While most of the above is true, there is one very important point that was not really mentioned: dependency management.
Go only needs to include the packages that you are importing directly (as those already imported what they need). This is in stark contrast to C/C++, where every single file starts by including x headers, which include y headers, etc. Bottom line: Go's compilation takes linear time w.r.t. the number of imported packages, whereas C/C++ takes exponential time.
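A small C sketch of why that matters (the file names are made up for illustration): including one header transitively drags in everything it includes, and the preprocessor re-reads all of it for every translation unit, whereas a Go import only reads that package's compiled export data.

/* widget.h - made-up header that itself pulls in system headers,
 * which pull in further headers, and so on. */
#include <stdio.h>
#include <stdlib.h>

struct widget { char name[32]; int data[16]; };

/* main.c - "includes one header", but the preprocessed output
 * (try: cc -E main.c | wc -l) contains the whole transitive closure,
 * and the same work is repeated for every .c file that includes widget.h. */
#include "widget.h"

int main(void)
{
    struct widget w = {0};
    (void)w;
    return 0;
}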
A good test for the translation efficiency of a compiler is self-compilation: how long does it take a given compiler to compile itself? For C++ it takes a very long time (hours?). By comparison, a Pascal/Modula-2/Oberon compiler would compile itself in less than one second on a modern machine [1].
Go has been inspired by these languages, but some of the main reasons for this efficiency include:
A clearly defined syntax that is mathematically sound, for efficient scanning and parsing.
A type-safe and statically-compiled language that uses separate compilation with dependency and type checking across module boundaries, to avoid unnecessary re-reading of header files and re-compiling of other modules - as opposed to independent compilation like in C/C++ where no such cross-module checks are performed by the compiler (hence the need to re-read all those header files over and over again, even for a simple one-line "hello world" program).
An efficient compiler implementation (e.g. single-pass, recursive-descent top-down parsing) - which of course is greatly helped by points 1 and 2 above.
These principles have already been known and fully implemented in the 1970s and 1980s in languages like Mesa, Ada, Modula-2/Oberon and several others, and are only now (in the 2010s) finding their way into modern languages like Go (Google), Swift (Apple), C# (Microsoft) and several others.
Let's hope that this will soon be the norm and not the exception. To get there, two things need to happen:
First, software platform providers such as Google, Microsoft and Apple should start by encouraging application developers to use the new compilation methodology, while enabling them to re-use their existing code base. This is what Apple is now trying to do with the Swift programming language, which can co-exist with Objective-C (since it uses the same runtime environment).
Second, the underlying software platforms themselves should eventually be re-written over time using these principles, while simultaneously redesigning the module hierarchy in the process to make them less monolithic. This is of course a mammoth task and may well take the better part of a decade (if they are courageous enough to actually do it - which I am not at all sure in the case of Google).
In any case, it's the platform that drives language adoption, and not the other way around.
References:
[1] http://www.inf.ethz.ch/personal/wirth/ProjectOberon/PO.System.pdf, page 6: "The compiler compiles itself in about 3 seconds". This quote is for a low cost Xilinx Spartan-3 FPGA development board running at a clock frequency of 25 MHz and featuring 1 MByte of main memory. From this one can easily extrapolate to "less than 1 second" for a modern processor running at a clock frequency well above 1 GHz and several GBytes of main memory (i.e. several orders of magnitude more powerful than the Xilinx Spartan-3 FPGA board), even when taking I/O speeds into account. Already back in 1990 when Oberon was run on a 25MHz NS32X32 processor with 2-4 MBytes of main memory, the compiler compiled itself in just a few seconds. The notion of actually waiting for the compiler to finish a compilation cycle was completely unknown to Oberon programmers even back then. For typical programs, it always took more time to remove the finger from the mouse button that triggered the compile command than to wait for the compiler to complete the compilation just triggered. It was truly instant gratification, with near-zero wait times. And the quality of the produced code, even though not always completely on par with the best compilers available back then, was remarkably good for most tasks and quite acceptable in general.
Go was designed to be fast, and it shows.
Dependency Management: no header file, you just need to look at the packages that are directly imported (no need to worry about what they import) thus you have linear dependencies.
Grammar: the grammar of the language is simple, and thus easily parsed. Also, the number of features is reduced, so the compiler code itself is tight (few paths).
No overload allowed: you see a symbol, you know which method it refers to.
It's trivially possible to compile Go in parallel because each package can be compiled independently.
Note that Go isn't the only language with such features (modules are the norm in modern languages), but they did it well.
Quoting from the book "The Go Programming Language" by Alan Donovan and Brian Kernighan:
Go compilation is notably faster than most other compiled languages, even when building from scratch. There are three main reasons for the compiler’s speed. First, all imports must be explicitly listed at the beginning of each source file, so the compiler does not have to read and process an entire file to determine its dependencies. Second, the dependencies of a package form a directed acyclic graph, and because there are no cycles, packages can be compiled separately and perhaps in parallel. Finally, the object file for a compiled Go package records export information not just for the package itself, but for its dependencies too. When compiling a package, the compiler must read one object file for each import but need not look beyond these files.
The basic idea of compilation is actually very simple. A recursive-descent parser, in principle, can run at I/O bound speed. Code generation is basically a very simple process. A symbol table and basic type system is not something that requires a lot of computation.
However, it is not hard to slow down a compiler.
If there is a preprocessor phase, with multi-level include directives, macro definitions, and conditional compilation, as useful as those things are, it is not hard to load it down. (For one example, I'm thinking of the Windows and MFC header files.) That is why precompiled headers are necessary.
In terms of optimizing the generated code, there is no limit to how much processing can be added to that phase.
Simply (in my own words): because the syntax is very easy (to analyze and to parse).
For instance, having no type inheritance means no problematic analysis to find out whether a new type follows the rules imposed by the base type.
For instance, with interfaces the compiler doesn't go and check whether the intended type implements a given interface while analyzing that type. Only when (and IF) it is used is the check performed.
Another example: the compiler tells you if you declare a variable and don't use it (or if you are supposed to hold a return value and you don't).
The following doesn't compile:
package main
func main() {
    var a int
    a = 0
}
notused.go:3: a declared and not used
These kinds of enforcements and principles make the resulting code safer, and the compiler doesn't have to perform extra validations that the programmer can do.
By and large, all these details make the language easier to parse, which results in fast compilations.
Again, in my own words.
Go imports dependencies once for all files, so the import time doesn't increase exponentially with project size.
A simpler language means interpreting it takes less computing.
What else?
