Reason for huge size of compiled executable of Go - go

I compiled a hello world Go program, which generated a native executable on my Linux machine. But I was surprised by the size of such a simple Hello world program: it was 1.9 MB!
Why is it that the executable of such a simple program in Go is so huge?

This exact question appears in the official FAQ: Why is my trivial program such a large binary?
Quoting the answer:
The linkers in the gc tool chain (5l, 6l, and 8l) do static linking. All Go binaries therefore include the Go run-time, along with the run-time type information necessary to support dynamic type checks, reflection, and even panic-time stack traces.
A simple C "hello, world" program compiled and linked statically using gcc on Linux is around 750 kB, including an implementation of printf. An equivalent Go program using fmt.Printf is around 1.9 MB, but that includes more powerful run-time support and type information.
So the native executable of your Hello World is 1.9 MB because it contains a runtime which provides garbage collection, reflection and many other features (which your program might not really use, but they are there), as well as the implementation of the fmt package which you used to print the "Hello World" text (plus its dependencies).
Now try the following: add another fmt.Println("Hello World! Again") line to your program and compile it again. The result will not be 2x 1.9MB, but still just 1.9 MB! Yes, because all the used libraries (fmt and its dependencies) and the runtime are already added to the executable (and so just a few more bytes will be added to print the 2nd text which you just added).

Consider the following program:
package main
import "fmt"
func main() {
	fmt.Println("Hello World!")
}
If I build this on my Linux AMD64 machine (Go 1.9), like this:
$ go build
$ ls -la helloworld
-rwxr-xr-x 1 janf group 2029206 Sep 11 16:58 helloworld
I get a binary that is about 2 MB in size.
The reason for this (which has been explained in other answers) is that we are using the "fmt" package, which is quite large, but also that the binary has not been stripped, which means the symbol table is still there. If we instead instruct the linker to strip the binary (-s omits the symbol table, -w omits the DWARF debug information), it will become much smaller:
$ go build -ldflags "-s -w"
$ ls -la helloworld
-rwxr-xr-x 1 janf group 1323616 Sep 11 17:01 helloworld
However, if we rewrite the program to use the builtin function print, instead of fmt.Println, like this:
package main
func main() {
	print("Hello World!\n")
}
And then compile it:
$ go build -ldflags "-s -w"
$ ls -la helloworld
-rwxr-xr-x 1 janf group 714176 Sep 11 17:06 helloworld
We end up with an even smaller binary. This is as small as we can get it without resorting to tricks like UPX-packing, so the overhead of the Go runtime is roughly 700 kB.

Note that the binary size issue is tracked by issue 6853 in the golang/go project.
For instance, commit a26c01a (for Go 1.4) cut hello world by 70kB:
because we don't write those names into the symbol table.
Considering the compiler, assembler, linker, and runtime for 1.5 will be entirely in Go, you can expect further optimization.
Update 2016 Go 1.7: this has been optimized: see "Smaller Go 1.7 binaries".
But these days (April 2019), what takes up the most space is runtime.pclntab.
See "Why are my Go executable files so large? Size visualization of Go executables using D3" from Raphael ‘kena’ Poss.
It is not too well documented; however, this comment from the Go source code suggests its purpose:
// A LineTable is a data structure mapping program counters to line numbers.
The purpose of this data structure is to enable the Go runtime system to produce descriptive stack traces upon a crash or upon internal requests via the runtime.GetStack API.
So it seems useful. But why is it so large?
The URL https://golang.org/s/go12symtab hidden in the aforelinked source file redirects to a document that explains what happened between Go 1.0 and 1.2. To paraphrase:
prior to 1.2, the Go linker was emitting a compressed line table, and the program would decompress it upon initialization at run-time.
in Go 1.2, a decision was made to pre-expand the line table in the executable file into its final format suitable for direct use at run-time, without an additional decompression step.
In other words, the Go team decided to make executable files larger to save up on initialization time.
Also, looking at the data structure, it appears that its overall size in compiled binaries is super-linear in the number of functions in the program, in addition to how large each function is.

Related

Easy to read Golang assembly output?

I'm interested in examining the x86 assembly output of the standard Go compiler to see if my code is really being converted into reasonably efficient assembly code; hopefully, by profiling and examining the assembly output, I could get a clue as to where/how I should rewrite my Go code for maximum performance. But when I examine the code using the -S flag, Go spits out a mess! I'd like two things:
Is there a way to make the Go compiler dump the assembly output into a file, not just print it out on Terminal?
Also, is there a way to make the Go compiler separate out the assembly code into separate functions, with labels? I know some functions may be inlined and hence not appear in the assembly code. What I'm seeing now is just a homogeneous blob of assembly which is almost impossible to understand.
You can redirect the output to a file like this:
go tool compile -S file.go > file.s
You can disable the optimization with -N:
go tool compile -S -N file.go
Alternatively, you can use gccgo:
gccgo -S -O0 -masm=intel test.go
which will generate test.s. You can play with the -O0/1/2/3 to see the different optimizations.
I don't recommend using the output of -S as the Go linker can change what gets written to the object code quite a lot. It does give you some idea as to what is going on.
The Go assembler output is rather non-standard too.
When I want to do this I always use objdump, which will give you a nice standard assembler output.
E.g. for x86 / amd64:
objdump -d executable > disassembly
And for ARM (to get the register names to be the same as Go uses)
objdump -M reg-names-raw -d executable > disassembly
Run go tool objdump on the resulting executable file.
To restrict the output to interesting functions, use its -s option.
To dump the output to file:
go tool objdump EXECUTABLE_FILE > ASSEMBLY_FILE
If you want to include the source Go code (assuming you have a working golang setup, and you built the executable yourself):
go tool objdump -S EXECUTABLE_FILE
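For trying these commands, a tiny program with one clearly identifiable function is handy. This is a sketch under the assumption that the standard gc toolchain is used: the //go:noinline directive asks the compiler not to inline add, so it should show up under its own symbol (main.add) instead of being folded into main. The function name add is made up for this example.

```go
package main

import "fmt"

// add is deliberately trivial. Without the directive below the
// compiler would most likely inline it into main, and it would then
// have no separate listing in the disassembly.
//
//go:noinline
func add(a, b int) int {
	return a + b
}

func main() {
	fmt.Println(add(2, 3))
}
```

After building it (for example with go build -o demo), something like go tool objdump -s 'main\.add' demo should restrict the output to that one function.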
To make the output even easier to look at I use a small hacky wrapper that produces the following (in a nutshell, it colorizes instructions that alter the control flow -blue for jumps, green for call/return, red for traps, violet for padding- and adds new lines after unconditional control flow jumps):
If you use the wrapper above you will likely want to use the -R switch when piping to less (or by adding it to the environment, e.g. in .bashrc: export LESS="$LESS -R"):
go-objdump EXECUTABLE_FILE | less -R
Alternatively, there is godbolt.org that has probably the most readable output and allows you to switch between compilers (gc, gccgo) and versions very easily.
I had problems with the other answers as the assembly produced provided much more information than I wanted and still not enough details. Let me explain: it provided the assembly for all libraries imported by go internally and did not provide the lines of where my code was (my code was all at the bottom of the file)
Here is what I found from the official docs:
$ GOOS=linux GOARCH=amd64 go tool compile -S x.go # or: go build -gcflags -S x.go
File:
package main
func main() {
	println(3)
}
Produces:
--- prog list "main" ---
0000 (x.go:3) TEXT main+0(SB),$8-0
0001 (x.go:3) FUNCDATA $0,gcargs·0+0(SB)
0002 (x.go:3) FUNCDATA $1,gclocals·0+0(SB)
0003 (x.go:4) MOVQ $3,(SP)
0004 (x.go:4) PCDATA $0,$8
0005 (x.go:4) CALL ,runtime.printint+0(SB)
0006 (x.go:4) PCDATA $0,$-1
0007 (x.go:4) PCDATA $0,$0
0008 (x.go:4) CALL ,runtime.printnl+0(SB)
0009 (x.go:4) PCDATA $0,$-1
0010 (x.go:5) RET ,
So what I did was basically:
go tool compile -S hello.go > hello.s
and I got the result I wanted!
A recent alternative would be loov/lensm, which can view assembly and source.
(From Egon Elbre)
To run the program, provide a regular expression filter for the symbol you want to inspect.
-watch automatically reloads the executable and information when it changes.
lensm -watch -filter Fibonacci lensm
Note: The program requires a binary that is built on your computer, otherwise the source code for the functions cannot be loaded.
That could be a nice addition to godbolt.org
The easiest way I found for Mac users with the Xcode dev tools is otool:
$ otool -tV <executable>

Is there a method/function to get the code size of a C program compiled using GCC compiler? (may vary when some optimization technique is applied)

Can I measure the code size with the help of an fseek() function and store it to a shell variable?
Is it possible to extract the code size, compilation time and execution time using milepost gcc or a GNU Profiler tool? If yes, how to store them into shell variables?
Since my aim is to find the best set of optimization technique upon the basis of the compilation time, execution time and code size, I will be expecting some function that can return these parameters.
MyPgm=/root/Project/Programs/test.c
gcc -Wall -O1 -fauto-inc-dec $MyPgm -o output
/usr/bin/time -f "%e" -o Output.log ./output
while read line;
do
    echo -e "$line";
    Val=$line
done<Output.log
This will store the execution time to the variable Val. Similarly, I want to get the values of code size as well as compilation time.
I will prefer something that I can do to accomplish this, without using an external program!
For code size on Linux, you can use the size command in a terminal.
$ size file-name.out
It will give the size of the different sections. Use the text section for code size. You can add data and bss if you want to consider global data size as well.
You can use the size(1) command http://www.linuxmanpages.com/man1/size.1.php
Or open the ELF file, walk over section headers and sum the sizes of all the section with type SHT_PROGBITS and the SHF_EXECINSTR flag set.
On non-Linux / non-GNU-utils systems (where you may have neither GNU size nor readelf), the nm program can be used to dump symbol information (including sizes) from object files (libraries / executables). The syntax is slightly system-dependent:
OpenGroup manpage for nm (the "portable subset")
Linux/BSD manpage for nm (GNU version)
Solaris manpage for nm
AIX manpage for nm
nm usage on HP/UX (this says "PA-RISC" but the utility is present / usable on Itanium)
Windows: Doesn't have nm as such, but see: Microsoft equivalent of the nm command
Unfortunately, while the utility is available almost everywhere, its output format is not as portable as could be, so some system-specific scripting is necessary.

How to force gcc to link like g++?

In this episode of "let's be stupid", we have the following problem: a C++ library has been wrapped with a layer of code that exports its functionality in a way that allows it to be called from C. This results in a separate library that must be linked (along with the original C++ library and some object files specific to the program) into a C program to produce the desired result.
The tricky part is that this is being done in the context of a rigid build system that was built in-house and consists of literally dozens of include makefiles. This system has a separate step for the linking of libraries and object files into the final executable but it insists on using gcc for this step instead of g++ because the program source files all have a .c extension, so the result is a profusion of undefined symbols. If the command line is manually pasted at a prompt and g++ is substituted for gcc, then everything works fine.
There is a well-known (to this build system) make variable that allows flags to be passed to the linking step, and it would be nice if there were some incantation that could be added to this variable that would force gcc to act like g++ (since both are just driver programs).
I have spent quality time with the gcc documentation searching for something that would do this but haven't found anything that looks right, does anybody have suggestions?
Considering such a terrible build system, write a wrapper around gcc that execs gcc or g++ depending on the arguments. Replace /usr/bin/gcc with this script, or modify your PATH to use this script in preference to the real binary.
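#!/bin/sh
# Dispatch to gcc or g++ depending on the first argument.
# "=" is the portable string comparison in sh; "$@" keeps arguments with spaces intact.
if [ "$1" = "wibble wobble" ]
then
    exec /usr/bin/gcc-4.5 "$@"
else
    exec /usr/bin/g++-4.5 "$@"
fi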
The problem is that C compilation produces object files with plain, unmangled symbol names, while C++ compilation produces object files with C++ name mangling.
Your best bet is to use
extern "C"
before declarations in your C++ builds, and no prefix on your C builds.
You can detect C++ using
#ifdef __cplusplus
Many thanks to bmargulies for his comment on the original question. By comparing the output of running the link line with both gcc and g++ using the -v option and doing a bit of experimenting, I was able to determine that "-lstdc++" was the magic ingredient to add to my linking flags (in the appropriate order relative to other libraries) in order to avoid the problem of undefined symbols.
For those of you who wish to play "let's be stupid" at home, I should note that I have avoided any use of static initialization in the C++ code (as is generally wise), so I wasn't forced to compile the translation unit containing the main() function with g++ as indicated in item 32.1 of FAQ-Lite (http://www.parashift.com/c++-faq-lite/mixing-c-and-cpp.html).

Can I prevent debugger from stepping into Boost or STL header files?

I'm using Qt Creator with gdb to debug my C++ code on a Linux Platform. Whenever I use a boost::shared_ptr or the like, the debugger steps into the header files containing the boost implementation (i.e. /usr/include/boost/shared_ptr.hpp). I would like to ignore these files in terms of debugging and simply step over them. I know that I can step out as soon as it reaches one of these files, but it would be much easier to debug without doing so several times per debugging session.
I'm using the gcc compiler (g++), running on OpenSuSE Linux 11.2 with QtCreator 2.2 (which uses gdb as the debugger.)
Edit to add: The question is geared toward Boost files, but could also apply toward STL files as well.
GDB without stepping into STL and all other libraries in /usr:
Put the following in your .gdbinit file. It searches through the sources that gdb has loaded or will potentially load (gdb command info sources), and skips them when their absolute path starts with "/usr". It's hooked to the run command, because symbols might get reloaded when executing it.
# skip all STL source files
define skipstl
python
# get all sources loadable by gdb
def GetSources():
    sources = []
    for line in gdb.execute('info sources',to_string=True).splitlines():
        if line.startswith("/"):
            sources += [source.strip() for source in line.split(",")]
    return sources
# skip files of which the (absolute) path begins with 'dir'
def SkipDir(dir):
    sources = GetSources()
    for source in sources:
        if source.startswith(dir):
            gdb.execute('skip file %s' % source, to_string=True)
# apply only for c++
if 'c++' in gdb.execute('show language', to_string=True):
    SkipDir("/usr")
end
end
define hookpost-run
    skipstl
end
To check the list of files to be skipped, set a breakpoint somewhere (e.g., break main) and run the program (e.g., run), then check with info skip upon reaching the breakpoint:
(gdb) info skip
Num Type Enb What
1 file y /usr/include/c++/5/bits/unordered_map.h
2 file y /usr/include/c++/5/bits/stl_set.h
3 file y /usr/include/c++/5/bits/stl_map.h
4 file y /usr/include/c++/5/bits/stl_vector.h
...
It's easy to extend this to skip other directories as well by adding a call to SkipDir(<some/absolute/path>).
gdb is scriptable. It has while, if, variables, shell subcommands, user-defined functions (define) and so on. It also has a Python interface for scripting.
With a bit of work, you can make a gdb script along these lines:
define step-bypass-boost
  step
  while 1
    use "info source", put current source file into variable
    if source file does not match */boost/* then
      break-loop
    end
    step
  end
end
or find out whether somebody has already written such a script.
Instead of doing s (step), you can
set a breakpoint with b on the first line of the function where you want to stop (b Class::method, or b file.cpp:line),
then c.
gdb will bypass the boost code and break at the point given in b, where you want it.
This works but can seem tedious. It's a matter of habit and becomes easier with repetition.
MSVC behaves similarly to gdb.
From https://stackoverflow.com/a/31629136/5155476:
I had this same need. I extended the 'skip' command in gdb to support a new type 'dir'. I can now do this in gdb:
skip dir /usr
and then I'm never stopped in any of my 3rd party headers.
Here's a webpage w/ this info + the patch if it helps anyone: info & patch to skip directories in GDB

How do I strip local symbols from linux kernel module without breaking it?

If I use --strip-debug or --strip-unneeded, nm still lists all function names in the .ko; if I just run strip foo.ko, I get a kernel module that refuses to load.
Does anyone know a quick shortcut to remove all symbols that are not needed for module loading, so that people cannot reverse engineer the APIs as easily?
PS: For all you open source missionaries; this is something that general public will never be using in any case so no need to turn the question into a GPL flame war.
With no answer to my previous questions, here are some guesses that could also be some clues, and a step to an answer:
From what I recall, a .ko is nothing but an .o file resulting from the merge of all the .o files generated by your source module, and the addition of a .modinfo section.
At the end of any .ko building Makefile, there is an LD call: from what I recall, ld is called with the -r option, and this is what creates that .o file that the Makefile calls a .ko. This resulting file is not to be confused with an archive or object library (.a file), which is just a format for archiving / packaging multiple .o files as one. A merged object is the result of a link that produces yet another .o module, but in the resulting module, all sections that could be merged have been, and all public / external pairs that could be resolved have been resolved inside those sections.
So I assume that you end up with your .ko file containing all your "local" extern definitions:
Those that are extern because they are used to call across the .o modules in your .ko (but are not needed anymore since they are not supposed to be called from outside the .ko), and
those that the .ko module DOES need to properly communicate with the loader and kernel.
The former have most likely already been resolved by ld during the merge, but ld has no way to know whether you intend to have them also callable from outside the .ko.
So the extraneous symbols you see are those that are extern for each of your .o files, but are not needed as extern for the resulting .ko.
And what you are looking for is a way to strip only those.
Does this last paragraph properly describe the symbols you want to get rid of?
I think this is exactly what we are talking about here.
OK, then it looks like one solution is to "manually" remove the extraneous symbols. The "strip" utility seems to allow individually stripping (or keeping) of symbols, so you would have to use one --strip-all and a small bunch of --keep-symbol= . Note that --wildcard might help a bit, too. You can do the opposite, of course, keep all and individually strip, depending on what's the most convenient.
A good start could be to remove all the symbols that you explicitly defined in your module for cross-module linking and don't want to appear - just leaving the obvious useful ones, things like init and exit. And to not touch those that have been generated by / belong to the kernel dev software infrastructure. Then trial and error until you find the right recipe... In fact, I would think that about all your own symbols might be removable, apart from those you explicitly defined yourself as EXPORT_SYMBOL (and init / exit, of course).
Good luck! :)
PS:
In fact, it seems that the required source information exists in all .ko projects to perform the required stripping automatically: Unless I'm missing something, it seems that anything that's not EXPORT_SYMBOL or explicitly inserted by the build software could theoretically be stripped by default at the end of "ld -r" time that ends a .ko build. It's just that I don't think the toolchain (compiler / linker) have provision / directives / options to individually designate "strip or keep" syms for the relocatable link / merge. Otherwise, some modifications in the EXPORT_SYMBOL macro and in a few other places could probably achieve the result you're after, and shave some bytes from most .ko files in any Linux system.
I just built a kernel without realizing the kernel config had debug symbols enabled, so the resulting modules were quite large. This worked for me:
# du -sh /lib/modules/3.1.0/
1.9G /lib/modules/3.1.0/
# find /lib/modules/3.1.0/ -iname "*.ko" -exec strip --strip-debug {} \;
# du -sh /lib/modules/3.1.0/
134M /lib/modules/3.1.0/
Find all files in /lib/modules/3.1.0 named *.ko and execute strip --strip-debug on each of them.
I'm not sure I understand what the problem really is:
When developing a .ko, if I don't explicitly add something like
ccflags-y += -ggdb -O0 -Wall
into my Makefile, I don't get any symbol but for those that I publish or external ref myself. I'm sure I don't get any other symbols for several good reasons:
the resulting .ko file is considerably smaller,
dumping the file and analyzing the ELF shows the tables are not there,
I can't see nor access the symbols in kgdb.
So I'm a little puzzled at your question, actually?... What are those symbols you do see in your .ko (and don't want to)?
How are they declared in your source file?
In which ELF sections do they end up?
And (sorry, dumb question ahead): Did you define static all things that didn't need to be seen outside of their own module?
In addition to filofel's post:
The reason stripping userspace shared libraries keeps them functioning is because their exported symbols are in the .dynsym section which is never stripped. .ko files however do not use dynsym.
People have reported success with
strip --strip-unneeded
strip -g XXX
A similar problem of mine was solved by this command on an embedded device running Linux kernel 3.0.8.

Resources