Easy to read Golang assembly output? - go

I'm interested in examining the x86 assembly output of the standard Go compiler to see if my code is really being converted into reasonably efficient assembly code; hopefully, by profiling and examining the assembly output, I could get a clue as to where/how I should rewrite my Go code for maximum performance. But when I examine the code using the -S flag, Go spits out a mess! I'd like two things:
Is there a way to make the Go compiler dump the assembly output into a file, not just print it out on Terminal?
Also, is there a way to make the Go compiler separate out the assembly code into separate functions, with labels? I know some functions may be inlined and hence not appear in the assembly code. What I'm seeing now is just a homogeneous blob of assembly which is almost impossible to understand.

You can redirect the output to a file like this:
go tool compile -S file.go > file.s
You can disable the optimization with -N:
go tool compile -S -N file.go
Alternatively, you can use gccgo:
gccgo -S -O0 -masm=intel test.go
which will generate test.s. You can play with the -O0/1/2/3 to see the different optimizations.
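Since the question mentions that inlined functions may not show up in the listing at all, here is a minimal sketch of a test file that is easy to find in the output. The function name sum is made up for illustration; //go:noinline is a standard compiler directive that asks the gc compiler not to inline the function, so it should keep its own symbol (main.sum) in the assembly.

```go
package main

import "fmt"

// sum is a hypothetical example function. The directive below asks the gc
// compiler not to inline it, so it remains a separate symbol (main.sum)
// that can be located in the assembly listing.
//
//go:noinline
func sum(xs []int) int {
	total := 0
	for _, x := range xs {
		total += x
	}
	return total
}

func main() {
	fmt.Println(sum([]int{1, 2, 3}))
}
```

Saving this as file.go and running the go tool compile -S command shown above should then include a section for main.sum that you can search for.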

I don't recommend using the output of -S as the Go linker can change what gets written to the object code quite a lot. It does give you some idea as to what is going on.
The Go assembler output is rather non-standard too.
When I want to do this I always use objdump, which will give you a nice standard assembler output.
E.g. for x86 / amd64:
objdump -d executable > disassembly
And for ARM (to get the register names to be the same as Go uses)
objdump -M reg-names-raw -d executable > disassembly

Run go tool objdump on the resulting executable file.
To restrict the output to interesting functions, use its -s option.
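If you already have a complete dump saved to a file, a similar per-function view can be produced after the fact. The following is only a rough sketch: it assumes each function in the listing starts with a "TEXT <symbol>(SB) ..." header line, as in the go tool objdump output quoted elsewhere on this page, and the helper name splitByFunction is made up.

```go
package main

import (
	"fmt"
	"strings"
)

// splitByFunction groups the lines of an objdump-style listing by the symbol
// named on each "TEXT <symbol>(SB) ..." header line.
// Assumption: every function begins with such a header; lines that appear
// before the first header are ignored.
func splitByFunction(listing string) map[string][]string {
	funcs := make(map[string][]string)
	current := ""
	for _, line := range strings.Split(listing, "\n") {
		trimmed := strings.TrimSpace(line)
		if strings.HasPrefix(trimmed, "TEXT ") {
			// fields[1] is the symbol, e.g. "main.init(SB)".
			fields := strings.Fields(trimmed)
			current = strings.TrimSuffix(fields[1], "(SB)")
			continue
		}
		if current != "" && trimmed != "" {
			funcs[current] = append(funcs[current], trimmed)
		}
	}
	return funcs
}

func main() {
	// Made-up sample input in the same shape as the dumps on this page.
	listing := `TEXT main.init(SB) /src/api/spikes/mongoapi.go
  mongoapi.go:60 0x12768c0 65488b0c25a0080000 GS MOVQ GS:0x8a0, CX
  mongoapi.go:60 0x12768cd 7663 JBE 0x1276932

TEXT main.main(SB) /src/api/spikes/mongoapi.go
  mongoapi.go:70 0x1276940 c3 RET
`
	for name, lines := range splitByFunction(listing) {
		fmt.Printf("%s: %d instructions\n", name, len(lines))
	}
}
```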

To dump the output to file:
go tool objdump EXECUTABLE_FILE > ASSEMBLY_FILE
If you want to include the source Go code (assuming you have a working golang setup, and you built the executable yourself):
go tool objdump -S EXECUTABLE_FILE
To make the output even easier to look at I use a small hacky wrapper that produces the following (in a nutshell, it colorizes instructions that alter the control flow -blue for jumps, green for call/return, red for traps, violet for padding- and adds new lines after unconditional control flow jumps):
If you use the wrapper above, you will likely want to pass the -R switch when piping to less (or add it to the environment, e.g. in .bashrc: export LESS="$LESS -R"):
go-objdump EXECUTABLE_FILE | less -R
Alternatively, there is godbolt.org that has probably the most readable output and allows you to switch between compilers (gc, gccgo) and versions very easily.
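The wrapper itself is not reproduced here, so the following is only a guess at the general idea, not the author's script: a small filter that reads go tool objdump output on stdin, colors jumps blue and call/return green with ANSI escapes, and prints a blank line after unconditional transfers (JMP, RET). It assumes the upper-case amd64 mnemonics that go tool objdump prints, and it leaves out the trap and padding colors.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

const (
	blue  = "\x1b[34m"
	green = "\x1b[32m"
	reset = "\x1b[0m"
)

// decorate wraps a listing line in an ANSI color when it contains a
// control-flow mnemonic and reports whether a blank line should follow it.
// Assumption: mnemonics are upper-case amd64 names (JMP, RET, CALL, Jcc).
func decorate(line string) (out string, blankAfter bool) {
	if strings.HasPrefix(strings.TrimSpace(line), "TEXT ") {
		return line, false // function header, not an instruction
	}
	for _, f := range strings.Fields(line) {
		switch {
		case f == "JMP":
			return blue + line + reset, true
		case f == "RET":
			return green + line + reset, true
		case f == "CALL":
			return green + line + reset, false
		case len(f) > 1 && f[0] == 'J' && f == strings.ToUpper(f):
			// Conditional jumps such as JBE or JNE.
			return blue + line + reset, false
		}
	}
	return line, false
}

func main() {
	scanner := bufio.NewScanner(os.Stdin)
	for scanner.Scan() {
		out, blank := decorate(scanner.Text())
		fmt.Println(out)
		if blank {
			fmt.Println()
		}
	}
}
```

Saved under a hypothetical name such as colorize.go, it could be used as: go tool objdump EXECUTABLE_FILE | go run colorize.go | less -R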

I had problems with the other answers: the assembly they produced contained much more information than I wanted, yet still not enough detail. Let me explain: it included the assembly for all the libraries Go imports internally, and it did not show which source lines my code came from (my code was all at the bottom of the file).
Here is what I found from the official docs:
$ GOOS=linux GOARCH=amd64 go tool compile -S x.go # or: go build -gcflags -S x.go
File:
package main
func main() {
println(3)
}
Produces:
--- prog list "main" ---
0000 (x.go:3) TEXT main+0(SB),$8-0
0001 (x.go:3) FUNCDATA $0,gcargs·0+0(SB)
0002 (x.go:3) FUNCDATA $1,gclocals·0+0(SB)
0003 (x.go:4) MOVQ $3,(SP)
0004 (x.go:4) PCDATA $0,$8
0005 (x.go:4) CALL ,runtime.printint+0(SB)
0006 (x.go:4) PCDATA $0,$-1
0007 (x.go:4) PCDATA $0,$0
0008 (x.go:4) CALL ,runtime.printnl+0(SB)
0009 (x.go:4) PCDATA $0,$-1
0010 (x.go:5) RET ,
So what I did was basically:
go tool compile -S hello.go > hello.s
and it got the result I wanted!

A recent alternative would be loov/lensm, which can view assembly and source.
(From Egon Elbre)
To run the program, provide a regular expression filter for the symbol you want to inspect.
-watch automatically reloads the executable and its information when it changes.
lensm -watch -filter Fibonacci lensm
Note: The program requires a binary that is built on your computer, otherwise the source code for the functions cannot be loaded.
That could be a nice addition to godbolt.org

The easiest way I have found for Mac users with the Xcode developer tools is otool:
$ otool -tV <executable>

Related

How can I convert only one file or one function of an elf file to assembly?

I have an ELF file of a very large code base (a kernel) and want to convert it to assembly code. I have the base address of a function and the offset of the instruction, and from these I want to get the specific instruction. I used "objdump -b binary -m i386 -D file.elf" to disassemble the ELF file, but it generates 4 GB of output. I also tried the approach in "Can I give objdump an address and have it disassemble the containing function?", but it does not work for me.
You can limit objdump output with --start-address and --stop-address options.
To disassemble only a single function, take the values for these options from the readelf -s output, which contains the start address of the function within its section and the function's size, and from the readelf -S output, which contains the address of the section that holds the function:
--start-address=<section_start + function_start>
--stop-address=<section_start + function_start + function_size>
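As a worked example of the arithmetic above, here is a small sketch that turns the three numbers into the two objdump options. The helper name objdumpRange and the sample values are made up. Note also that in a linked executable the symbol value reported by readelf -s is usually already a virtual address, in which case the section start to add would be 0.

```go
package main

import "fmt"

// objdumpRange builds the --start-address/--stop-address options from the
// section address (readelf -S) and the symbol's value and size (readelf -s).
// Assumption: funcStart is an offset within the section; pass 0 for
// sectionStart if the symbol value is already an absolute address.
func objdumpRange(sectionStart, funcStart, funcSize uint64) (start, stop string) {
	begin := sectionStart + funcStart
	return fmt.Sprintf("--start-address=%#x", begin),
		fmt.Sprintf("--stop-address=%#x", begin+funcSize)
}

func main() {
	// Hypothetical values: section at 0x400000, function at offset 0x1890,
	// function size 0x365 bytes.
	start, stop := objdumpRange(0x400000, 0x1890, 0x365)
	fmt.Println("objdump -d", start, stop, "file.elf")
	// prints: objdump -d --start-address=0x401890 --stop-address=0x401bf5 file.elf
}
```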
I want to convert it to assembly code.
gdb -q ./elf_file
(gdb) set height 0 # prevent pagination
(gdb) set logging on # output will be mirrored in gdb.txt
(gdb) disassemble 0xffff000008081890 0xffff000008081bf5
(gdb) quit
Enjoy!

Remove file paths from TEXT directives in go binaries

I want to remove all path information like /Users/myuser/dev/go/src/fooapi/spikes/mongoapi.go from the executable that I created with go build.
I'm compiling the code like this:
CGO_ENABLED=0 go build -v -a -ldflags="-w -s" -o ./fooapi spikes/mongoapi.go
Some part of the example assembly from the go build command above:
$ go tool objdump ./fooapi
.
.
TEXT main.init(SB) /Users/myuser/dev/go/src/api/spikes/mongoapi.go
mongoapi.go:60 0x12768c0 65488b0c25a0080000 GS MOVQ GS:0x8a0, CX
mongoapi.go:60 0x12768c9 483b6110 CMPQ 0x10(CX), SP
mongoapi.go:60 0x12768cd 7663 JBE 0x1276932
.
.
Note: please do not suggest strip as a solution; it is not recommended and can lead to broken executables.
Use -trimpath flags to remove path information:
CGO_ENABLED=0 go build -v -a -ldflags="-w -s" \
-gcflags=-trimpath=/Users/myuser/dev/go/src \
-asmflags=-trimpath=/Users/myuser/dev/go/src \
-o ./fooapi spikes/mongoapi.go
More Information:
Passing -trimpath to -gcflags and -asmflags will remove any path information from the elf binary.
$ go tool asm -help 2>&1 | grep -A1 trimpath
-trimpath string
remove prefix from recorded source file paths
$ go tool compile -help|grep -A1 trimpath
-trimpath string
remove prefix from recorded source file paths
You can check the result with go tool objdump:
$ go tool objdump ./fooapi
.
.
TEXT main.init(SB) api/spikes/mongoapi.go
mongoapi.go:60 0x12768c0 65488b0c25a0080000 GS MOVQ GS:0x8a0, CX
mongoapi.go:60 0x12768c9 483b6110 CMPQ 0x10(CX), SP
mongoapi.go:60 0x12768cd 7663 JBE 0x1276932
.
.
Using the strip tool is still somewhat controversial in the Go community, although the underlying problems are said to have been fixed. Some report that unknown and unpredictable bugs still occur occasionally. Read here and here for examples.
trimpath is a good approach, but had issues like go issue 24976
It appears that, when multiple -trimpath flags are passed to go tool compile, the last one wins
Indeed; from what I can tell the trimpath flag is defined as an ordinary string flag, not a list.
But with CL 173344, this is now fixed (for the upcoming Go 1.13)
cmd/internal/objabi: expand -trimpath syntax
This CL affects the low-level -trimpath flag provided
by both cmd/asm and cmd/compile.
Previously, the flag took the name of a single directory that would be trimmed
from recorded paths in the resulting object file.
This CL makes the flag take a semicolon-separated list of paths.
Further, each path can now end in an optional "=>replacement"
to specify what to replace that leading path prefix with,
instead of only dropping it.
A followup CL will add a mode to cmd/go that uses this
richer -trimpath to build binaries that do not contain any
local path names.
This is CL 173345:
cmd/go: add -trimpath build flag
"go build -trimpath" trims the recorded file paths in the resulting packages and executables to avoid recording the names of any local directories.
Instead, the files appear to be stored in directories named either "go/src/..." (for the standard library) or named after the module or package in which the files appear.
This fixes issue 16860, which is about Go's ability to generate bit-for-bit identical binaries, as noted by Ivan Daniluk.
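To make the expanded low-level -trimpath syntax quoted above more concrete (a semicolon-separated list of prefixes, each optionally followed by "=>replacement"), here is an illustrative sketch of how such a rule list could be applied to a recorded path. It is not the code from cmd/internal/objabi; the function name applyTrimpath and the "first matching rule wins" behaviour are assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// applyTrimpath rewrites path using rules of the form
// "prefix" or "prefix=>replacement", separated by semicolons.
// Illustrative only; assumes the first matching rule wins.
func applyTrimpath(path, rules string) string {
	for _, rule := range strings.Split(rules, ";") {
		parts := strings.SplitN(rule, "=>", 2)
		prefix := strings.TrimSuffix(parts[0], "/")
		if prefix == "" || !strings.HasPrefix(path, prefix+"/") {
			continue
		}
		rest := strings.TrimPrefix(path, prefix+"/")
		if len(parts) == 2 && parts[1] != "" {
			// Replace the leading prefix instead of only dropping it.
			return parts[1] + "/" + rest
		}
		return rest
	}
	return path
}

func main() {
	p := "/Users/myuser/dev/go/src/api/spikes/mongoapi.go"
	fmt.Println(applyTrimpath(p, "/Users/myuser/dev/go/src"))
	fmt.Println(applyTrimpath(p, "/tmp;/Users/myuser/dev/go/src=>example.com"))
}
```

With the first rule list the path is shortened to api/spikes/mongoapi.go, which matches the objdump output shown earlier; the second shows a prefix being replaced by a made-up module-like name.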

Reason for huge size of compiled executable of Go

I compiled a hello world Go program, which generated a native executable on my Linux machine. But I was surprised by the size of this simple program: it was 1.9 MB!
Why is it that the executable of such a simple program in Go is so huge?
This exact question appears in the official FAQ: Why is my trivial program such a large binary?
Quoting the answer:
The linkers in the gc tool chain (5l, 6l, and 8l) do static linking. All Go binaries therefore include the Go run-time, along with the run-time type information necessary to support dynamic type checks, reflection, and even panic-time stack traces.
A simple C "hello, world" program compiled and linked statically using gcc on Linux is around 750 kB, including an implementation of printf. An equivalent Go program using fmt.Printf is around 1.9 MB, but that includes more powerful run-time support and type information.
So the native executable of your Hello World is 1.9 MB because it contains a runtime which provides garbage collection, reflection and many other features (which your program might not really use, but it's there). And the implementation of the fmt package which you used to print the "Hello World" text (plus its dependencies).
Now try the following: add another fmt.Println("Hello World! Again") line to your program and compile it again. The result will not be 2x 1.9MB, but still just 1.9 MB! Yes, because all the used libraries (fmt and its dependencies) and the runtime are already added to the executable (and so just a few more bytes will be added to print the 2nd text which you just added).
Consider the following program:
package main
import "fmt"
func main() {
fmt.Println("Hello World!")
}
If I build this on my Linux AMD64 machine (Go 1.9), like this:
$ go build
$ ls -la helloworld
-rwxr-xr-x 1 janf group 2029206 Sep 11 16:58 helloworld
I get a binary that is about 2 MB in size.
The reason for this (which has been explained in other answers) is that we are using the "fmt" package, which is quite large; in addition, the binary has not been stripped, which means the symbol table is still there. If we instead instruct the linker to strip the binary, it becomes much smaller:
$ go build -ldflags "-s -w"
$ ls -la helloworld
-rwxr-xr-x 1 janf group 1323616 Sep 11 17:01 helloworld
However, if we rewrite the program to use the builtin function print, instead of fmt.Println, like this:
package main
func main() {
print("Hello World!\n")
}
And then compile it:
$ go build -ldflags "-s -w"
$ ls -la helloworld
-rwxr-xr-x 1 janf group 714176 Sep 11 17:06 helloworld
We end up with an even smaller binary. This is as small as we can get it without resorting to tricks like UPX-packing, so the overhead of the Go runtime is roughly 700 kB.
Note that the binary size issue is tracked by issue 6853 in the golang/go project.
For instance, commit a26c01a (for Go 1.4) cut hello world by 70kB:
because we don't write those names into the symbol table.
Considering the compiler, assembler, linker, and runtime for 1.5 will be
entirely in Go, you can expect further optimization.
Update 2016 Go 1.7: this has been optimized: see "Smaller Go 1.7 binaries".
But these days (April 2019), what takes the most space is runtime.pclntab.
See "Why are my Go executable files so large? Size visualization of Go executables using D3" from Raphael ‘kena’ Poss.
It is not too well documented; however, this comment from the Go source code suggests its purpose:
// A LineTable is a data structure mapping program counters to line numbers.
The purpose of this data structure is to enable the Go runtime system to produce descriptive stack traces upon a crash or upon internal requests via the runtime.GetStack API.
So it seems useful. But why is it so large?
The URL https://golang.org/s/go12symtab hidden in the aforelinked source file redirects to a document that explains what happened between Go 1.0 and 1.2. To paraphrase:
prior to 1.2, the Go linker was emitting a compressed line table, and the program would decompress it upon initialization at run-time.
in Go 1.2, a decision was made to pre-expand the line table in the executable file into its final format suitable for direct use at run-time, without an additional decompression step.
In other words, the Go team decided to make executable files larger to save up on initialization time.
Also, looking at the data structure, it appears that its overall size in compiled binaries is super-linear in the number of functions in the program, in addition to how large each function is.
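To see what this table is used for in practice, here is a minimal sketch that resolves a program counter back to a function name, file and line at run time. runtime.Caller and runtime.FuncForPC are standard library functions; the helper name whereAmI is made up. Lookups like these are the kind of PC-to-line mapping the quoted comment describes.

```go
package main

import (
	"fmt"
	"runtime"
)

// whereAmI resolves its own program counter to a function name, file and
// line number. The runtime needs a PC-to-line table in the binary to answer
// this, which is the role the quoted comment attributes to the line table.
func whereAmI() (string, string, int) {
	pc, file, line, ok := runtime.Caller(0)
	if !ok {
		return "", "", 0
	}
	fn := runtime.FuncForPC(pc)
	if fn == nil {
		return "", file, line
	}
	return fn.Name(), file, line
}

func main() {
	name, file, line := whereAmI()
	fmt.Printf("%s at %s:%d\n", name, file, line)
}
```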

"Illegal instruction" on basic assembly program - not even hello world - why is linking needed?

I just figured this out, but instead of splitting my new question ("why?") into another question, I think it's best if the solution to this problem and an explanation are kept on the same page.
I'm writing a basic assembly program to just start and immediately quit using the kernel interrupt at int 0x80. My current code is simply as follows:
/* Simple exit via kern-interrupt */
.globl start
start:
pushl $0x0
movl $0x1, %eax
subl $4, %esp
int $0x80
assembled with
as -arch i386 <file>.s
upon executing I get a one-line error:
Illegal instruction
It's bizarre: even commenting everything out still results in Illegal instruction, despite there being no instructions at all. Am I missing a linking step, despite there being no other files to link to? (Yes, I am.)
EDIT: Allow me to rephrase my question, why do you need to link when there is no library or anything to link to?
You do need to link it to create an executable. By default, as just gives you an object file, which is something you can link into an executable (either with other object files or on its own) but is not itself a valid executable. Try:
as -arch i386 -o file.o file.s
ld -o file file.o
In answer to your question:
Why do you need to link when there is no library or anything to link to?
Because the assembler doesn't know that you're not going to link with something else.
Unlike gcc, which assumes you want a program unless told otherwise (with the -c option), as gives you an object file by default. From the manpage:
"as" is primarily intended to assemble the output of the GNU C compiler "gcc" for use by the linker "ld"
If you want a one-step command, you can create a script such as asld:
as -arch i386 -o $1.o $1.s
ld -o $1 $1.o
and then just use asld file.
Or, you could set up makefiles to do all the heavy lifting for you.
You could make the same argument about a C program: I am not using any libraries, so why do I have to link?
Because that is how the toolchain was designed. One set of tools takes you from source code (in any of many languages) to object files, which are incomplete most of the time. The link stage is required even if, as paxdiablo shows, it only takes your single object file and makes it an executable. If nothing else, your .text address is (usually) needed, and that comes from the linker stage.
It makes a lot of sense to do it this way. The link stage is complicated enough as it is, so make one tool that does that job and is good at it, do your system engineering, and define an interface to that tool. The language tools have a complicated job of their own; have them do just that job, with the output being an object file, which is as far as they can resolve things without having to become a linker.
If you do not wish to use this toolchain, you could use nasm or something similar, where you can go directly from assembly to binary in one command-line step.

Unrolling gcc compiler optimization

I am interested in seeing the code where gcc has actually optimized the code. Is there a way I could do?
I have gone through a few other similar questions and have tried the following:
-Wa,-ahl=filename.lst :- this option is really good; you can browse the code and the corresponding machine code, but it is not good when I enable the O3 option.
Dumping optimized tree :- I am sure gcc is giving me a good amount of debug information, but I do not know how to decipher it. I will be glad if someone could point me to any available information.
Is there any other better way, to find out what part of the code gcc optimized?
Thanks,
Madhur
You can compile the code twice, first with:
$ gcc -O0 -S -o yourfile_o0.s yourfile.c
Then with:
$ gcc -O3 -S -o yourfile_o3.s yourfile.c
Then you can diff the two resulting assembly files:
$ diff -u yourfile_o0.s yourfile_o3.s
$ vim -d yourfile_o0.s yourfile_o3.s
$ emacs --eval '(ediff "yourfile_o0.s" "yourfile_o3.s")'
Look at the assembler code or decompile your compiled application. C decompilers produce ugly C code, but for analyzing which code was generated, it has to suffice.

Resources