How does GC work without a separate runtime or VM? - go

My understanding is that executables of applications written in Go can stand alone, without Go needing to be installed on the machine.
My understanding is also that GC (Garbage Collection) is normally handled by a VM. In this case, if the application is running independently without such a runtime, how is GC handled?
Help on this, and pointers to the relevant documentation, would be appreciated.

my understanding is that the GC (Garbage Collection) is handled by a VM.
In a typical programming language that features GC and runs on a VM,
(the compiled form of) a program written in that
language is literally managed by the VM: the VM runs the program's code and intervenes periodically to perform the GC tasks.
The crucial point is that each program running in such a VM
may consider its VM as a part of its execution environment.
Another crucial point is that such a VM represents the so-called
runtime system
for the so-called execution model of that programming language.
In this case, if the application is running independently without such a runtime how is GC handled?
Quite similar to the VM case.
Each Go program compiled by the stock toolchain (which can be downloaded from the official site)
contains the Go runtime linked with the program itself.
Each compiled Go program is created in such a way that, when it runs, its entry point executes the runtime first.
The runtime is responsible for initializing itself, then the program; once this is finished, execution is transferred to the program's main().
Among other things, the initialized Go runtime continuously
runs one or more pieces of code of its own, which includes the
goroutine scheduler and the GC (they are tightly coupled FWIW).
As you can see, the difference from the VM case is that there
the runtime is "external" to the running program, while in the
(typical) case of Go programs it runs "alongside" the program, in the same process.
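To make the "runtime is linked into the program" point concrete, here is a minimal sketch that asks the in-process runtime about its own GC activity. The helper name `churn` and the package-level `sink` variable are made up for this illustration; `runtime.ReadMemStats`, `runtime.GC` and `runtime.Version` are part of the standard `runtime` package.

```go
package main

import (
	"fmt"
	"runtime"
)

// sink keeps the allocations below from being optimized away.
var sink []byte

// churn allocates some garbage, forces a collection, and returns how many
// GC cycles the linked-in runtime completed in the meantime.
func churn() uint32 {
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)

	for i := 0; i < 100; i++ {
		sink = make([]byte, 1<<20) // 1 MiB each; the previous slice becomes garbage
	}
	runtime.GC() // ask the runtime for a collection right now

	runtime.ReadMemStats(&after)
	return after.NumGC - before.NumGC
}

func main() {
	// No VM or separate process is involved: these calls go straight
	// into runtime code that lives inside this very executable.
	fmt.Println("runtime version:", runtime.Version())
	fmt.Println("GC cycles during churn:", churn())
}
```

Building this with `go build` and copying the resulting binary to a machine without Go installed should still work, since the collector travels inside the executable.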
Nothing in the Go language specification mandates the
precise way the runtime must be made available to the
running program.
For instance, since Go 1.11, programs can be compiled to
WASM, and the runtime is then provided partially
by the linked-in code of the Go runtime
and partially by the WASM host (typically a browser).
As another example, GCC
features a Go frontend and, unlike the "stock"
Go toolchain, on those platforms where it's possible
it supports building Go programs so that their compiled forms
dynamically link against a shared library containing most
of the Go runtime code (and the code of the standard library).
In this case, a compiled Go program does not contain the runtime
code but it gets linked in when the program is being loaded
and then it also works in-process with the program itself.
It's perfectly possible to implement an execution model for
Go programs which would use a VM.

Go binaries have no external dependencies by default, because everything they need (including the runtime) is compiled in. Unlike a C/C++ binary, which typically relies on dynamic linking, a Go binary does not require this by default.
https://golang.org/doc/faq#runtime
This allows you to copy (scp, rsync, etc.) your binary to any machine with the same operating system and architecture. For example, a binary compiled on Ubuntu can be copied to any other Ubuntu machine of the same architecture. To run it on macOS you would have to cross-compile the binary for that target, but you can do that build from any operating system.
https://golang.org/doc/install/source#environment

Related

Difference between compiling code and executing code

So I write a program in some language, LanguageX, using a simple text pad.
Then I put that text into a Compiler. The compiler outputs machine code (or assembly which is then compiled into machine code)
My question is, who actually executes the compiled program.
Does the compiler execute it? Or do I need another "executor app" execute it?
Or does the hardware execute the program directly? But who orders the hardware to do that?
I'm confused because the concepts of compiling a program, and executing a program, seem to be used interchangeably.
An example is HTML. I can write html code in a text file and save it as .html, open it with Firefox, and it will run. Is Firefox a compiler, an executor, both, neither?
Another example is a commercial app I buy and install. Whenever I click on the .exe, is the app compiled or executed? Both?
A program is data that explains how to execute things. You can read the program yourself and have a sense of what it should be doing, or give it to another program that can execute it. When a program is directly executed from its source, it is said to be "interpreted".
For example, your browser is interpreting HTML to render a page. When there is Javascript associated with a page, this is loaded and executed by a Javascript interpreter that has access to your page and its elements. The Javascript interpreter is part of your browser program, and your browser is a program that is executed by a processor.
A compiler is a program that transforms the source code into another language, typically instructions that can be decoded by your CPU, but possibly bytecode that is not directly executable by a processor and is instead run by a virtual machine (another program that knows how to interpret the bytecode).
For some languages the compilation phase also involves a step called linking, but the resulting file is basically a bit of metadata and a sequence of instructions your processor can understand.
In order to execute a program, you ask (through a shell or the graphical interface) your operating system to load the program: the kernel allocates resources for your process and puts the code in a portion of memory that is flagged as executable (there are a lot more details to it).
The kernel keeps track of processes and executes them on one or more processors. The code is fed directly to a processor, which can decode the instructions resulting from compilation. Periodically, a process is interrupted by the kernel to let other processes run (when the process is waiting for something, or due to something called an "interrupt"). When you have multiple processors, multiple programs can execute truly in parallel.
See for example Linux Kernel Teaching for a lot more details.

At what point does a program become a process virtual machine?

At what point does a program whose purpose is to be a runtime become a (process) virtual machine? What qualifies a program to be called a virtual machine in contrast to a humble runtime? Trying to read about real-world software does not clarify the distinction.
I am not sure I understand your notion of "runtime". Typically, this word is used to highlight that something happens while a program is running, not before (e.g. at compile time) or after (e.g. when it has crashed and been closed). A virtual machine is a program that interprets its own data as another program, written in a certain language, to be executed.
Both programs compiled into a native machine language or to some sort of virtual machine language may need a runtime component to execute. Examples:
A program compiled from C++ into machine code needs system libraries that implement standard operations, such as math libraries linked dynamically to it, as well as operating system services, such as file and network input/output.
A Java program compiled into bytecode needs the JVM to interpret it, as well as services such as memory allocation, garbage collection, thread scheduling etc. from it.
Neither libstdc++ nor the JVM is present in a program's binary code; they are attached at its run time, hence the name.
At what point does a program whose purpose is to be a runtime become a (process) virtual machine?
Any program meant for execution is a runtime. If it is running, that is. If it is only being stored on a disk, it is not at its run time (rather, "wait time" or "non-existence time"). If such a program is written to execute other programs inside itself, it can be considered some sort of a virtual machine.
What qualifies a program to be called a virtual machine in contrast to a humble runtime?
The word "runtime" is very vague; you should qualify it further, e.g. "runtime library", "runtime analysis", "runtime support" etc. The phrase "virtual machine" is more specific: a "hello world" is typically not a VM, neither is a program to solve a system of linear equations; both of them execute a static algorithm. An interpreter of e.g. Python language is a VM, because what it does is largely defined by the data (another program) it processes, not by the algorithm of the interpreter itself.
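The "behavior is defined by the data it processes" criterion can be sketched with a toy stack machine. Everything here (the `Op` codes, `Instr`, and `run`) is invented for illustration and is not modeled on any real VM's instruction set.

```go
package main

import (
	"errors"
	"fmt"
)

// Op is an opcode of the toy machine.
type Op byte

const (
	Push Op = iota // push Arg onto the stack
	Add            // pop two values, push their sum
	Mul            // pop two values, push their product
)

// Instr is one instruction: an opcode plus an optional argument.
type Instr struct {
	Op  Op
	Arg int
}

// run interprets prog. What it computes depends entirely on the
// instructions it is handed, not on a fixed algorithm of its own.
func run(prog []Instr) (int, error) {
	var stack []int
	for _, in := range prog {
		switch in.Op {
		case Push:
			stack = append(stack, in.Arg)
		case Add, Mul:
			if len(stack) < 2 {
				return 0, errors.New("stack underflow")
			}
			a, b := stack[len(stack)-2], stack[len(stack)-1]
			stack = stack[:len(stack)-2]
			if in.Op == Add {
				stack = append(stack, a+b)
			} else {
				stack = append(stack, a*b)
			}
		default:
			return 0, fmt.Errorf("unknown opcode %d", in.Op)
		}
	}
	if len(stack) != 1 {
		return 0, errors.New("program must leave exactly one value on the stack")
	}
	return stack[0], nil
}

func main() {
	// (2 + 3) * 4, expressed as data rather than as Go code.
	prog := []Instr{{Push, 2}, {Push, 3}, {Add, 0}, {Push, 4}, {Mul, 0}}
	fmt.Println(run(prog))
}
```

By contrast, a "hello world" or a linear-equation solver has no such inner program: its control flow is fixed when it is compiled.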

How to parallelize the "make" command so that it distributes tasks across multiple machines

I have been compiling C/C++ code that takes 1.5 hours to build on a 4-core machine using the "make" command. I also have 10 more machines that I can use for compiling. I know the "-j" option of "make", which runs the specified number of compilation jobs in parallel, but "-j" only distributes work on the current machine, not across the other 10 machines connected over the network.
We could use MPI or another parallel programming technique, but then we would need to rewrite the "make" implementation accordingly.
Is there any other way to make use of the other available machines for compilation?
Thanks
Yes, there is: distcc.
distcc is a program to distribute compilation of C or C++ code across
several machines on a network. distcc should always generate the same
results as a local compile, is simple to install and use, and is often
two or more times faster than a local compile.
Unlike other distributed build systems, distcc does not require all
machines to share a filesystem, have synchronized clocks, or to have
the same libraries or header files installed. Machines can be running
different operating systems, as long as they have compatible binary
formats or cross-compilers.
By default, distcc sends the complete preprocessed source code across
the network for each job, so all it requires of the volunteer machines
is that they be running the distccd daemon, and that they have an
appropriate compiler installed.
The key is that you still keep your single make, but distcc prepares the files locally (running the preprocessor, resolving headers, ...) and then arranges for the compilation to object code to happen over the network.
I have used it in the past, and it is pretty easy to set up -- and helps in exactly your situation.
https://github.com/icecc/icecream
Icecream was created by SUSE based on distcc. Like distcc, Icecream takes compile jobs from a build and distributes them among remote machines, allowing a parallel build. But unlike distcc, Icecream uses a central server that dynamically schedules the compile jobs to the fastest free server. This advantage pays off mostly for shared computers; if you're the only user on x machines, you have full control over them anyway.

How does gdb set software breakpoints in shared library functions?

I know that software breakpoints in an executable file can work by replacing an assembler instruction at the desired place with another one that causes an interrupt. The debugger can then stop execution exactly at this place, replace this instruction with the original one, and ask the user what to do next, run some commands, etc.
But the code of such an executable file is not used by other programs and has only one copy in memory. How can software breakpoints work with shared libraries? For instance, how do software breakpoints work if I set one at some internal function of the C library (as I understand it, it has only one copy for all the applications, so we cannot just replace some instruction in it)? Are there any "software breakpoint" techniques for that purpose?
The answer for Linux is that the Linux kernel implements COW (Copy-on-Write): If the code of a shared library is written to, the kernel makes a private duplicate copy of the shared page first, remaps internally virtual memory just for that process to the copy, and allows the application to continue. This is completely invisible to userland applications and done entirely in the kernel.
Thus, until the first time a software breakpoint is put into the shared library, its code is indeed shared; afterwards, it is not. The process thereafter operates with a dirty but private copy.
This kernel magic is what allows the debugger to not cause every other application to suddenly stop.
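The same copy-on-write behavior can be observed from user space with a private file mapping. The sketch below is only an analogy, assuming a Linux (or other Unix-like) system: it is not how a debugger patches code (that goes through the kernel's debugging interface), and the helper name `cowDemo` is invented. It maps a file with `MAP_PRIVATE`, writes to the mapping, and checks that the file itself is untouched.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// cowDemo writes a small temp file, maps it privately, modifies the mapping,
// and reports the first byte as seen in memory and as still stored on disk.
func cowDemo() (inMemory, onDisk byte, err error) {
	f, err := os.CreateTemp("", "cow-demo-*")
	if err != nil {
		return 0, 0, err
	}
	defer os.Remove(f.Name())
	defer f.Close()

	const content = "original"
	if _, err := f.WriteString(content); err != nil {
		return 0, 0, err
	}

	// MAP_PRIVATE: writes go to a private copy of the page, never to the file.
	mem, err := syscall.Mmap(int(f.Fd()), 0, len(content),
		syscall.PROT_READ|syscall.PROT_WRITE, syscall.MAP_PRIVATE)
	if err != nil {
		return 0, 0, err
	}
	defer syscall.Munmap(mem)

	mem[0] = 'X' // first write: the kernel duplicates the page for this process

	data, err := os.ReadFile(f.Name())
	if err != nil {
		return 0, 0, err
	}
	return mem[0], data[0], nil
}

func main() {
	m, d, err := cowDemo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("in memory: %q, on disk: %q\n", m, d)
}
```

Shared library code is mapped privately in much the same way, which is why a breakpoint written into one process's copy of a page stays invisible to every other process.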
On OSes such as VxWorks, however, this is not possible. From personal experience, when I was implementing a GDB remote debug server for VxWorks, I had to forbid my users from ever single-stepping within semTake() and semGive() (the OS semaphore functions), since a) GDB uses software breakpoints in its source-level single-step implementation and b) VxWorks uses a semaphore to protect its breakpoints list...
The unpleasant consequence was an interrupt storm in which a breakpoint would cause an interrupt, and within this interrupt there would be another interrupt, and another and another in an unescapable chain resistant even to Ctrl-Z. The only way out was to power off the machine.

Windows system call issue in W2K8

I am having a problem with windows system function “EnumProcessModules()” that is defined in psapi.dll. In our component we use this function to retrieve modules in a specified process. This function is working well as long as we run the program on a 32-bit OS. However, this function fails when we run the program on a 64-bit OS (e.g. W2K8 R2). As you all know we are targeting Clay and Brick on W2K8 R2s. This is a known problem as per the following discussion in MSDN. One work around that was suggested in that thread is to compile the code as 64-bit. To us that is not an option, at least not yet. Do you have any suggestions? Any pointers/suggestions/ideas will be appreciated.
http://social.msdn.microsoft.com/forums/en-US/winserver2008appcompatabilityandcertification/thread/c7d7e3fe-f8e5-49c3-a16f-8e3dec5e8cf8/
If your existing code must continue being compiled as 32-bit, one possibility would be to create a small 64-bit executable that enumerates the modules of the target process via EnumProcessModulesEx. The 32-bit process could spawn the 64-bit process when necessary to do that work. Then use some kind of IPC to transfer the information back to the 32-bit process. Depending on what is needed, that part could be as low tech as writing a file to disk and reading it from the first process (or pipes, shared memory, sockets, etc.).
