As far as I know, in CPython, open() and read() (the API for reading a file) are written in C. The C code probably calls some C library that knows how to make a system call.
What about a language such as Go? Isn't Go itself now written in Go? Does Go call C libraries behind the scenes?
The short answer is "it depends".
Go compiles for multiple combinations of H/W and OS, and they all have different approaches to how syscalls are to be made when working with them.
For instance, Solaris does not provide a stable, supported set of syscalls, so there Go goes through the system's libc, just as the vendor requires.
Windows does support a rather stable set of syscalls but it is defined as a C API provided by a set of standard DLLs.
The functions exposed by those DLLs are mostly shims which use a single "make a syscall by number" function, but these numbers are not documented and are different between the kernel flavours and releases (perhaps, intentionally).
Linux does provide a stable and documented set of numbered syscalls and hence there Go just calls the kernel directly.
Now keep in mind that for Go to "call the kernel directly" means following the so-called ABI of the H/W and OS combo. For instance, on modern Linux on amd64, making a syscall requires filling a set of CPU registers with certain values, doing some other arrangements and then issuing the SYSCALL CPU instruction.
On Windows, you have to use its native calling convention (on 32-bit x86 that is stdcall, not cdecl).
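To make the Linux case concrete, here is a minimal sketch, assuming a Linux target. It relies on the standard library's syscall.Syscall, which on Linux loads the syscall number and arguments into registers and traps into the kernel without going through libc. The helper name rawGetpid is made up for this example.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// rawGetpid (a hypothetical helper) invokes the numbered getpid syscall
// directly. On Linux, syscall.Syscall is implemented in Go assembly and
// does not call into libc.
func rawGetpid() (int, error) {
	pid, _, errno := syscall.Syscall(syscall.SYS_GETPID, 0, 0, 0)
	if errno != 0 {
		return 0, errno
	}
	return int(pid), nil
}

func main() {
	pid, err := rawGetpid()
	if err != nil {
		fmt.Fprintln(os.Stderr, "getpid failed:", err)
		os.Exit(1)
	}
	// os.Getpid is the portable wrapper; both should report the same ID.
	fmt.Println("raw:", pid, "portable:", os.Getpid())
}
```

The same program would not build for Windows, where no stable syscall numbers are exposed and calls have to go through the system DLLs instead.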
Yes, Go is now written in Go. But you don't need C to make syscalls.
An important thing to call out is that syscalls aren't "written in C." You can make syscalls from C on Unix because of <unistd.h>. In particular, how Linux defines this header is a little convoluted, but you can see from this file the general idea. Syscalls are defined with a name and a number. When you call read, for example, what really happens behind the scenes is that the parameters are set up in the proper registers/memory (on x86-64, Linux expects the syscall number in rax; on 32-bit x86, in eax), followed by the syscall instruction, which traps into the kernel (32-bit x86 code traditionally used the int 0x80 software interrupt instead). The OS has already set up the entry point that handles this trap, and it goes about doing whatever is needed for that syscall. So, you don't need something written in C (or a standard library for that matter) to make syscalls. You just need to understand the call ABI and know the syscall numbers.
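As a rough illustration of "a name and a number", the sketch below issues write by its number from Go, with no C involved. It assumes a Unix-like target where syscall.SYS_WRITE and syscall.Syscall are available (Linux is the case I have in mind); rawWrite and roundTrip are names invented for the example.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"syscall"
	"unsafe"
)

// rawWrite (hypothetical) issues the write syscall by number: the file
// descriptor, buffer address and length are passed as plain integers.
func rawWrite(fd uintptr, p []byte) (int, error) {
	if len(p) == 0 {
		return 0, nil
	}
	n, _, errno := syscall.Syscall(syscall.SYS_WRITE, fd,
		uintptr(unsafe.Pointer(&p[0])), uintptr(len(p)))
	if errno != 0 {
		return 0, errno
	}
	return int(n), nil
}

// roundTrip writes msg into a pipe with rawWrite and reads it back with
// the ordinary os/io API, to show both paths reach the same kernel object.
func roundTrip(msg string) (string, error) {
	r, w, err := os.Pipe()
	if err != nil {
		return "", err
	}
	defer r.Close()
	defer w.Close()

	n, err := rawWrite(w.Fd(), []byte(msg))
	if err != nil {
		return "", err
	}
	buf := make([]byte, n)
	if _, err := io.ReadFull(r, buf); err != nil {
		return "", err
	}
	return string(buf), nil
}

func main() {
	got, err := roundTrip("hello from a raw syscall")
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
	fmt.Println(got)
}
```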
However, as @retgits points out, on some platforms (darwin, for example) Go's approach is to piggyback off the fact that libc already has all of the logic for handling syscalls. mksyscall.go is a command-line program that generates the Go wrapper functions from the //sys prototype comments in the hand-written syscall files.
You can actually trace the life of a syscall if you compile a Go program like:
package main

import (
	"syscall"
)

func main() {
	var buf []byte
	syscall.Read(9, buf)
}
Run objdump -D on the resulting binary. The Go runtime is rather large, so your best bet is to find the main function, see where it calls syscall.Read and then search for the offsets from there: syscall.Read calls syscall.syscall, syscall.syscall calls runtime.libcCall (which switches from the Go ABI to the C ABI so that arguments are located where the OS expects; you can see this in the runtime for darwin, for example), runtime.libcCall calls runtime.asmcgocall, etc.
For extra fun, run that binary with gdb and continue stepping in until you hit the syscall.
The sys package takes care of the syscalls to the underlying OS. Depending on the OS you're using, different packages are used to generate the appropriate calls. Here is a link to the README for Go running on Unix systems: https://github.com/golang/sys/blob/master/unix/README.md. The parts on mksyscall.go, on the hand-written Go files (which implement system calls that need special handling), and on the types files should walk you through how it works.
The Go compiler (which translates the Go code to target CPU code) is written in Go but that is different to the run time support code which is what you are talking about. The standard library is mainly written in Go and probably knows how to directly make system calls with no C code involved. However, there may be a bit of C support code, depending on the target platform.
Related
So I am assuming that BPF_PROG_TYPE_SYSCALL programs are triggered whenever a particular syscall is executed inside the kernel. Can't BPF_PROG_TYPE_KPROBE ebpf programs be used for that purpose? kprobes can hook into any kernel function and syscalls are also kernel functions.
So what is the difference between the two types of programs and when to use which?
You would think that, but BPF_PROG_TYPE_SYSCALL is actually a program type that can execute syscalls itself (https://lwn.net/Articles/854228/). It was introduced as an attempt to make one BPF program load another, so that the first program can be signed with a certificate. It hasn't caught on very well yet as of writing this.
Indeed if you want to trigger on syscall execution, kprobes are the way to go.
I have read some Go code making use of syscall for low-level interaction with the underlying OS (e.g. Linux or Windows).
I wanted to make use of the same package for native Windows development, but reading its documentation says it's deprecated in favor of golang/x/sys:
$ go doc syscall
package syscall // import "syscall"
Package syscall contains an interface to the low-level operating system
primitives.
...
Deprecated: this package is locked down. Callers should use the
corresponding package in the golang.org/x/sys repository instead. That is
also where updates required by new systems or versions should be applied.
See https://golang.org/s/go1.4-syscall for more information.
Now, reading the documentation for golang/x/sys and inspecting its code, it relies heavily on and encourages the use of the syscall package:
https://github.com/golang/sys/blob/master/windows/svc/example/beep.go
package main
import (
"syscall"
)
var (
beepFunc = syscall.MustLoadDLL("user32.dll").MustFindProc("MessageBeep")
)
func beep() {
beepFunc.Call(0xffffffff)
}
and
https://godoc.org/golang.org/x/sys/windows#example-LoadLibrary
...
r, _, _ := syscall.Syscall(uintptr(proc), 0, 0, 0, 0)
...
Why does golang/x/sys rely and encourage the use of the package it's meant to replace?
Disclaimer: I'm pretty new to Go specifically (though not to low-level OS programming). Still, the path here seems clear.
Go, as an ecosystem (not just the language itself, but all the various libraries as well), tries[1] to be portable. But direct system calls are pretty much not portable at all. So there is some tension here automatically.
In order to do anything useful, the Go runtime needs various services from the operating system, such as creating OS-level threads, sending and receiving signals, opening files and network connections, and so on. Many of these operations can be, and have been, abstracted away from how it is done on operating systems A, B, and C to generic concepts supported by most or all OSes. These abstractions build on the actual mechanisms in the various OSes.
They may even do this in layers internally. A look at the Go source for the os package, for instance, shows file.go, file_plan9.go, file_posix.go, file_unix.go, and file_windows.go source files. The top of file_posix.go shows a +build directive:
// Copyright 2009 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
// +build aix darwin dragonfly freebsd js,wasm linux nacl netbsd openbsd solaris windows
Clearly this code itself is not completely portable, but the routines it implements for os, which are wrapped by the os.File abstraction, suffice for all POSIX-conformant systems. That reduces the amount of code that has to go in the Unix/Linux-specific file_unix.go file, for instance.
To the extent that OS-level operations can be wrapped into more-abstract, more-portable operations, then, the various built-in Go packages do this. You don't need to know whether there's a different system call for opening a device-file vs a text-file vs a binary-file, for instance, or a long pathname vs a short one: you just call os.Create or os.Open and it does any work necessary behind the scenes.
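A small sketch of what that portability buys you: the function below uses only os.Create, os.Open and friends, so the same source should build unchanged on Linux, Windows, and the other supported systems. The name roundTripFile and the file name note.txt are arbitrary choices for the example.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// roundTripFile (hypothetical) writes content to a file in a fresh
// temporary directory and reads it back, using only the portable os API.
// Which system calls this turns into is decided by the per-OS files
// inside the os package, not by this code.
func roundTripFile(content string) (string, error) {
	dir, err := os.MkdirTemp("", "portable-example")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "note.txt")
	f, err := os.Create(path)
	if err != nil {
		return "", err
	}
	if _, err := f.WriteString(content); err != nil {
		f.Close()
		return "", err
	}
	if err := f.Close(); err != nil {
		return "", err
	}

	in, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer in.Close()

	data, err := io.ReadAll(in)
	if err != nil {
		return "", err
	}
	return string(data), nil
}

func main() {
	got, err := roundTripFile("same code on every OS")
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
	fmt.Println(got)
}
```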
This whole idea just doesn't fly with system calls. A Linux system call to create a new UID namespace has no Windows equivalent.[2] A Windows WaitForMultipleObjects system call has no real equivalent on Linux. The low level details of a stat/lstat call differ from one system to another, and so on.
In early versions of Go, there was some attempt to paper over this with the syscall package. But the link you quoted—https://golang.org/s/go1.4-syscall—describes this attempt as, if not failed, at least overstretched. The last word in the "problems" section is "issues".
The proposal at this same link says that the syscall package is to be frozen (or mostly frozen) as of Go 1.4: don't put new features into it. But the features that are in it are sufficient to implement the new, experimental golang.org/x/sys/* packages, or at least some of them. There's no harm in the experimental package borrowing the existing, formally-deprecated syscall package if that does what the experimental new package needs.
Things in golang.org/x/ are experimental: feel free to use them, but be aware that there are no compatibility promises across version updates, unlike things in the standard packages. So, to answer the last line of your question:
Why does golang/x/sys rely [on] and encourage the use of the package it's meant to replace?
It relies on syscall because that's fine. It doesn't "encourage the use of" syscall at all though. It just uses it when that's sufficient. Should that become insufficient, for whatever reason, it will stop relying on it.
Answering a question you didn't ask (but I did): suppose you want Unix-specific stat information about a file, such as its inode number. You have a choice:
info, err := os.Stat(path) // or os.Lstat(path), etc
if err != nil { ... handle error ... }
raw, ok := info.Sys().(*syscall.Stat_t)
if !ok { ... do whatever is appropriate ... }
inodeNumber := raw.Ino
or:
var info unix.Stat_t
err := unix.Stat(path, &info) // or unix.Lstat, etc
if err != nil { ... handle error ... }
inodeNumber := info.Ino
The advantage to the first block of code is that you get all the other (portable) information about the file—its mode and size and time-stamps, for instance. You maybe do, maybe don't get the inode number; the !ok case tells you whether you did. The primary disadvantage here is that it takes more code to do this.
The advantage to the second block of code is that it says just what you mean. You either get all the information from the stat call, or none of it. The disadvantages are obvious:
it only works on Unix-ish systems, and
it uses an experimental package, whose behavior might change.
So it's up to you which of these matters more to you.
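For reference, here is a runnable version of the first approach, as a sketch that assumes a Unix-like target: syscall.Stat_t does not exist on Windows, so in a real cross-platform program this file would also need a build constraint. The helper name inodeOf is made up.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// inodeOf (hypothetical) returns the inode number of path. The boolean
// is false when the platform-specific data behind info.Sys() is not a
// *syscall.Stat_t, in which case no inode number is available.
func inodeOf(path string) (uint64, bool, error) {
	info, err := os.Stat(path) // or os.Lstat to inspect a symlink itself
	if err != nil {
		return 0, false, err
	}
	raw, ok := info.Sys().(*syscall.Stat_t)
	if !ok {
		return 0, false, nil
	}
	// The width of Ino differs between platforms, so convert explicitly.
	return uint64(raw.Ino), true, nil
}

func main() {
	ino, ok, err := inodeOf(".")
	switch {
	case err != nil:
		fmt.Fprintln(os.Stderr, "stat failed:", err)
		os.Exit(1)
	case !ok:
		fmt.Println("no inode information on this platform")
	default:
		fmt.Println("inode of current directory:", ino)
	}
}
```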
[1] Either this is a metaphor, or I've just anthropomorphized this. There's an old rule: Don't anthropomorphize computers, they hate that!
[2] A Linux UID namespace maps from UIDs inside a container to UIDs outside the container. That is, inside the container, a file might be owned by UID 1234. If the file is in a file system that is also mounted outside the container, that file can be owned by a different owner, perhaps 5678. Changing the ownership on either "side" of the container makes the change in that side's namespace; the change shows up on the other side as the result of mapping, or reverse-mapping, the ID through the namespace mapping.
(This same trick also works for NFS UID mappings, for instance. The Docker container example above is just one use, but probably the most notable one these days.)
I've read this tutorial
I could follow the guide and run the code, but I have questions.
1) Why do we need both a load address and a run-time address? As I understand it, it is because we have put .data in flash too; so why don't we run the app from there, instead of needing start-up code to copy it into RAM?
http://www.bravegnu.org/gnu-eprog/c-startup.html
2) Why do we need a linker script and start-up code here? Can I not just build the C source as below and run it with qemu?
arm-none-eabi-gcc -nostdlib -o sum_array.elf sum_array.c
Many thanks
Your first question was answered in the guide.
When you load a program on an operating system your .data section, basically non-zero globals, are loaded from the "binary" into the right offset in memory for you, so that when your program starts those memory locations that represent your variables have those values.
unsigned int x=5;
unsigned int y;
As a C programmer you write the above code and you expect x to be 5 when you first start using it, yes? Well, if you are booting from flash, bare metal, you don't have an operating system to copy that value into RAM for you; somebody has to do it. Further, all of the .data stuff has to be in flash: that number 5 has to be somewhere in flash so that it can be copied to RAM. So you need a flash address for it and a RAM address for it. Two addresses for the same thing.
And that begins to answer your second question. For every line of C code you write you assume things, for example that any function can call any other function. You would like to be able to call functions, yes? And you would like to be able to have local variables, and you would like the variable x above to be 5, and you might assume that y will be zero, although, thankfully, compilers are starting to warn about that. The startup code, at a minimum for generic C, sets up the stack pointer, which allows you to call other functions and have local variables and have functions more than one or two lines of code long; it zeros the .bss so that the y variable above is zero; and it copies the value 5 over to RAM so that x is ready to go when your entry point C function is run.
If you don't have an operating system then you have to have code to do this, and yes, there are many many many sandboxes and toolchains that are set up for various platforms that already have the startup code and linker script so that you can just
gcc -o myprog.elf myprog.c
Now that doesn't mean you can make system calls without a...system...printf, fopen, etc. But if you download one of these toolchains it does mean that you don't actually have to write the linker script or the bootstrap.
But it is still valuable information, note that the startup code and linker script are required for operating system based programs too, it is just that native compilers for your operating system assume you are going to mostly write programs for that operating system, and as a result they provide a linker script and startup code in that toolchain.
1) The .data section contains variables. Variables are, well, variable -- they change at run time. The variables need to be in RAM so that they can be easily changed at run time. Flash, unlike RAM, is not easily changed at run time. The flash contains the initial values of the variables in the .data section. The startup code copies the .data section from flash to RAM to initialize the run-time variables in RAM.
2) Linker-script: The object code created by your compiler has not been located into the microcontroller's memory map. This is the job of the linker and that is why you need a linker script. The linker script is input to the linker and provides some instructions on the location and extent of the system's memory.
Startup code: Your C program that begins at main does not run in a vacuum but makes some assumptions about the environment. For example, it assumes that the initialized variables are already initialized before main executes. The startup code is necessary to put in place all the things that are assumed to be in place when main executes (i.e., the "run-time environment"). The stack pointer is another example of something that gets initialized in the startup code, before main executes. And if you are using C++ then the constructors of static objects are called from the startup code, before main executes.
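Real startup code is written in assembly or C and takes its addresses from linker-script symbols, so the following is only a toy model, written in Go, of the two jobs described above: copying .data from its load address in flash to its run-time address in RAM, and zeroing .bss. The byte slices stand in for the two memories, and the section type, the startup function and all offsets are invented for the illustration.

```go
package main

import "fmt"

// section describes one region: where its bytes are stored in the image
// (load address, an offset into "flash") and where the program expects
// them at run time (run address, an offset into "RAM").
type section struct {
	loadAddr int
	runAddr  int
	size     int
}

// startup models what startup code does before main: copy the initial
// values of .data from flash into RAM, then clear .bss.
func startup(flash, ram []byte, data, bss section) {
	copy(ram[data.runAddr:data.runAddr+data.size],
		flash[data.loadAddr:data.loadAddr+data.size])
	for i := bss.runAddr; i < bss.runAddr+bss.size; i++ {
		ram[i] = 0
	}
}

func main() {
	// Two bytes of "code", then the initial value of x (5, little-endian).
	flash := []byte{0xAA, 0xBB, 5, 0, 0, 0}
	// RAM holds garbage at power-on.
	ram := []byte{0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF}

	data := section{loadAddr: 2, runAddr: 0, size: 4} // unsigned int x = 5;
	bss := section{runAddr: 4, size: 4}               // unsigned int y;

	startup(flash, ram, data, bss)
	fmt.Println(ram) // [5 0 0 0 0 0 0 0]
}
```

The point of the model is that x has two addresses (offset 2 in flash, offset 0 in RAM), while y has only a RAM address because there is nothing to store for it in flash.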
1) Why do we need both load-address and run-time address.
While it is in most cases possible to run code from memory-mapped ROM, code will often execute faster from RAM. In some cases there may also be much more RAM than ROM, and the application code may be compressed in ROM, so the executable code is not simply copied from ROM but also decompressed, allowing a much larger application than the available ROM would otherwise hold.
In situations where the code is stored on non-memory mapped mass-storage media such as NAND flash, it cannot be executed directly in any case and must be loaded into RAM by some sort of bootloader.
2) Why we need linker script and start-up code here. Can I not just build C source as below and run it with qemu?
The linker script defines the memory layout of your target and application. Since this tutorial is for bare-metal programming, there is no OS to handle that for you. Similarly, the start-up code is required to at least set an initial stack pointer, initialise static data, and jump to main. On an embedded system it is also necessary to initialise various hardware such as the PLL, memory controllers, etc.
Where can I find the source code of some of the system calls? For example, I am looking for the implementation of fstat as described here.
A system call is mostly implemented inside the Linux kernel, with a tiny bit of glue code in the C standard library. But see also vdso(7).
From the user-land point of view, a system call (they are listed in syscalls(2)) is a single machine instruction (SYSCALL on x86-64, often SYSENTER on 32-bit x86) with some calling conventions, e.g. defining which machine register holds the syscall number (such as __NR_stat from /usr/include/asm/unistd_64.h) and which other registers contain the arguments to the system call.
Use strace(1) to understand which system calls are done by a given program or process.
The C standard library has a tiny wrapper function (which invokes the kernel, following the ABI, and deals with error reporting & errno).
For stat(2), the C wrapping function is e.g. in stat/stat.c for musl-libc.
Inside the kernel code, most of the work happens in fs/stat.c (e.g. after line 207).
See also this & that answers
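The "error reporting & errno" part of the wrapper can be observed without any C. On Linux the kernel reports failure as a negative error code in the return register, and the libc wrapper turns that into a -1 return plus errno. Go's syscall.Syscall does the equivalent translation and hands back a syscall.Errno. The sketch below assumes a Linux target; rawClose is a name made up for the example.

```go
package main

import (
	"fmt"
	"syscall"
)

// rawClose (hypothetical) calls close by its syscall number and returns
// the resulting errno; a zero Errno means the call succeeded.
func rawClose(fd int) syscall.Errno {
	_, _, errno := syscall.Syscall(syscall.SYS_CLOSE, uintptr(fd), 0, 0)
	return errno
}

func main() {
	// -1 is never a valid descriptor, so the kernel should reject it.
	errno := rawClose(-1)
	fmt.Println("is EBADF:", errno == syscall.EBADF, "message:", errno)
}
```

Running the program under strace(1) should show the close(-1) call and the error the kernel returned for it.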
Is it possible to pin a softirq, or any other bottom half, to a processor? I doubt that this could be done from within the softirq code itself.
But then, inside a driver, is it possible to pin a particular IRQ to a core?
From user mode, you can easily do this by writing to /proc/irq/N/smp_affinity to control which processor(s) an interrupt is directed to. The symbols for the code implementing this are not exported though, so it's difficult to do from the kernel (at least for a loadable module which is how most drivers are structured).
The fact that the implementing function symbols aren't exported is a sign that the kernel developers don't want to encourage this. Presumably that's because it takes control away from the user and also embeds assumptions about the number of processors and so forth into the driver.
So, to answer your question, yes, it's possible, but it's discouraged, and you would need to do one of several "ugly" things to implement it ((a) change kernel exports, (b) link your driver statically into main kernel, or (c) open/write to the proc file from kernel mode).
The usual way to achieve this is by writing a user-mode program (can even be a shell script) that programs core numbers/masks into the appropriate proc file. See Documentation/IRQ-affinity.txt in the kernel source directory for details.
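Such a user-mode program might look like the sketch below. It assumes the hexadecimal bitmask format of /proc/irq/N/smp_affinity, restricts itself to CPUs 0 to 31 to keep the mask a single 32-bit word, and needs root privileges for the write to succeed. affinityMask and setIRQAffinity are names invented for the example, and the IRQ number comes from the command line.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// affinityMask (hypothetical) builds the hex bitmask for the given CPUs.
// Only CPUs 0-31 are handled in this sketch.
func affinityMask(cpus []int) (string, error) {
	var mask uint32
	for _, c := range cpus {
		if c < 0 || c > 31 {
			return "", fmt.Errorf("cpu %d out of range 0-31", c)
		}
		mask |= 1 << uint(c)
	}
	if mask == 0 {
		return "", fmt.Errorf("no CPUs given")
	}
	return fmt.Sprintf("%x", mask), nil
}

// setIRQAffinity (hypothetical) writes the mask to the proc file that
// controls which processors the interrupt is directed to.
func setIRQAffinity(irq int, cpus []int) error {
	mask, err := affinityMask(cpus)
	if err != nil {
		return err
	}
	path := fmt.Sprintf("/proc/irq/%d/smp_affinity", irq)
	return os.WriteFile(path, []byte(mask), 0o644)
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: pinirq IRQ")
		os.Exit(2)
	}
	irq, err := strconv.Atoi(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, "bad IRQ number:", err)
		os.Exit(2)
	}
	// Direct the interrupt to CPUs 0 and 1 (mask "3").
	if err := setIRQAffinity(irq, []int{0, 1}); err != nil {
		fmt.Fprintln(os.Stderr, "setting affinity failed:", err)
		os.Exit(1)
	}
}
```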