How to understand the relation between uintptr and struct in Go?

I have seen code like the following:

    func str2bytes(s string) []byte {
        x := (*[2]uintptr)(unsafe.Pointer(&s))
        h := [3]uintptr{x[0], x[1], x[1]}
        return *(*[]byte)(unsafe.Pointer(&h))
    }

This function converts a string to a []byte without copying the underlying data.
I tried to convert Num to ReverseNum in the same way:

    type Num struct {
        name  int8
        value int8
    }

    type ReverseNum struct {
        value int8
        name  int8
    }

    func main() {
        n := Num{100, 10}
        z := (*[2]uintptr)(unsafe.Pointer(&n))
        h := [2]uintptr{z[1], z[0]}
        fmt.Println(*(*ReverseNum)(unsafe.Pointer(&h))) // prints {0 0}
    }
This code doesn't produce the result I want. Can anybody tell me why?

That's too complicated. A simpler approach:
    package main

    import (
        "fmt"
        "unsafe"
    )

    type Num struct {
        name  int8
        value int8
    }

    type ReverseNum struct {
        value int8
        name  int8
    }

    func main() {
        n := Num{name: 42, value: 12}
        p := (*ReverseNum)(unsafe.Pointer(&n))
        fmt.Println(p.value, p.name)
    }
It outputs "42 12".
But the real question is why on Earth would you want to go for such trickery instead of copying two freaking bytes which is done instantly on any sensible CPU Go programs run on?
Another problem with your approach is that, IIUC, nothing in the Go language specification guarantees that two types with seemingly identical fields must have identical memory layouts. I believe they do on most implementations, but they are not required to.
Also consider that seemingly innocuous things like also having an extra field (even of type struct{}!) in your data type may do interesting things to memory layouts of the variables of those types, so it may be outright dangerous to assume you may reinterpret memory of Go variables the way you want.

... I just want to learn about the principle behind the package unsafe.
It's an escape hatch.
All strongly-typed but compiled languages have a basic problem: the actual machines on which the compiled programs will run do not have the same typing system as the compiler.¹ That is, the machine itself probably has a linear address space where bytes are assembled into machine words that are grouped into pages, and so on. The operating system may also provide access at, say, page granularity: if you need more memory, the OS will give you one page—4096 bytes, or 8192 bytes, or 65536 bytes, or whatever the page size is—of additional memory at a time.
There are many ways to attack this problem. For instance, one can write code directly in machine (or assembly) language, using the hardware's instruction set, to talk to the OS to achieve OS-level things. This code can then talk to the compiled program, acting as the go-between. If the compiled program needs to allocate a 40-byte data structure, this machine-level code can figure out how to do that within the strictures of the OS's page-size allocations.
But writing machine code is difficult and time-consuming. That's precisely why we have high-level languages and compilers in the first place. What if we had a way to, within the high-level language, violate the normal rules imposed by the language? By violating specific requirements in specific ways, carefully coordinating those ways with all other code that also violates those requirements, we can, in code we keep away from the usual application programming, write much of our memory-management, process-management, and so on in our high-level language.
In other words, we can use unsafe (or something similar in other languages) to deliberately break the type-safety provided by our high level language. When we do this—when we break the rules—we must know what all the rules are, and that our specific violations here will function correctly when combined with all the normal code that does obey the normal rules and when combined with all the special, unsafe code that breaks the rules.
This often requires help from the compiler itself. If you inspect the runtime source distributed with Go, you will find routines with annotations like go:noescape, go:noinline, go:nosplit, and go:nowritebarrier. You need to know when and why these are required if you are going to make much use of some of the escape-hatch programming.
A few of the simpler uses, such as tricks to gain access to string or slice headers, are ... well, they are still unsafe, but they are unsafe in more-predictable ways and do not require this kind of close coordination with the compiler itself.
To understand how, when, and why they work, you need to understand how the compiler and runtime allocate and work with strings and slices, and in some cases, how memory is laid out on the hardware, and some of the rules about Go's garbage collector. In particular, the GC code is aware of unsafe.Pointer but not of uintptr. Much of this is pretty tricky: see, e.g., https://utcc.utoronto.ca/~cks/space/blog/programming/GoUintptrVsUnsafePointer and the link to https://github.com/golang/go/issues/19135, in which writing nil to a Go pointer value caused Go's garbage collector to complain, because the write caused the GC to inspect the previously stored value, which was invalid.
¹ See this Wikipedia article on the Intel 432 for a notable attempt at designing hardware to run compiled high level languages. There have been others in the past as well, often with the same fate, though some IBM projects have been more successful.

Related

What is the best way to evict unused records from a string pool?

I am implementing a cache in Go. Let's say the cache is a sync.Map with an integer key and a struct value:

    type value struct {
        fileName     string
        functionName string
    }

A huge number of records share the same fileName and functionName. To save memory I want to use a string pool. Go has immutable strings, and my idea looks like this:
    var (
        cache      sync.Map
        stringPool sync.Map
    )

    type value struct {
        fileName     string
        functionName string
    }

    func addRecord(key int64, val value) {
        fileName, _ := stringPool.LoadOrStore(val.fileName, val.fileName)
        val.fileName = fileName.(string)
        functionName, _ := stringPool.LoadOrStore(val.functionName, val.functionName)
        val.functionName = functionName.(string)
        cache.Store(key, val)
    }
My idea is to keep every unique string (fileName and functionName) in memory once. Will it work?
The cache implementation must be concurrency-safe. The number of records in the cache is about 10^8; the number of records in the string pool is about 10^6.
I have some logic that removes records from the cache, so there is no problem with the main cache size.
Could you please suggest how to manage the string pool size?
I am thinking about storing a reference count for every record in the string pool, but that would require additional synchronization, or probably global locks, to maintain. I would like the implementation to be as simple as possible; as you can see in my code snippet, I don't use additional mutexes.
Or maybe I need to follow a completely different approach to minimize memory usage for my cache?
What you are trying to do with stringPool is commonly known as string interning. There are libraries like github.com/josharian/intern that provide "good enough" solutions to that kind of problem, and that do not require you to manually maintain the stringPool map. Note that no solution (including yours, assuming you eventually remove some elements from stringPool) can reliably deduplicate 100% of strings without incurring impractical levels of CPU overhead.
As a side note, it's worth pointing out that sync.Map is not really designed for update-heavy workloads. Depending on the keys used, you may actually experience significant contention when calling cache.Store. Furthermore, since sync.Map relies on interface{} for both keys and values, it normally incurs many more allocations than a plain map. Make sure to benchmark with realistic workloads to ensure that you picked the right approach.

How to get the size of the "_cl_device_id" struct in OpenCL?

cl_device_id is defined as "typedef struct _cl_device_id *cl_device_id".
In the OpenCL function clGetDeviceIDs, the "devices" parameter is of type "cl_device_id *" and returns a pointer to the list of devices available. I'm trying to copy the whole struct to another variable using memcpy, for which I need to know the size of the "_cl_device_id" struct.
_cl_device_id is internal to the platform (like all _cl_something structs). Furthermore, an OpenCL program can have multiple platforms loaded, and for each platform the struct sizes can (and likely will) be different.
All cl_objects in general are opaque pointers, and (in general) copying around hidden internal structs of a C library is a pretty extreme approach, almost guaranteed to screw things up (unless you're working on implementing a debugger or such).
But anyway leaving motivation aside, the answer is: you can't tell, since you don't know until the program actually runs and loads the OpenCL implementation(s).

Is it necessary to return pointer type in sync.Pool New function?

I saw an issue on GitHub which says sync.Pool should be used only with pointer types, for example:

    var TPool = sync.Pool{
        New: func() interface{} {
            return new(T)
        },
    }
Does that make sense? What about returning T{} instead? Which is the better choice, and why?
The whole point of sync.Pool is to avoid (expensive) allocations. Large-ish buffers, etc. You allocate a few buffers and they stay in memory, available for reuse. Hence the use of pointers.
But here you'll be copying the values on every step, defeating the purpose. (Assuming your T is a "normal" struct and not something like SliceHeader)
It is not necessary. In most cases it should be a pointer, as you want to share an object, not make copies of it.
In some use cases this can be a non-pointer type, such as an id of some external resource. I can imagine a pool of paths (mounted disk drives), represented as strings, on which some large file operations are being conducted.

Stimulate code-inlining

Unlike in languages like C++, where you can explicitly state inline, in Go the compiler automatically detects functions that are candidates for inlining (C++ compilers can do that too, but Go offers no explicit counterpart). There is also a debug option to see the inlining decisions being made, yet very little is documented online about the exact logic the Go compiler(s) use for this.
Let's say I need to rerun some big loop over a set of data every n periods:

    func Encrypt(password []byte) ([]byte, error) {
        return bcrypt.GenerateFromPassword(password, 13)
    }

    for id, data := range someDataSet {
        newPassword, _ := Encrypt([]byte("generatedSomething"))
        data["password"] = newPassword
        someSaveCall(id, data)
    }
Aiming, for example, for Encrypt to be inlined properly, what logic do I need to take into consideration for the compiler?
I know from C++ that passing by reference increases the likelihood of automatic inlining without the explicit inline keyword, but it's not easy to understand what the Go compiler does to decide whether or not to inline. Scripting languages like PHP, for example, suffer immensely if you call a constant addSomething($a, $b) in a loop; benchmarking a billion such cycles, its cost versus a plain $a + $b (inline) is almost ridiculous.
Until you have performance problems, you shouldn't care. Inlined or not, it will do the same.
If performance does matter and it makes a noticeable and significant difference, then don't rely on current (or past) inlining conditions; "inline" it yourself (do not put it in a separate function).
The rules can be found in the $GOROOT/src/cmd/compile/internal/inline/inl.go file. You may control its aggressiveness with the 'l' debug flag.
// The inlining facility makes 2 passes: first caninl determines which
// functions are suitable for inlining, and for those that are it
// saves a copy of the body. Then InlineCalls walks each function body to
// expand calls to inlinable functions.
//
// The Debug.l flag controls the aggressiveness. Note that main() swaps level 0 and 1,
// making 1 the default and -l disable. Additional levels (beyond -l) may be buggy and
// are not supported.
// 0: disabled
// 1: 80-nodes leaf functions, oneliners, panic, lazy typechecking (default)
// 2: (unassigned)
// 3: (unassigned)
// 4: allow non-leaf functions
//
// At some point this may get another default and become switch-offable with -N.
//
// The -d typcheckinl flag enables early typechecking of all imported bodies,
// which is useful to flush out bugs.
//
// The Debug.m flag enables diagnostic output. a single -m is useful for verifying
// which calls get inlined or not, more is for debugging, and may go away at any point.
Also check out the blog post Dave Cheney - Five things that make Go fast (2014-06-07), which discusses inlining (it's a long post; search for the word "inline", about halfway through).
There is also an interesting discussion about inlining improvements (maybe Go 1.9?): cmd/compile: improve inlining cost model #17566
Better still, don’t guess, measure!
You should trust the compiler and avoid trying to guess its inner workings as it will change from one version to the next.
There are far too many tricks the compiler, the CPU or the cache can play to be able to predict performance from source code.
What if inlining makes your code bigger to the point that it doesn’t fit in the cache line anymore, making it much slower than the non-inlined version? Cache locality can have a much bigger impact on performance than branching.

Protecting memory from changing

Is there a way to protect an area of the memory?
I have this struct:

    #include <stdio.h>

    #define BUFFER 4

    struct {
        char s[BUFFER - 1];
        const char zc;
    } str = {'\0'};

    int main(void)
    {
        printf("'%s', zc=%d\n", str.s, str.zc);
        return 0;
    }
It is supposed to operate on strings of length BUFFER-1 and guarantee that they end in '\0'.
But the compiler gives an error only for:

    str.zc = 'e';   /* error */

Not for:

    str.s[3] = 'e'; /* no error */

If compiling with gcc and some flag would catch this, that would be good as well.
Thanks,
Beco
To detect errors at runtime, take a look at the -fstack-protector-all option in gcc. It may be of limited use when attempting to detect very small overflows like the one you described.
Unfortunately you aren't going to find a lot of info on detecting buffer overflow scenarios like the one you described at compile-time. From a C language perspective the syntax is totally correct, and the language gives you just enough rope to hang yourself with. If you really want to protect your buffers from yourself you can write a front-end to array accesses that validates the index before it allows access to the memory you want.
