Is there an idiomatic way to malloc and memcpy a struct? - go

In plain C, if I want a shallow heap copy of a struct, I would malloc() and memcpy() it.
In Go, I guess I have to do something like this:
original := Data{...}
copy := &Data{} // malloc
*copy = original // memcpy
But it doesn't look nice to me, nor idiomatic. What's the "right" way to do it?

The idiomatic way is to do a simple assignment and let the compiler allocate copy on the heap after performing escape analysis:
original := Data{...}
copy := original
return &copy // Or call some function with &copy as a parameter
When the compiler notices that the address of copy is taken and may outlive the function's stack frame, it allocates the variable on the heap rather than on the stack (the copy itself is still performed correctly, of course).
We no longer need to care about the heap ourselves: the compiler allocates there as needed, based on escape analysis. Our only concern is the copy itself.
You can see an example in action on godbolt:
Given the following simple code:
func main() {
type Data struct{
foo string
}
original := Data{"hi"}
copy := original
copyPtr := &copy
fmt.Println(copyPtr)
}
Go will automatically allocate copy on the heap:
call runtime.newobject(SB)
We can also see this in action by passing extra flags at compile time showing escape and inlining decisions:
$ go build -gcflags '-m' .
...
./main.go:11:2: moved to heap: copy
...
Note: copy is a builtin function. It might be a good idea to avoid reusing the name (it works just fine, but it's not great practice).
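If the copy-then-take-address pattern is needed in several places, it can be wrapped in a small helper. The sketch below is only an illustration: the Data fields and the name cloneData are hypothetical and do not come from the question. Because the argument is passed by value, the parameter is already a shallow copy, and returning its address leaves the stack-or-heap decision to escape analysis.

```go
package main

import "fmt"

// Data is a hypothetical struct standing in for the one in the question.
type Data struct {
	Name  string
	Count int
}

// cloneData returns a pointer to a shallow copy of d.
// d is passed by value, so it is already a copy of the caller's struct;
// returning its address makes it escape, and the compiler allocates it accordingly.
func cloneData(d Data) *Data {
	return &d
}

func main() {
	original := Data{Name: "hi", Count: 1}
	dup := cloneData(original)
	dup.Count = 2 // modifies only the copy

	fmt.Println(original.Count, dup.Count) // prints "1 2"
}
```

The helper also avoids shadowing the builtin copy, since the duplicate gets its own name.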

A struct variable in Golang can be copied to another simply by an assignment statement:
https://play.golang.org/p/4Zcbxhy5UoB
package main
import (
"fmt"
)
type User struct {
name string
}
func main() {
u1 := User{name: "foo"}
u2 := u1
u2.name = "bar"
fmt.Println("u1: ", u1)
fmt.Println("u2: ", u2)
}
output:
u1: {foo}
u2: {bar}
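As with memcpy in C, assignment produces a shallow copy: fields such as slices, maps and pointers still refer to the same underlying data. The following sketch illustrates the difference, assuming a hypothetical Team type and hypothetical helpers shallowCopy and deepCopy that are not part of the answer above.

```go
package main

import "fmt"

// Team is a hypothetical struct with a slice field.
type Team struct {
	Name    string
	Members []string
}

// shallowCopy relies on plain assignment; the Members slices share a backing array.
func shallowCopy(t Team) Team {
	c := t
	return c
}

// deepCopy additionally clones the slice so the two values are independent.
func deepCopy(t Team) Team {
	c := t
	c.Members = append([]string(nil), t.Members...)
	return c
}

func main() {
	t1 := Team{Name: "a", Members: []string{"foo"}}
	t2 := shallowCopy(t1)
	t3 := deepCopy(t1)

	t2.Members[0] = "bar" // also visible through t1: same backing array

	fmt.Println(t1.Members[0], t2.Members[0], t3.Members[0]) // prints "bar bar foo"
}
```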

Related

Go vet reports "possible misuse of reflect.SliceHeader"

I have the following code snippet which "go vet" complains about with the warning "possible misuse of reflect.SliceHeader". I cannot find much information about this warning other than this. After reading that, it is not very clear to me what is needed to do this in a way that makes go vet happy, and without possible GC issues.
The goal of the snippet is to have a go function copy data to memory which is managed by an opaque C library. The Go function expects a []byte as a parameter.
func Callback(ptr unsafe.Pointer, buffer unsafe.Pointer, size C.longlong) C.longlong {
...
sh := &reflect.SliceHeader{
Data: uintptr(buffer),
Len: int(size),
Cap: int(size),
}
buf := *(*[]byte)(unsafe.Pointer(sh))
err := CopyToSlice(buf)
if err != nil {
log.Fatal("failed to copy to slice")
}
...
}
https://pkg.go.dev/unsafe@go1.19.4#Pointer
Pointer represents a pointer to an arbitrary type. There are four
special operations available for type Pointer that are not available
for other types:
A pointer value of any type can be converted to a Pointer.
A Pointer can be converted to a pointer value of any type.
A uintptr can be converted to a Pointer.
A Pointer can be converted to a uintptr.
Pointer therefore allows a program to defeat the type system and read
and write arbitrary memory. It should be used with extreme care.
The following patterns involving Pointer are valid. Code not using
these patterns is likely to be invalid today or to become invalid in
the future. Even the valid patterns below come with important caveats.
Running "go vet" can help find uses of Pointer that do not conform to
these patterns, but silence from "go vet" is not a guarantee that the
code is valid.
(6) Conversion of a reflect.SliceHeader or reflect.StringHeader Data
field to or from Pointer.
As in the previous case, the reflect data structures SliceHeader and
StringHeader declare the field Data as a uintptr to keep callers from
changing the result to an arbitrary type without first importing
"unsafe". However, this means that SliceHeader and StringHeader are
only valid when interpreting the content of an actual slice or string
value.
var s string
hdr := (*reflect.StringHeader)(unsafe.Pointer(&s)) // case 1
hdr.Data = uintptr(unsafe.Pointer(p)) // case 6 (this case)
hdr.Len = n
In this usage hdr.Data is really an alternate way to refer to the
underlying pointer in the string header, not a uintptr variable
itself.
In general, reflect.SliceHeader and reflect.StringHeader should be used only as *reflect.SliceHeader and *reflect.StringHeader pointing at actual slices or strings, never as plain structs. A program should not declare or allocate variables of these struct types.
// INVALID: a directly-declared header will not hold Data as a reference.
var hdr reflect.StringHeader
hdr.Data = uintptr(unsafe.Pointer(p))
hdr.Len = n
s := *(*string)(unsafe.Pointer(&hdr)) // p possibly already lost
It looks like JimB (from the comments) hinted at the most correct answer, though he didn't post it as an answer and he didn't include an example. The following uses unsafe.Slice (available since Go 1.17), passes go vet, staticcheck, and golangci-lint, and doesn't segfault, so I think it is the correct answer.
func Callback(ptr unsafe.Pointer, buffer unsafe.Pointer, size C.longlong) C.longlong {
...
buf := unsafe.Slice((*byte)(buffer), size)
err := CopyToSlice(buf)
if err != nil {
log.Fatal("failed to copy to slice")
}
...
}

How to know whether Go allocated a variable on the heap or the stack?

I read the Go FAQ (https://go.dev/doc/faq#stack_or_heap) and I want to know when Go allocates a variable on the stack or on the heap, so I wrote the code below:
package main
import (
"fmt"
)
type Object struct {
Field int
}
func main() {
A := Object{1}
B := Object{2}
fmt.Println(A,B)
//fmt.Printf("A:%p;B:%p\n",&A,&B)
//m := testStackOrHeap()
//C:=m[A]
//D:=m[B]
//fmt.Printf("C:%p;D:%p\n",&C,&D)
}
//go:noinline
func testStackOrHeap() map[Object]Object {
one:=1
two:=2
A := Object{one}
B := Object{two}
C:= Object{one}
D := Object{two}
fmt.Println(C,D)
fmt.Printf("A:%p;B:%p\n",&A,&B)
m := map[Object]Object{A: A, B: B}
return m
}
Then I checked how the compiler allocates the memory with the command go tool compile "-m" main.go
The output is below:
main.go:15:13: inlining call to fmt.Println
main.go:30:13: inlining call to fmt.Println
main.go:31:12: inlining call to fmt.Printf
main.go:15:13: A escapes to heap
main.go:15:13: B escapes to heap
main.go:15:13: []interface {} literal does not escape
main.go:26:2: moved to heap: A
main.go:27:2: moved to heap: B
main.go:30:13: C escapes to heap
main.go:30:13: D escapes to heap
main.go:30:13: []interface {} literal does not escape
main.go:31:12: []interface {} literal does not escape
main.go:32:24: map[Object]Object literal escapes to heap
<autogenerated>:1: .this does not escape
My question is:
Why doesn't Go allocate the variables A and B in testStackOrHeap() on the stack? They do not escape the stack frame. If they are allocated on the heap, the GC worker needs to collect them, but if they were allocated on the stack, they would be released when the function returns.
As @Volker pointed out in a comment, the heap/stack distinction is an implementation detail, and the rules for escape analysis are defined by the compiler, not by the language. Correctness is the most important trait of a compiler, so a compiler's rules will frequently favor simplicity and performance over absolute "optimalness".
In this case, it's quite likely that the compiler doesn't know what fmt.Printf() will do with the pointers it receives. Therefore, it has to assume that the pointers might be stored somewhere on the heap by that function and that the references to those two objects might thus survive the call to testStackOrHeap(). Therefore, it errs on the side of caution and promotes those two variables to the heap.
(Note that your conclusion that they do not escape was presumably based on an assumption that fmt.Printf() won't store the pointers. Did you actually read the source code of that function to learn that it doesn't? If not, you can't actually be sure that it doesn't - just like the compiler isn't sure. And even if the current version of that function doesn't, future versions might.)
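To see the difference in isolation, here is a minimal sketch with two hypothetical functions (sumOnStack and escapeToHeap are not from the question). In the first, no address leaves the function; in the second, the address is returned, so the variable must outlive the call. Building with go build -gcflags "-m" should report a "moved to heap" decision for the second case only, though the exact diagnostics depend on the compiler version.

```go
package main

import "fmt"

type Object struct {
	Field int
}

// sumOnStack never lets an address of a or b leave the function,
// so the compiler is free to keep both on the stack.
//
//go:noinline
func sumOnStack() int {
	a := Object{1}
	b := Object{2}
	return a.Field + b.Field
}

// escapeToHeap returns the address of a local variable, so the variable
// has to outlive the call and is a candidate for heap allocation.
//
//go:noinline
func escapeToHeap() *Object {
	a := Object{1}
	return &a
}

func main() {
	fmt.Println(sumOnStack(), escapeToHeap().Field) // prints "3 1"
}
```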

Lifetime of local variable appended as a pointer in Go

I'm learning Go and have a C/C++ background. In the following example, is it safe to append the address of a into slice? When I run this example, the correct value (2) is printed, but wanted to be sure. If this is wrong, how should I do it?
func add(mapping map[string]*[]*int) {
sliceptr := &[]*int{}
mapping["foo"] = sliceptr
ele := mapping["foo"]
a := 2
// won't address of `a` go out of scope?
ele2 := append(*ele, &a)
mapping["foo"] = &ele2
}
func main() {
mapping := map[string]*[]*int{}
add(mapping)
fmt.Println(*(*mapping["foo"])[0])
}
It's safe to reference a after the function declaring it ends, because Go does escape analysis. If the compiler can prove that the variable is not referenced after the function returns, it puts it on the stack; if not, it allocates it on the heap.
Build flags can give some insight into the escape analysis:
go build -gcflags "-m" main.go
...
./main.go:10:2: moved to heap: a
...
This might be helpful: Allocation efficiency.
Also, it's less common to see pointers to slices, since a slice is small: a pointer, length and capacity. See slice internals.
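Following that last remark, the example from the question could be simplified by storing the slice itself in the map. This is only a sketch of one possible rewrite, not the asker's code: the map type is changed to map[string][]*int, and the append result is stored back under the same key.

```go
package main

import "fmt"

// add appends the address of a local variable to the slice stored under "foo".
// The variable a escapes, so it remains valid after add returns.
func add(mapping map[string][]*int) {
	a := 2
	// append works on a nil slice, so no explicit initialisation is needed.
	mapping["foo"] = append(mapping["foo"], &a)
}

func main() {
	mapping := map[string][]*int{}
	add(mapping)
	fmt.Println(*mapping["foo"][0]) // prints "2"
}
```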

Println changes capacity of a slice

Consider the following code
package main
import (
"fmt"
)
func main() {
x := []byte("a")
fmt.Println(x)
fmt.Println(cap(x) == cap([]byte("a"))) // prints false
y := []byte("a")
fmt.Println(cap(y) == cap([]byte("a"))) // prints true
}
https://play.golang.org/p/zv8KQekaxH8
Calling simple Println with a slice variable, changes its capacity. I suspect calling any function with variadic parameters of ...interface{} produces the same effect. Is there any sane explanation for such behavior?
The explanation is, as bradfitz points out on GitHub, that if you don't use make to create a slice, the compiler will use whatever capacity it finds convenient. Creating multiple slices in different versions, or even in the same version, can result in slices of different capacities.
In short, if you need a concrete capacity, use make([]byte, len, cap). Otherwise you can't rely on a fixed capacity.
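A minimal sketch of that advice, assuming a hypothetical helper newBuf that is not part of the answer: the slice is created with make and an explicit capacity, then filled from the string, so the capacity no longer depends on how the compiler handles the conversion.

```go
package main

import "fmt"

// newBuf returns a byte slice holding s with exactly the requested capacity.
// make panics if capacity is smaller than len(s).
func newBuf(s string, capacity int) []byte {
	b := make([]byte, len(s), capacity)
	copy(b, s) // the builtin copy accepts a string source for a []byte destination
	return b
}

func main() {
	x := newBuf("a", 8)
	fmt.Println(x)              // prints "[97]"
	fmt.Println(len(x), cap(x)) // prints "1 8"
}
```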

Can I Use the Address of a returned value? [duplicate]

What's the cleanest way to handle a case such as this:
func a() string {
/* doesn't matter */
}
var b *string = &a()
This generates the error:
cannot take the address of a()
My understanding is that Go automatically promotes a local variable to the heap if its address is taken. Here it's clear that the address of the return value is to be taken. What's an idiomatic way to handle this?
The address operator returns a pointer to something having a "home", e.g. a variable. The value of the expression in your code is "homeless". If you really need a *string, you'll have to do it in 2 steps:
tmp := a(); b := &tmp
Note that while there are completely valid use cases for *string, many times it's a mistake to use them. In Go string is a value type, but a cheap one to pass around (a pointer and an int). String's value is immutable, changing a *string changes where the "home" points to, not the string value, so in most cases *string is not needed at all.
See the relevant section of the Go language spec. & can only be used on:
Something that is addressable: variable, pointer indirection, slice indexing operation, field selector of an addressable struct, array indexing operation of an addressable array; OR
A composite literal
What you have is neither of those, so it doesn't work.
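The operand kinds listed above can be sketched as follows. The type point and the function addressable are hypothetical names used only for this illustration; each line takes the address of one of the allowed operand kinds.

```go
package main

import "fmt"

// point is a hypothetical struct used only for this illustration.
type point struct {
	x, y int
}

// addressable takes the address of each operand kind that & accepts
// and returns the values read back through those pointers.
func addressable() []int {
	v := 1
	s := []int{10, 20}
	arr := [2]int{30, 40}
	pt := point{50, 60}

	p1 := &v            // variable
	p2 := &s[1]         // slice indexing operation
	p3 := &arr[0]       // indexing of an addressable array
	p4 := &pt.y         // field selector of an addressable struct
	p5 := &(*p1)        // pointer indirection
	p6 := &point{70, 80} // composite literal

	return []int{*p1, *p2, *p3, *p4, *p5, p6.x}
}

func main() {
	fmt.Println(addressable()) // prints "[1 20 30 60 1 70]"
}
```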
I'm not even sure what it would mean even if you could do it. Taking the address of the result of a function call? Usually, you pass a pointer of something to someone because you want them to be able to assign to the thing pointed to, and see the changes in the original variable. But the result of a function call is temporary; nobody else "sees" it unless you assign it to something first.
If the purpose of creating the pointer is to create something with a dynamic lifetime, similar to new() or taking the address of a composite literal, then you can assign the result of the function call to a variable and take the address of that.
In the end you are proposing that Go should allow you to take the address of any expression, for example:
i,j := 1,2
var p *int = &(i+j)
println(*p)
The current Go compiler prints the error: cannot take the address of i + j
In my opinion, allowing the programmer to take the address of any expression:
Doesn't seem to be very useful (that is: it seems to have very small probability of occurrence in actual Go programs).
It would complicate the compiler and the language spec.
It seems counterproductive to complicate the compiler and the spec for little gain.
I recently was tied up in knots about something similar.
First talking about strings in your example is a distraction, use a struct instead, re-writing it to something like:
func a() MyStruct {
/* doesn't matter */
}
var b *MyStruct = &a()
This won't compile because you can't take the address of a(). So do this:
func a() MyStruct {
/* doesn't matter */
}
tmpA := a()
var b *MyStruct = &tmpA
This will compile, but you've returned a MyStruct on the stack and copied it into tmpA, which the compiler moves to the heap if its address escapes. If you want to avoid this copy, then write it like this:
func a2() *MyStruct {
/* doesn't matter as long as MyStruct is created on the heap (e.g. use 'new') */
}
var a *MyStruct = a2()
Copying is normally inexpensive, but those structs might be big. Even worse when you want to modify the struct and have it 'stick' you can't be copying then modifying the copies.
Anyway, it gets all the more fun when you're using a return type of interface{}. The interface{} can be the struct or a pointer to a struct. The same copying issue comes up.
You can't get the reference of the result directly when assigning to a new variable, but there is an idiomatic way to do this without a temporary variable, by simply pre-declaring your "b" pointer. This is the real step you missed:
func a() string {
return "doesn't matter"
}
b := new(string) // b is a pointer to a blank string (the "zeroed" value)
*b = a() // b is now a pointer to the result of `a()`
*b dereferences the pointer and directly accesses the memory that holds your data (on the heap if the pointer escapes, as decided by escape analysis).
Play with the code: https://play.golang.org/p/VDhycPwRjK9
Yeah, it can be annoying when APIs require the use of *string inputs even though you’ll often want to pass literal strings to them.
For this I make a very tiny function:
// Return pointer version of string
func p(s string) *string {
return &s
}
and then instead of trying to call foo("hi") and getting the dreaded cannot use "hi" (type string) as type *string in argument to foo, I just wrap the argument in a call to p():
foo(p("hi"))
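With type parameters (Go 1.18 and later), the same trick can be written once for any type. The sketch below is a generic variant of the helper above; the names ptr and greet are hypothetical, and greet merely stands in for an API that takes a *string.

```go
package main

import "fmt"

// ptr returns a pointer to a copy of v. The parameter is addressable,
// unlike a function result or a literal, so taking its address is allowed.
func ptr[T any](v T) *T {
	return &v
}

// greet is a hypothetical function standing in for an API that expects *string.
func greet(name *string) string {
	if name == nil {
		return "hello, stranger"
	}
	return "hello, " + *name
}

func main() {
	fmt.Println(greet(ptr("gopher"))) // prints "hello, gopher"
	fmt.Println(*ptr(42))             // prints "42"
}
```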
a() doesn't point to a variable as it is on the stack. You can't point to the stack (why would you ?).
You can do that if you want
va := a()
b := &va
But what you really want to achieve is somewhat unclear.
At the time of writing this, none of the answers really explain the rationale for why this is the case.
Consider the following:
func main() {
m := map[int]int{}
val := 1
m[0] = val
v := &m[0] // won't compile, but let's assume it does
delete(m, 0)
fmt.Println(v)
}
If this code snippet actually compiled, what would v point to!? It's a dangling pointer since the underlying object has been deleted.
Given this, it seems like a reasonable restriction to disallow addressing temporaries.
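The version that does compile copies the map element into a variable first. In this sketch (snapshot is a hypothetical helper name), the pointer refers to the copy, so it stays valid after the entry is deleted from the map.

```go
package main

import "fmt"

// snapshot copies the element stored under key into a local variable and
// returns the address of that copy. The pointer does not refer to the map's
// internal storage, so later changes to the map cannot invalidate it.
func snapshot(m map[int]int, key int) *int {
	v := m[key]
	return &v
}

func main() {
	m := map[int]int{0: 1}
	p := snapshot(m, 0)
	delete(m, 0)

	fmt.Println(*p, len(m)) // prints "1 0"
}
```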
I guess you need help from More Effective C++ ;-)
Temp obj and rvalue
“True temporary objects in C++ are invisible - they don't appear in your source code. They arise whenever a non-heap object is created but not named. Such unnamed objects usually arise in one of two situations: when implicit type conversions are applied to make function calls succeed and when functions return objects.”
And from Primer Plus
lvalue is a data object that can be referenced by address through user (named object). Non-lvalues include literal constants (aside from the quoted strings, which are represented by their addresses), expressions with multiple terms, such as (a + b).
In Go, a string literal is converted into a StrucType object, which is a non-addressable temporary struct object. For this reason, a string literal cannot be referenced by address in Go.
Last but not least, there is one exception in Go: you can take the address of a composite literal. OMG, what a mess.
