I see in Essential Go that using a mutex within a struct is not too straight-forward. To quote from the Mutex Gotchas page:
Don’t copy mutexes
A copy of sync.Mutex variable starts with the same state as original
mutex but it is not the same mutex.
It’s almost always a mistake to copy a sync.Mutex e.g. by passing it
to another function or embedding it in a struct and making a copy of
that struct.
If you want to share a mutex variable, pass it as a pointer
*sync.Mutex.
I'm not quite sure I fully understand exactly what's written. I looked here but still wasn't totally clear.
Taking the Essential Go example of a Set, should I be using the mutex like this:
type StringSet struct {
m map[string]struct{}
mu sync.RWMutex
}
or like this?
type StringSet struct {
m map[string]struct{}
mu *sync.RWMutex
}
I tried both with the Delete() function within the example and they both work in the Playground.
// Delete removes a string from the set
func (s *StringSet) Delete(str string) {
s.mu.Lock()
defer s.mu.Unlock()
delete(s.m, str)
}
There will obviously be several instances of a 'Set', and hence each instance should have its own mutex. In such a case, is it preferable to use the mutex or a pointer to the mutex?
Use the first method (a plain Mutex, not a pointer to a mutex), and pass around a *StringSet (pointer to your struct), not a plain StringSet.
In the code you shared in your playground (that version) :
.Add(), .Exists() and .Strings() should acquire the lock,
otherwise your code fits a regular use of structs and mutexes in go.
The "Don't copy mutexes" gotcha would apply if you manipulated plain StringSet structs :
var setA StringSet
setA.Add("foo")
setA.Add("bar")
func buggyFunction(s StringSet) {
...
}
// the gotcha would occur here :
var setB = setA
// or here :
buggyFunction(setA)
In both cases above : you would create a copy of the complete struct
so setB, for example, would manipulate the same underlying map[string]struct{} mapping as setA, but the mutex would not be shared : calling setA.m.Lock() wouldn't prevent modifying the mapping from setB.
If you go the first way, i.e.
type StringSet struct {
m map[string]struct{}
mu sync.RWMutex
}
any accidental or intentional assignment/copy of a struct value will create a new mutex, but the underlying map m will be the same (because maps are essentially pointers). As a result, it will be possible to modify/access the map concurrently without locking. Of course, if you strictly follow a rule "thou shalt not copy a set", that won't happen, but it doesn't make much sense to me.
TL;DR: definitely the second way.
Related
I'm surprised to see the lack of (in any) modules for a configuration module in Go which is thread safe for concurrent reads/writes.
I'm surprised there is no easy method to basically have something like https://github.com/spf13/viper, but thread safe.. where the Get/Set holds a Lock.
what's the right Go way to handle this without bloating code?
I normally use: https://github.com/spf13/viper however if the program for example as a GUI configuration is changeable during runtime, this package doesn't work.
I started doing the following
var config struct {
lock sync.RWMutex
myString string
myInt int
}
func main() {
config.lock.RLock()
_ = config.myString // any read
config.lock.RUnlock()
}
however this becomes very very tedious when accessing members to each time Lock/Unlock every single access for a read or a write and code becomes bloated and repetitive.
I typically use a configuration struct that is retrieved and updated atomically. This allows multiple fields to be updated in a single "transaction" and is easy to implement/use.
With Go1.18 and earlier this can be implemented with atomic.Value and wrap with some simple helper functions to convert the type from interface{} to your *config.
It is even simpler with atomic.Pointer in Go 1.19:
package main
import (
"sync/atomic"
)
type config struct {
str string
num int
}
var cfg atomic.Pointer[config]
func main() {
// Store initial configuration.
cfg.Store(&config{str: "foo", num: 42})
// Clone and modify multiple fields.
newCfg := *cfg.Load()
newCfg.str = "bar"
newCfg.num++
cfg.Store(&newCfg)
// Retrieve a value.
_ = cfg.Load().str
}
With Go1.18 and earlier you can use atomic.Value and wrap them with some simple helper functions to convert the type from interface{} to your *config.
What's the cleanest way to handle a case such as this:
func a() string {
/* doesn't matter */
}
b *string = &a()
This generates the error:
cannot take the address of a()
My understanding is that Go automatically promotes a local variable to the heap if its address is taken. Here it's clear that the address of the return value is to be taken. What's an idiomatic way to handle this?
The address operator returns a pointer to something having a "home", e.g. a variable. The value of the expression in your code is "homeless". if you really need a *string, you'll have to do it in 2 steps:
tmp := a(); b := &tmp
Note that while there are completely valid use cases for *string, many times it's a mistake to use them. In Go string is a value type, but a cheap one to pass around (a pointer and an int). String's value is immutable, changing a *string changes where the "home" points to, not the string value, so in most cases *string is not needed at all.
See the relevant section of the Go language spec. & can only be used on:
Something that is addressable: variable, pointer indirection, slice indexing operation, field selector of an addressable struct, array indexing operation of an addressable array; OR
A composite literal
What you have is neither of those, so it doesn't work.
I'm not even sure what it would mean even if you could do it. Taking the address of the result of a function call? Usually, you pass a pointer of something to someone because you want them to be able to assign to the thing pointed to, and see the changes in the original variable. But the result of a function call is temporary; nobody else "sees" it unless you assign it to something first.
If the purpose of creating the pointer is to create something with a dynamic lifetime, similar to new() or taking the address of a composite literal, then you can assign the result of the function call to a variable and take the address of that.
In the end you are proposing that Go should allow you to take the address of any expression, for example:
i,j := 1,2
var p *int = &(i+j)
println(*p)
The current Go compiler prints the error: cannot take the address of i + j
In my opinion, allowing the programmer to take the address of any expression:
Doesn't seem to be very useful (that is: it seems to have very small probability of occurrence in actual Go programs).
It would complicate the compiler and the language spec.
It seems counterproductive to complicate the compiler and the spec for little gain.
I recently was tied up in knots about something similar.
First talking about strings in your example is a distraction, use a struct instead, re-writing it to something like:
func a() MyStruct {
/* doesn't matter */
}
var b *MyStruct = &a()
This won't compile because you can't take the address of a(). So do this:
func a() MyStruct {
/* doesn't matter */
}
tmpA := a()
var b *MyStruct = &tmpA
This will compile, but you've returned a MyStruct on the stack, allocated sufficient space on the heap to store a MyStruct, then copied the contents from the stack to the heap. If you want to avoid this, then write it like this:
func a2() *MyStruct {
/* doesn't matter as long as MyStruct is created on the heap (e.g. use 'new') */
}
var a *MyStruct = a2()
Copying is normally inexpensive, but those structs might be big. Even worse when you want to modify the struct and have it 'stick' you can't be copying then modifying the copies.
Anyway, it gets all the more fun when you're using a return type of interface{}. The interface{} can be the struct or a pointer to a struct. The same copying issue comes up.
You can't get the reference of the result directly when assigning to a new variable, but you have idiomatic way to do this without the use of a temporary variable (it's useless) by simply pre-declaring your "b" pointer - this is the real step you missed:
func a() string {
return "doesn't matter"
}
b := new(string) // b is a pointer to a blank string (the "zeroed" value)
*b = a() // b is now a pointer to the result of `a()`
*b is used to dereference the pointer and directly access the memory area which hold your data (on the heap, of course).
Play with the code: https://play.golang.org/p/VDhycPwRjK9
Yeah, it can be annoying when APIs require the use of *string inputs even though you’ll often want to pass literal strings to them.
For this I make a very tiny function:
// Return pointer version of string
func p(s string) *string {
return &s
}
and then instead of trying to call foo("hi") and getting the dreaded cannot use "hi" (type string) as type *string in argument to foo, I just wrap the argument in a call to to p():
foo(p("hi"))
a() doesn't point to a variable as it is on the stack. You can't point to the stack (why would you ?).
You can do that if you want
va := a()
b := &va
But what your really want to achieve is somewhat unclear.
At the time of writing this, none of the answers really explain the rationale for why this is the case.
Consider the following:
func main() {
m := map[int]int{}
val := 1
m[0] = val
v := &m[0] // won't compile, but let's assume it does
delete(m, 0)
fmt.Println(v)
}
If this code snippet actually compiled, what would v point to!? It's a dangling pointer since the underlying object has been deleted.
Given this, it seems like a reasonable restriction to disallow addressing temporaries
guess you need help from More effective Cpp ;-)
Temp obj and rvalue
“True temporary objects in C++ are invisible - they don't appear in your source code. They arise whenever a non-heap object is created but not named. Such unnamed objects usually arise in one of two situations: when implicit type conversions are applied to make function calls succeed and when functions return objects.”
And from Primer Plus
lvalue is a data object that can be referenced by address through user (named object). Non-lvalues include literal constants (aside from the quoted strings, which are represented by their addresses), expressions with multiple terms, such as (a + b).
In Go lang, string literal will be converted into StrucType object, which will be a non-addressable temp struct object. In this case, string literal cannot be referenced by address in Go.
Well, the last but not the least, one exception in go, you can take the address of the composite literal. OMG, what a mess.
What's the difference? Is map[T]bool optimized to map[T]struct{}? Which is the best practice in Go?
Perhaps the best reason to use map[T]struct{} is that you don't have to answer the question "what does it mean if the value is false"?
From "The Go Programming Language":
The struct type with no fields is called the empty struct, written
struct{}. It has size zero and carries no information but may be
useful nonetheless. Some Go programmers use it instead of bool as the
value type of a map that represents a set, to emphasize that only the
keys are significant, but the space saving is marginal and the syntax
more cumbersome, so we generally avoid it.
If you use bool testing for presence in the "set" is slightly nicer since you can just say:
if mySet["something"] {
/* .. */
}
Difference is in memory requirements. Under the bonnet empty struct is not a pointer but a special value to save memory.
An empty struct is a struct type like any other. All the properties you are used to with normal structs apply equally to the empty struct. You can declare an array of structs{}s, but they of course consume no storage.
var x [100]struct{}
fmt.Println(unsafe.Sizeof(x)) // prints 0
If empty structs hold no data, it is not possible to determine if two struct{} values are different.
Considering the above statements it means that we may use them as method receivers.
type S struct{}
func (s *S) addr() { fmt.Printf("%p\n", s) }
func main() {
var a, b S
a.addr() // 0x1beeb0
b.addr() // 0x1beeb0
}
I'm creating a priority queue using Go's heap package. There is an example of one in the documentation.
The queue I'm creating needs to be based around a struct rather than a slice because it requires other properties like a mutex.
type PQueue struct {
queue []*Item
sync.Mutex
}
I implement all the methods that heap.Interface requires.
The issue is that my PQueue.Push method seems not to be permanently adding a value to PQueue.queue.
func (p PQueue) Push(x interface{}) {
p.Lock()
defer p.Unlock()
item := x.(*Item)
item.place = len(p.queue) // the index of an item in the queue
p.queue = append(p.queue, item)
// len(p.queue) does increase
// after the functions exits, the queues length has not increased
}
If I print the length of p.queue at the end of this function, the length has increased. After the functions exits however, it seems the original struct does not get updated.
I think it might be happening because of func (p PQueue) not being a pointer. Why might that be? Is there a way to fix it? If I were to use func (p *PQeueue) Push(x interface{}) instead, I would need to implement my own heap because heap.Interface specifically requires no pointer. Is that my only option?
The problem is that you are appending to a copy of your slice. Thus the change shows within the function, but is lost once you return from the function.
In this blog article from the section Passing slices to functions:
It's important to understand that even though a slice contains a
pointer, it is itself a value. Under the covers, it is a struct value
holding a pointer and a length. It is not a pointer to a struct.
With append you are modifying the slice header. And
Thus if we want to write a function that modifies the header, we must
return it as a result parameter
Or:
Another way to have a function modify the slice header is to pass a
pointer to it.
As a result you need to pass a pointer if you want to modify it with append. Simply change the method to use a pointer receiver. And for that to work you need to call init with a pointer like heap.Init(&pq) as shown in the example that you linked to which does just that and also uses pointer receivers.
From the spec on Method Sets:
The method set of the corresponding pointer type *T is the set of all methods
declared with receiver *T or T (that is, it also contains the method
set of T).
So using a pointer type will work with value and pointer receivers and still implement the interface.
You are right about the problem being related to the receiver of your Push method: the method will receive a copy of the PQueue, so any changes made to the struct will not persist.
Changing the method to use a pointer as a receiver is the correct change, but this also means that PQueue no longer implements heap.Interface. This is due to the fact that Go does not let you take a pointer to the value stored inside an interface variable, so the automatic translation of q.Push() to (&q).Push() does not occur.
This isn't a dead end though, since *PQueue should still implement the heap.Interface. So if you were previously calling heap.Init(q), just change it to heap.Init(&q).
I think it might be happening because of func (p PQueue) not being a pointer
That's right. Quoting Effective Go:
invoking [the method] on a value would cause the method to receive a
copy of the value, so any modifications would be discarded.
You say:
heap.Interface specifically requires no pointer
I'm confused, the example you point to is, in fact, using a pointer:
func (pq *PriorityQueue) Push(x interface{}) {
n := len(*pq)
item := x.(*Item)
item.index = n
*pq = append(*pq, item)
}
Maybe something else is going on?
type MyMap struct {
data map[int]int
}
func (m Mymap)foo(){
//insert or read from m.data
}
...
go func f (m *Mymap){
for {
//insert into m.data
}
}()
...
Var m Mymap
m.foo()
When I call m.foo(), as we know , there is a copy of "m",value copy ,which is done by compiler 。 My question is , is there a race in the procedure? It is some kind of reading data from the var "m", I mean , you may need a read lock in case someone is inserting values into m.data when you are copying something from m.data.
If it is thread-safe , is it guarenteed by compiler?
This is not safe, and there is no implied safe concurrent access in the language. All concurrent data access is unsafe, and needs to be protected with channels or locks.
Because maps internally contain references to the data they contain, even as the outer structure is copied the map still points to the same data. A concurrent map is often a common requirement, and all you need to do is add a mutex to protect the reads and writes. Though a Mutex pointer would work with your value receiver, it's more idiomatic to use a pointer receiver for mutating methods.
type MyMap struct {
sync.Mutex
data map[int]int
}
func (m *MyMap) foo() {
m.Lock()
defer m.Unlock()
//insert or read from m.data
}
The go memory model is very explicit, and races are generally very easy to reason about. When in doubt, always run your program or tests with -race.