Thread Safe In Value Receiver In Go - go

type MyMap struct {
data map[int]int
}
func (m Mymap)foo(){
//insert or read from m.data
}
...
go func f (m *Mymap){
for {
//insert into m.data
}
}()
...
Var m Mymap
m.foo()
When I call m.foo(), as we know , there is a copy of "m",value copy ,which is done by compiler 。 My question is , is there a race in the procedure? It is some kind of reading data from the var "m", I mean , you may need a read lock in case someone is inserting values into m.data when you are copying something from m.data.
If it is thread-safe , is it guarenteed by compiler?

This is not safe, and there is no implied safe concurrent access in the language. All concurrent data access is unsafe, and needs to be protected with channels or locks.
Because maps internally contain references to the data they contain, even as the outer structure is copied the map still points to the same data. A concurrent map is often a common requirement, and all you need to do is add a mutex to protect the reads and writes. Though a Mutex pointer would work with your value receiver, it's more idiomatic to use a pointer receiver for mutating methods.
type MyMap struct {
sync.Mutex
data map[int]int
}
func (m *MyMap) foo() {
m.Lock()
defer m.Unlock()
//insert or read from m.data
}
The go memory model is very explicit, and races are generally very easy to reason about. When in doubt, always run your program or tests with -race.

Related

Lock slice before reading and modifying it

My experience working with Go is recent and in reviewing some code, I have seen that while it is write-protected, there is a problem with reading the data. Not with the reading itself, but with possible modifications that can occur between the reading and the modification of the slice.
type ConcurrentSlice struct {
sync.RWMutex
items []Item
}
type Item struct {
Index int
Value Info
}
type Info struct {
Name string
Labels map[string]string
Failure bool
}
As mentioned, the writing is protected in this way:
func (cs *ConcurrentSlice) UpdateOrAppend(item ScalingInfo) {
found := false
i := 0
for inList := range cs.Iter() {
if item.Name == inList.Value.Name{
cs.items[i] = item
found = true
}
i++
}
if !found {
cs.Lock()
defer cs.Unlock()
cs.items = append(cs.items, item)
}
}
func (cs *ConcurrentSlice) Iter() <-chan ConcurrentSliceItem {
c := make(chan ConcurrentSliceItem)
f := func() {
cs.Lock()
defer cs.Unlock()
for index, value := range cs.items {
c <- ConcurrentSliceItem{index, value}
}
close(c)
}
go f()
return c
}
But between collecting the content of the slice and modifying it, modifications can occur.It may be that another routine modifies the same slice and when it is time to assign a value, it no longer exists: slice[i] = item
What would be the right way to deal with this?
I have implemented this method:
func GetList() *ConcurrentSlice {
if list == nil {
denylist = NewConcurrentSlice()
return denylist
}
return denylist
}
And I use it like this:
concurrentSlice := GetList()
concurrentSlice.UpdateOrAppend(item)
But I understand that between the get and the modification, even if it is practically immediate, another routine may have modified the slice. What would be the correct way to perform the two operations atomically? That the slice I read is 100% the one I modify. Because if I try to assign an item to a index that no longer exists, it will break the execution.
Thank you in advance!
The way you are doing the blocking is incorrect, because it does not ensure that the items you return have not been removed. In case of an update, the array would still be at least the same length.
A simpler solution that works could be the following:
func (cs *ConcurrentSlice) UpdateOrAppend(item ScalingInfo) {
found := false
i := 0
cs.Lock()
defer cs.Unlock()
for _, it := range cs.items {
if item.Name == it.Name{
cs.items[i] = it
found = true
}
i++
}
if !found {
cs.items = append(cs.items, item)
}
}
Use a sync.Map if the order of the values is not important.
type Items struct {
m sync.Map
}
func (items *Items) Update(item Info) {
items.m.Store(item.Name, item)
}
func (items *Items) Range(f func(Info) bool) {
items.m.Range(func(key, value any) bool {
return f(value.(Info))
})
}
Data structures 101: always pick the best data structure for your use case. If you’re going to be looking up objects by name, that’s EXACTLY what map is for. If you still need to maintain the order of the items, you use a treemap
Concurrency 101: like transactions, your mutex should be atomic, consistent, and isolated. You’re failing isolation here because the data structure read does not fall inside your mutex lock.
Your code should look something like this:
func {
mutex.lock
defer mutex.unlock
check map or treemap for name
if exists update
else add
}
After some tests, I can say that the situation you fear can indeed happen with sync.RWMutex. I think it could happen with sync.Mutex too, but I can't reproduce that. Maybe I'm missing some informations, or maybe the calls are in order because they all are blocked and the order they redeem the right to lock is ordered in some way.
One way to keep your two calls safe without other routines getting in 'conflict' would be to use an other mutex, for every task on that object. You would lock that mutex before your read and write, and release it when you're done. You would also have to use that mutex on any other call that write (or read) to that object. You can find an implementation of what I'm talking about here in the main.go file. In order to reproduce the issue with RWMutex, you can simply comment the startTask and the endTask calls and the issue is visible in the terminal output.
EDIT : my first answer was wrong as I misinterpreted a test result, and fell in the situation described by OP.
tl;dr;
If ConcurrentSlice is to be used from a single goroutine, the locks are unnecessary, because the way algorithm written there is not going to be any concurrent read/writes to slice elements, or the slice.
If ConcurrentSlice is to be used from multiple goroutines, existings locks are not sufficient. This is because UpdateOrAppend may modify slice elements concurrently.
A safe version woule need two versions of Iter:
This can be called by users of ConcurrentSlice, but it cannot be called from `UpdateOrAppend:
func (cs *ConcurrentSlice) Iter() <-chan ConcurrentSliceItem {
c := make(chan ConcurrentSliceItem)
f := func() {
cs.RLock()
defer cs.RUnlock()
for index, value := range cs.items {
c <- ConcurrentSliceItem{index, value}
}
close(c)
}
go f()
return c
}
and this is only to be called from UpdateOrAppend:
func (cs *ConcurrentSlice) internalIter() <-chan ConcurrentSliceItem {
c := make(chan ConcurrentSliceItem)
f := func() {
// No locking
for index, value := range cs.items {
c <- ConcurrentSliceItem{index, value}
}
close(c)
}
go f()
return c
}
And UpdateOrAppend should be synchronized at the top level:
func (cs *ConcurrentSlice) UpdateOrAppend(item ScalingInfo) {
cs.Lock()
defer cs.Unlock()
....
}
Here's the long version:
This is an interesting piece of code. Based on my understanding of the go memory model, the mutex lock in Iter() is only necessary if there is another goroutine working on this code, and even with that, there is a possible race in the code. However, UpdateOrAppend only modifies elements of the slice with lower indexes than what Iter is working on, so that race never manifests itself.
The race can happen as follows:
The for-loop in iter reads element 0 of the slice
The element is sent through the channel. Thus, the slice receive happens after the first step.
The receiving end potentially updates element 0 of the slice. There is no problem up to here.
Then the sending goroutine reads element 1 of the slice. This is when a race can happen. If step 3 updated index 1 of the slice, the read at step 4 is a race. That is: if step 3 reads the update done by step 4, it is a race. You can see this if you start with i:=1 in UpdateOrAppend, and running it with the -race flag.
But UpdateOrAppend always modifies slice elements that are already seen by Iter when i=0, so this code is safe, even without the lock.
If there will be other goroutines accessing and modifying the structure, you need the Mutex, but you need it to protect the complete UpdateOrAppend method, because only one goroutine should be allowed to run that. You need the mutex to protect the potential updates in the first for-loop, and that mutex has to also include the slice append case, because that may actually modify the slice of the underlying object.
If Iter is only called from UpdateOrAppend, then this single mutex should be sufficient. If however Iter can be called from multiple goroutines, then there is another race possibility. If one UpdateOrAppend is running concurrently with multiple Iter instances, then some of those Iter instances will read from the modified slice elements concurrently, causing a race. So, it should be such that multiple Iters can only run if there are no UpdateOrAppend calls. That is a RWMutex.
But Iter can be called from UpdateOrAppend with a lock, so it cannot really call RLock, otherwise it is a deadlock.
Thus, you need two versions of Iter: one that can be called outside UpdateOrAppend, and that issues RLock in the goroutine, and another that can only be called from UpdateOrAppend and does not call RLock.

Using a mutex within a struct in Go

I see in Essential Go that using a mutex within a struct is not too straight-forward. To quote from the Mutex Gotchas page:
Don’t copy mutexes
A copy of sync.Mutex variable starts with the same state as original
mutex but it is not the same mutex.
It’s almost always a mistake to copy a sync.Mutex e.g. by passing it
to another function or embedding it in a struct and making a copy of
that struct.
If you want to share a mutex variable, pass it as a pointer
*sync.Mutex.
I'm not quite sure I fully understand exactly what's written. I looked here but still wasn't totally clear.
Taking the Essential Go example of a Set, should I be using the mutex like this:
type StringSet struct {
m map[string]struct{}
mu sync.RWMutex
}
or like this?
type StringSet struct {
m map[string]struct{}
mu *sync.RWMutex
}
I tried both with the Delete() function within the example and they both work in the Playground.
// Delete removes a string from the set
func (s *StringSet) Delete(str string) {
s.mu.Lock()
defer s.mu.Unlock()
delete(s.m, str)
}
There will obviously be several instances of a 'Set', and hence each instance should have its own mutex. In such a case, is it preferable to use the mutex or a pointer to the mutex?
Use the first method (a plain Mutex, not a pointer to a mutex), and pass around a *StringSet (pointer to your struct), not a plain StringSet.
In the code you shared in your playground (that version) :
.Add(), .Exists() and .Strings() should acquire the lock,
otherwise your code fits a regular use of structs and mutexes in go.
The "Don't copy mutexes" gotcha would apply if you manipulated plain StringSet structs :
var setA StringSet
setA.Add("foo")
setA.Add("bar")
func buggyFunction(s StringSet) {
...
}
// the gotcha would occur here :
var setB = setA
// or here :
buggyFunction(setA)
In both cases above : you would create a copy of the complete struct
so setB, for example, would manipulate the same underlying map[string]struct{} mapping as setA, but the mutex would not be shared : calling setA.m.Lock() wouldn't prevent modifying the mapping from setB.
If you go the first way, i.e.
type StringSet struct {
m map[string]struct{}
mu sync.RWMutex
}
any accidental or intentional assignment/copy of a struct value will create a new mutex, but the underlying map m will be the same (because maps are essentially pointers). As a result, it will be possible to modify/access the map concurrently without locking. Of course, if you strictly follow a rule "thou shalt not copy a set", that won't happen, but it doesn't make much sense to me.
TL;DR: definitely the second way.

How to implement a singleton

I want to implement a singleton with Go. The difference between normal singleton is the instance is singleton with different key in map struct. Something like this code. I am not sure is there any data race with the demo code.
var instanceLock sync.Mutex
var instances map[string]string
func getDemoInstance(key string) string {
if value, ok := instances[key]; ok {
return value
}
instanceLock.Lock()
defer instanceLock.Unlock()
if value, ok := instances[key]; ok {
return value
} else {
instances[key] = key + key
return key + key
}
}
Yes, there is data race, you can confirm by running it with go run -race main.go. If one goroutine locks and modifies the map, another goroutine may be reading it before the lock.
You may use sync.RWMutex to read-lock when just reading it (multiple readers are allowed without blocking each other).
For example:
var (
instancesMU sync.RWMutex
instances = map[string]string{}
)
func getDemoInstance(key string) string {
instancesMU.RLock()
if value, ok := instances[key]; ok {
instancesMU.RUnlock()
return value
}
instancesMU.RUnlock()
instancesMU.Lock()
defer instancesMU.Unlock()
if value, ok := instances[key]; ok {
return value
}
value := key + key
instances[key] = value
return value
}
You can try this as well: sync.Map
Map is like a Go map[interface{}]interface{} but is safe for concurrent use by multiple goroutines without additional locking or coordination. Loads, stores, and deletes run in amortized constant time.
The Map type is optimized for two common use cases: (1) when the entry for a given key is only ever written once but read many times, as in caches that only grow, or (2) when multiple goroutines read, write, and overwrite entries for disjoint sets of keys.
In these two cases, use of a Map may significantly reduce lock contention compared to a Go map paired with a separate Mutex or RWMutex.
Note: In the third paragraph it mentions why using sync.Map is beneficial rather than using Go Map simply paired up with sync.RWMutex.
So this perfectly fits your case, I guess?
Little late to answer, anyways this should help: https://github.com/ashwinshirva/api/tree/master/dp/singleton
This shows two ways to implement singleton:
Using the sync.Mutex
Using sync.Once

Concurrent read/write of a map var snapshot

I encounter a situation that I can not understand. In my code, I use functions have the need to read a map (but not write, only loop through a snapshot of existing datas in this map). There is my code :
type MyStruct struct {
*sync.RWMutex
MyMap map[int]MyDatas
}
var MapVar = MyStruct{ &sync.RWMutex{}, make(map[int]MyDatas) }
func MyFunc() {
MapVar.Lock()
MapSnapshot := MapVar.MyMap
MapVar.Unlock()
for _, a := range MapSnapshot { // Map concurrent write/read occur here
//Some stuff
}
}
main() {
go MyFunc()
}
The function "MyFunc" is run in a go routine, only once, there is no multiple runs of this func. Many other functions are accessing to the same "MapVar" with the same method and it randomly produce a "map concurrent write/read". I hope someone will explain to me why my code is wrong.
Thank you for your time.
edit: To clarify, I am just asking why my range MapSnapshot produce a concurrent map write/read. I cant understand how this map can be concurrently used since I save the real global var (MapVar) in a local var (MapSnapshot) using a sync mutex.
edit: Solved. To copy the content of a map in a new variable without using the same reference (and so to avoid map concurrent read/write), I must loop through it and write each index and content to a new map with a for loop.
Thanks xpare and nilsocket.
there is no multiple runs of this func. Many other functions are accessing to the same "MapVar" with the same method and it randomly produce a "map concurrent write/read"
When you pass the value of MapVar.MyMap to MapSnapshot, the Map concurrent write/read will never be occur, because the operation is wrapped with mutex.
But on the loop, the error could happen since practically reading process is happening during loop. So better to wrap the loop with mutex as well.
MapVar.Lock() // lock begin
MapSnapshot := MapVar.MyMap
for _, a := range MapSnapshot {
// Map concurrent write/read occur here
// Some stuff
}
MapVar.Unlock() // lock end
UPDATE 1
Here is my response to your argument below:
This for loop takes a lot of time, there is many stuff in this loop, so locking will slow down other routines
As per your statement The function "MyFunc" is run in a go routine, only once, there is no multiple runs of this func, then I think making the MyFunc to be executed as goroutine is not a good choice.
And to increase the performance, better to make the process inside the loop to be executed in a goroutine.
func MyFunc() {
for _, a := range MapVar.MyMap {
go func(a MyDatas) {
// do stuff here
}(a)
}
}
main() {
MyFunc() // remove the go keyword
}
UPDATE 2
If you really want to copy the MapVar.MyMap into another object, passing it to another variable will not solve that (map is different type compared to int, float32 or other primitive type).
Please refer to this thread How to copy a map?

Settings and accessing a pointer from concurrent goroutines

I have a map which is used by goroutine A and replaced once in a time in goroutine B. By replacement I mean:
var a map[T]N
// uses the map
func goroutineA() {
for (...) {
tempA = a
..uses tempA in some way...
}
}
//refreshes the map
func gorountineB() {
for (...) {
time.Sleep(10 * time.Seconds)
otherTempA = make(map[T]N)
...initializes other tempA....
a = otherTempA
}
}
Do you see any problem in this pseudo code? (in terms of concurrecy)
The code isn't safe, since assignments and reads to a pointer value are not guaranteed to be atomic. This can mean that as one goroutine writes the new pointer value, the other may see a mix of bytes from the old and new value, which will cause your program to die in a nasty way. Another thing that may happen is that since there's no synchronisation in your code, the compiler may notice that nothing can change a in goroutineA, and lift the tempA := a statement out of the loop. This will mean that you'll never see new map assignments as the other goroutine updates them.
You can use go test -race to find these sorts of problems automatically.
One solution is to lock all access to the map with a mutex.
You may wish to read the Go Memory Model document, which explains clearly when changes to variables are visible inside goroutines.
When unsure about data races, run go run -race file.go, that being said, yes there will be a race.
The easiest way to fix that is using a sync.RWMutex :
var a map[T]N
var lk sync.RWMutex
// uses the map
func goroutineA() {
for (...) {
lk.RLock()
//actions on a
lk.RUnlock()
}
}
//refreshes the map
func gorountineB() {
for (...) {
otherTempA = make(map[T]N)
//...initializes other tempA....
lk.Lock()
a = otherTempA
lk.Unlock()
}
}

Resources