Why does Go panic with 'concurrent map writes' even with locks in place?

When trying to use this struct with multiple goroutines sometimes I get one of these errors:
fatal error: concurrent map read and map write
or
concurrent map writes
After reading this thread I made sure to return a pointer in the constructor and to pass a pointer to the receivers.
The entirety of the code where this is being used is in this GitHub repo.
type concurrentStorage struct {
    sync.Mutex
    domain string
    urls   map[url.URL]bool
}

func newConcurrentStorage(d string) *concurrentStorage {
    return &concurrentStorage{
        domain: d,
        urls:   map[url.URL]bool{},
    }
}

func (c *concurrentStorage) add(u url.URL) bool {
    c.Lock()
    defer c.Unlock()
    if _, ok := c.urls[u]; ok {
        return false
    }
    c.urls[u] = true
    return true
}

Upon reading the code on GitHub that you linked to, the crawl() function accepts a concurrentStorage (not a pointer).
For each dereference (i.e. *urlSet) when calling crawl(), you are copying the concurrentStorage struct (including the sync.Mutex), while the map inside the copy still points at the original map data. This means each goroutine locks its own copy of the mutex while all of them share the same underlying map.
If you change crawl() to accept a pointer instead, and stop dereferencing concurrentStorage, it will work as you intend.
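A minimal sketch of that change, assuming a crawl function roughly along these lines (the exact signature in the repo may differ):

// Before: crawl(s concurrentStorage, ...) copied the struct, so each
// goroutine locked its own copy of the mutex while sharing one map.
func crawl(s *concurrentStorage, u url.URL) {
    if s.add(u) { // add() locks the one shared mutex
        // fetch u, parse it for links, enqueue them, etc.
    }
}

Callers then pass the pointer returned by newConcurrentStorage as-is (for example go crawl(urlSet, u)) instead of dereferencing it.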

Related

Lock slice before reading and modifying it

My experience with Go is recent, and while reviewing some code I have seen that although writes are protected, there is a problem with reads. Not with the reading itself, but with the modifications that can occur between reading the slice and modifying it.
type ConcurrentSlice struct {
    sync.RWMutex
    items []Item
}

type Item struct {
    Index int
    Value Info
}

type Info struct {
    Name    string
    Labels  map[string]string
    Failure bool
}
As mentioned, the writing is protected in this way:
func (cs *ConcurrentSlice) UpdateOrAppend(item ScalingInfo) {
    found := false
    i := 0
    for inList := range cs.Iter() {
        if item.Name == inList.Value.Name {
            cs.items[i] = item
            found = true
        }
        i++
    }
    if !found {
        cs.Lock()
        defer cs.Unlock()
        cs.items = append(cs.items, item)
    }
}
func (cs *ConcurrentSlice) Iter() <-chan ConcurrentSliceItem {
    c := make(chan ConcurrentSliceItem)
    f := func() {
        cs.Lock()
        defer cs.Unlock()
        for index, value := range cs.items {
            c <- ConcurrentSliceItem{index, value}
        }
        close(c)
    }
    go f()
    return c
}
But between collecting the content of the slice and modifying it, other modifications can occur. Another goroutine may modify the same slice, and by the time a value is assigned, the index may no longer exist: slice[i] = item
What would be the right way to deal with this?
I have implemented this method:
func GetList() *ConcurrentSlice {
    if denylist == nil {
        denylist = NewConcurrentSlice()
    }
    return denylist
}
And I use it like this:
concurrentSlice := GetList()
concurrentSlice.UpdateOrAppend(item)
But I understand that between the get and the modification, even if it is practically immediate, another goroutine may have modified the slice. What would be the correct way to perform the two operations atomically, so that the slice I read is exactly the one I modify? If I try to assign an item to an index that no longer exists, it will break the execution.
Thank you in advance!
The way you are doing the locking is incorrect, because it does not ensure that the items you return have not been removed. In the case of an update, the slice would still be at least the same length.
A simpler solution that works could be the following:
func (cs *ConcurrentSlice) UpdateOrAppend(item ScalingInfo) {
    cs.Lock()
    defer cs.Unlock()
    found := false
    for i, it := range cs.items {
        if item.Name == it.Name {
            cs.items[i] = item
            found = true
        }
    }
    if !found {
        cs.items = append(cs.items, item)
    }
}
Use a sync.Map if the order of the values is not important.
type Items struct {
    m sync.Map
}

func (items *Items) Update(item Info) {
    items.m.Store(item.Name, item)
}

func (items *Items) Range(f func(Info) bool) {
    items.m.Range(func(key, value any) bool {
        return f(value.(Info))
    })
}
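For illustration, a hypothetical caller could exercise this wrapper like so (the item names are made up):

func exampleUsage() {
    items := &Items{} // the zero value of sync.Map is ready to use
    items.Update(Info{Name: "api"})
    items.Update(Info{Name: "db", Failure: true})

    // print every failing entry; sync.Map iteration order is unspecified
    items.Range(func(in Info) bool {
        if in.Failure {
            fmt.Println(in.Name, "is failing")
        }
        return true // keep iterating
    })
}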
Data structures 101: always pick the best data structure for your use case. If you're going to be looking up objects by name, that's EXACTLY what a map is for. If you still need to maintain the order of the items, use a treemap.
Concurrency 101: like transactions, your critical section should be atomic, consistent, and isolated. You're failing isolation here because the data-structure read does not fall inside your mutex lock.
Your code should look something like this:
func {
mutex.lock
defer mutex.unlock
check map or treemap for name
if exists update
else add
}
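A concrete sketch of that outline, assuming a plain map guarded by a mutex (ItemStore, mu, and byName are made-up names for illustration):

type ItemStore struct {
    mu     sync.Mutex
    byName map[string]Info
}

func NewItemStore() *ItemStore {
    return &ItemStore{byName: make(map[string]Info)}
}

func (s *ItemStore) UpdateOrAppend(item Info) {
    s.mu.Lock()
    defer s.mu.Unlock()
    // a map assignment updates an existing entry or adds a new one,
    // and the whole read-modify-write happens inside the lock
    s.byName[item.Name] = item
}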
After some tests, I can say that the situation you fear can indeed happen with sync.RWMutex. I think it could happen with sync.Mutex too, but I can't reproduce it. Maybe I'm missing some information, or maybe the calls happen in order because they all block and acquire the lock in some consistent order.
One way to keep your two calls safe without other goroutines getting in conflict would be to use another mutex for every task on that object. You would lock that mutex before your read and write, and release it when you're done. You would also have to use that mutex on any other call that writes to (or reads from) that object. You can find an implementation of what I'm talking about here in the main.go file. To reproduce the issue with RWMutex, simply comment out the startTask and endTask calls, and the issue becomes visible in the terminal output.
EDIT: my first answer was wrong, as I misinterpreted a test result and fell into the situation described by the OP.
tl;dr:
If ConcurrentSlice is to be used from a single goroutine, the locks are unnecessary, because the way the algorithm is written there cannot be any concurrent reads/writes to slice elements or to the slice itself.
If ConcurrentSlice is to be used from multiple goroutines, the existing locks are not sufficient. This is because UpdateOrAppend may modify slice elements concurrently.
A safe version would need two versions of Iter:
This one can be called by users of ConcurrentSlice, but it cannot be called from UpdateOrAppend:
func (cs *ConcurrentSlice) Iter() <-chan ConcurrentSliceItem {
    c := make(chan ConcurrentSliceItem)
    f := func() {
        cs.RLock()
        defer cs.RUnlock()
        for index, value := range cs.items {
            c <- ConcurrentSliceItem{index, value}
        }
        close(c)
    }
    go f()
    return c
}
and this is only to be called from UpdateOrAppend:
func (cs *ConcurrentSlice) internalIter() <-chan ConcurrentSliceItem {
    c := make(chan ConcurrentSliceItem)
    f := func() {
        // no locking: the caller (UpdateOrAppend) already holds the lock
        for index, value := range cs.items {
            c <- ConcurrentSliceItem{index, value}
        }
        close(c)
    }
    go f()
    return c
}
And UpdateOrAppend should be synchronized at the top level:
func (cs *ConcurrentSlice) UpdateOrAppend(item ScalingInfo) {
    cs.Lock()
    defer cs.Unlock()
    // ...
}
Here's the long version:
This is an interesting piece of code. Based on my understanding of the Go memory model, the mutex lock in Iter() is only necessary if there is another goroutine working on this code, and even with it there is a possible race in the code. However, UpdateOrAppend only modifies elements of the slice at lower indexes than the one Iter is working on, so that race never manifests itself.
The race can happen as follows:
1. The for-loop in Iter reads element 0 of the slice.
2. The element is sent through the channel. Thus, the channel receive happens after the first step.
3. The receiving end potentially updates element 0 of the slice. There is no problem up to here.
4. Then the sending goroutine reads element 1 of the slice. This is when a race can happen: if step 3 updated index 1 of the slice, the read at step 4 is a race. That is, if step 4 reads the update done by step 3, it is a race. You can see this if you start with i := 1 in UpdateOrAppend and run with the -race flag.
But UpdateOrAppend always modifies slice elements that Iter has already seen when i = 0, so this code is safe, even without the lock.
If there will be other goroutines accessing and modifying the structure, you need the Mutex, and you need it to protect the complete UpdateOrAppend method, because only one goroutine should be allowed to run it at a time. The mutex must cover the potential updates in the first for-loop, and it must also cover the append case, because that may actually modify the slice header of the underlying object.
If Iter is only called from UpdateOrAppend, then this single mutex is sufficient. If, however, Iter can be called from multiple goroutines, then there is another race possibility: if one UpdateOrAppend runs concurrently with multiple Iter instances, some of those Iter instances will read from the modified slice elements concurrently, causing a race. So multiple Iters should be able to run only when there are no UpdateOrAppend calls in flight. That calls for an RWMutex.
But Iter can be called from UpdateOrAppend while the lock is held, so it cannot call RLock itself, otherwise it would deadlock.
Thus, you need two versions of Iter: one that can be called outside UpdateOrAppend and issues RLock in the goroutine, and another that can only be called from UpdateOrAppend and does not call RLock.

Calling Functions Inside a "LockOSThread" Goroutine

I'm writing a package to control a Canon DSLR using their EDSDK DLL from Go.
This is a personal project for a photo booth to use at our wedding, at my partner's request, which I'll be happy to post on GitHub when complete :).
Looking at examples of using the SDK elsewhere, it isn't thread-safe and uses thread-local resources, so I'll need to make sure I'm calling it from a single thread during usage. While not ideal, Go provides the runtime.LockOSThread function for doing just that, although this also gets called by the core DLL interop code itself, so I'll have to wait and find out whether that interferes or not.
I want the rest of the application to be able to call the SDK through a higher-level interface without worrying about the threading, so I need a way to pass function-call requests to the locked thread/goroutine, execute them there, and pass the results back to the calling function outside that goroutine.
So far, I've come up with this working example that uses very broad function definitions with []interface{} slices, passing them back and forth via channels. This would require a lot of mangling of input/output data on every call, to do type assertions back out of the interface{} slice even when we know what to expect for each function ahead of time, but it looks like it will work.
Before I invest a lot of time in what is possibly the worst way to do this - does anyone have any better options?
package edsdk

import (
    "fmt"
    "runtime"
)

type CanonSDK struct {
    FChan chan functionCall
}

type functionCall struct {
    Function  func([]interface{}) []interface{}
    Arguments []interface{}
    Return    chan []interface{}
}

func NewCanonSDK() (*CanonSDK, error) {
    c := &CanonSDK{
        FChan: make(chan functionCall),
    }
    go c.BackgroundThread(c.FChan)
    return c, nil
}

func (c *CanonSDK) BackgroundThread(fcalls <-chan functionCall) {
    runtime.LockOSThread()
    for f := range fcalls {
        f.Return <- f.Function(f.Arguments)
    }
    runtime.UnlockOSThread()
}

func (c *CanonSDK) TestCall() {
    ret := make(chan []interface{})
    f := functionCall{
        Function:  c.DoTestCall,
        Arguments: []interface{}{},
        Return:    ret,
    }
    c.FChan <- f
    results := <-ret
    close(ret)
    fmt.Printf("%#v", results)
}

func (c *CanonSDK) DoTestCall([]interface{}) []interface{} {
    return []interface{}{"Test", nil}
}
For similar embedded projects I've played with, I tend to create a single worker goroutine that listens on a channel and performs all the work against that USB device, with any results sent back out on another channel.
Talk to the device only through channels in a one-way exchange, and listen for responses on the other channel.
Since USB is serial and polled, I had to set up a dedicated channel with another goroutine that just picks items off the channel as they are pushed into it by the looping worker goroutine.
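A related pattern that avoids the []interface{} mangling, sketched here as an assumption rather than anything EDSDK-specific, is to send closures to the locked goroutine; each closure captures its typed arguments and delivers a typed result:

package edsdk

import "runtime"

// CanonSDK runs every submitted function on a single OS thread.
type CanonSDK struct {
    calls chan func()
}

func NewCanonSDK() *CanonSDK {
    c := &CanonSDK{calls: make(chan func())}
    go func() {
        runtime.LockOSThread() // all SDK calls happen on this one thread
        for f := range c.calls {
            f()
        }
    }()
    return c
}

// TakePicture is a hypothetical typed wrapper: the closure captures typed
// inputs and outputs, so no type assertions are needed at the call site.
func (c *CanonSDK) TakePicture(path string) error {
    done := make(chan error, 1)
    c.calls <- func() {
        // call the thread-bound SDK functions here
        done <- nil
    }
    return <-done
}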

Function returns lock by value

I have the following structure
type Groups struct {
    sync.Mutex
    Names []string
}
and the following function
func NewGroups(names ...string) (Groups, error) {
    // ...
    return groups, nil
}
When I check for semantic errors with go vet, I get this warning:
NewGroups returns Lock by value: Groups
Since go vet is complaining, this can't be good. What problems can this code cause, and how can I fix it?
You need to embed the sync.Mutex as a pointer:
type Groups struct {
    *sync.Mutex
    Names []string
}
Addressing your comment on your question: in the article http://blog.golang.org/go-maps-in-action, notice that Gerrand is not returning the struct from a function but is using it right away; that is why he isn't using a pointer. In your case you are returning it, so you need a pointer so as not to make a copy of the Mutex.
Update: as @JimB points out, it may not be prudent to embed a pointer to sync.Mutex; it may be better to return a pointer to the outer struct and continue to embed the sync.Mutex as a value. Consider what you are trying to accomplish in your specific case.
Return a pointer *Groups instead.
Embedding a mutex pointer also works, but it has two disadvantages that require extra care on your side:
the zero value of the struct would have a nil mutex, so you must explicitly initialize it every time:
func main() {
    a, _ := NewGroups()
    a.Lock() // panic: nil pointer dereference
}

func NewGroups(names ...string) (Groups, error) {
    return Groups{/* whoops, mutex zero value is nil */ Names: names}, nil
}
assigning a struct value, or passing it as a function argument, makes a copy, so you also copy the mutex pointer, which then locks all copies. (This may be a legitimate use case in particular circumstances, but most of the time it is not what you want.)
func main() {
    a, _ := NewGroups()
    a.Lock()
    lockShared(a)
    fmt.Println("done")
}

func NewGroups(names ...string) (Groups, error) {
    return Groups{Mutex: &sync.Mutex{}, Names: names}, nil
}

func lockShared(g Groups) {
    g.Lock() // whoops, deadlock! the mutex pointer is the same
}
Keep your original struct and return pointers. You don't have to explicitly initialize the embedded mutex, and it's intuitive that the mutex is not shared between copies of your struct.
func NewGroups(names ...string) (*Groups, error) {
    // ...
    return &Groups{}, nil
}
Playground (with the failing examples): https://play.golang.org/p/CcdZYcrN4lm
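For completeness, a hypothetical caller of the pointer-returning constructor (the group names here are made up):

g, err := NewGroups("admins", "users")
if err != nil {
    log.Fatal(err)
}
g.Lock()
g.Names = append(g.Names, "guests")
g.Unlock()

Since NewGroups returns *Groups, every caller shares the same embedded mutex, and the copying pitfalls above disappear.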

Cast boxed struct to boxed pointer

I'm using Protobuf for Golang.
Protobuf generates message types where the pointer type implements proto.Message.
e.g.
func (*SomeMessage) Message() {}
The protobuf library has functions like Marshal(proto.Message).
Now to my actual issue.
message := SomeMessage{}
SendMessage(&message)

func SendMessage(message interface{}) {
    switch message.(type) {
    case proto.Message:
        // send across the wire or whatever
    default:
        // non proto message, panic or whatever
    }
}
The above works fine.
However, if I don't pass the message as a pointer, the code in SendMessage will not match, since the interface is only implemented on the SomeMessage pointer, not on the value.
What I would like to do is:
message := SomeMessage{}
SendMessage(message) // pass by value

// there is more going on in my real code; this just shows the relevant parts
func SendMessage(message interface{}) {
    // match both pointer and value as proto.Message
    // and then turn the value into a pointer so that
    // other funcs or protobuf can consume it
    message = MagicallyTurnBoxedValueIntoBoxedStruct(message)
    switch message.(type) {
    case proto.Message:
        // send across the wire or whatever
    default:
        // non proto message, panic or whatever
    }
}
Preferably I'd like to be able to pass both pointers and values.
The reason I want to pass by value is that it can act as a poor man's isolation when passing messages across goroutines/threads
(in the absence of immutability).
All of this could probably be avoided if the protobuf generator allowed values to be treated as proto.Message too.
Or if there were some nicer way to do immutable messages.
It's not super important - if it's possible, cool; if not, meh :-)
[EDIT]
If I have the reflect.Type of the message and the reflect.Type of the message's pointer type:
Is it somehow possible to create an instance of the pointer type pointing to the value using reflect?
Normally, you can't take the address of a value, which means you can't simply convert the interface{} to a pointer to satisfy Protobuf's requirement.
That said, you can dynamically create a new pointer, copy the value into it, and then pass the newly allocated pointer to protobuf.
Here's an example on Play
The value -> pointer conversion is:
func mkPointer(i interface{}) interface{} {
    val := reflect.ValueOf(i)
    if val.Kind() == reflect.Ptr {
        return i
    }
    if val.CanAddr() {
        return val.Addr().Interface()
    }
    nv := reflect.New(reflect.TypeOf(i))
    nv.Elem().Set(val)
    return nv.Interface()
}
We first check whether it's a pointer; if so, we just return the value.
Then we check whether the value is addressable and, if so, return its address.
Lastly, we make a new instance of the type, copy the contents into it, and return it.
Since this copies the data, it may not be practical for your purposes. It all depends on the size of the message and the expected rate of calls with a value (as that will generate more garbage).
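As a usage sketch, wiring mkPointer into the SendMessage function from the question (the surrounding code is assumed, not part of the answer):

func SendMessage(message interface{}) {
    // normalize to a pointer so value arguments also match
    // interfaces that are implemented on the pointer type
    message = mkPointer(message)
    switch message.(type) {
    case proto.Message:
        // send across the wire or whatever
    default:
        // non proto message, panic or whatever
    }
}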

Thread Safety in a Value Receiver in Go

type MyMap struct {
    data map[int]int
}

func (m MyMap) foo() {
    // insert or read from m.data
}

...

go func(m *MyMap) {
    for {
        // insert into m.data
    }
}(&m)

...

var m MyMap
m.foo()
When I call m.foo(), as we know, a copy of m is made - a value copy done by the compiler. My question is: is there a race in this procedure? The copy is a kind of read of the variable m, so I mean you may need a read lock in case someone is inserting values into m.data while you are copying from m.data.
If it is thread-safe, is that guaranteed by the compiler?
This is not safe, and there is no implied safe concurrent access in the language. All concurrent data access is unsafe, and needs to be protected with channels or locks.
Because maps internally hold a pointer to the data they contain, even when the outer structure is copied, the map still points to the same data. A concurrent map is a common requirement, and all you need to do is add a mutex to protect the reads and writes. Though a Mutex pointer would work with your value receiver, it's more idiomatic to use a pointer receiver for mutating methods.
type MyMap struct {
    sync.Mutex
    data map[int]int
}

func (m *MyMap) foo() {
    m.Lock()
    defer m.Unlock()
    // insert or read from m.data
}
The Go memory model is very explicit, and races are generally very easy to reason about. When in doubt, always run your program or tests with -race.
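For instance, a small hypothetical program like this one triggers the detector when run with go run -race main.go:

package main

import "time"

func main() {
    m := map[int]int{}
    go func() {
        for i := 0; i < 1000; i++ {
            m[i] = i // unsynchronized write
        }
    }()
    for i := 0; i < 1000; i++ {
        _ = m[i] // unsynchronized read: -race reports this pair
    }
    time.Sleep(100 * time.Millisecond) // crude wait, just for the demo
}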
