I have a simple package I am using to log stats during a program run and I found that go run -race says there is a race condition in it. Looking at the program I'm not sure how I can have a race condition when every read and write is protected by a mutex. Can someone explain this?
package counters
import "sync"
type single struct {
mu sync.Mutex
values map[string]int64
}
// Global counters object
var counters = single{
values: make(map[string]int64),
}
// Get the value of the given counter
func Get(key string) int64 {
counters.mu.Lock()
defer counters.mu.Unlock()
return counters.values[key]
}
// Incr the value of the given counter name
func Incr(key string) int64 {
counters.mu.Lock()
defer counters.mu.Unlock()
counters.values[key]++ // Race condition
return counters.values[key]
}
// All the counters
func All() map[string]int64 {
counters.mu.Lock()
defer counters.mu.Unlock()
return counters.values // running during write above
}
I use the package like so:
counters.Incr("foo")
counters.Get("foo")
A Minimal Complete Verifiable Example would be useful here, but I think your problem is in All():
// All the counters
func All() map[string]int64 {
counters.mu.Lock()
defer counters.mu.Unlock()
return counters.values // running during write above
}
This returns the map itself rather than a copy of it, so callers can access it outside the protection of the mutex.
All returns the underlying map and then releases the lock, so any code that uses the returned map will have a data race.
You should return a copy of the map:
func All() map[string]int64 {
counters.mu.Lock()
defer counters.mu.Unlock()
m := make(map[string]int64)
for k, v := range counters.values {
m[k] = v
}
return m
}
Or not have an All method.
Say I have an expensive function
func veryExpensiveFunction(int) int
and this function gets called a lot for the same number.
Is there a good way to allow this function to store previous results to use if the function gets called again that is perhaps even reusable for veryExpensiveFunction2?
Obviously, it would be possible to add an argument
func veryExpensiveFunctionCached(p int, cache map[int]int) int {
if val, ok := cache[p]; ok {
return val
}
result := veryExpensiveFunction(p)
cache[p] = result
return result
}
But now I have to create the cache somewhere, where I don't care about it. I would rather have it as a "static function member" if this were possible.
What is a good way to simulate a static member cache in go?
You can use a closure and let it manage the cache.
func InitExpensiveFuncWithCache() func(p int) int {
var cache = make(map[int]int)
return func(p int) int {
if ret, ok := cache[p]; ok {
fmt.Println("from cache")
return ret
}
// expensive computation
time.Sleep(1 * time.Second)
r := p * 2
cache[p] = r
return r
}
}
func main() {
ExpensiveFuncWithCache := InitExpensiveFuncWithCache()
fmt.Println(ExpensiveFuncWithCache(2))
fmt.Println(ExpensiveFuncWithCache(2))
}
output:
4
from cache
4
veryExpensiveFunctionCached := InitExpensiveFuncWithCache()
and use the wrapped function with your code.
If you want it to be reusable, change the signature to InitExpensiveFuncWithCache(func(int) int) so that it accepts a function as a parameter. Wrap that function in the closure and call it in place of the expensive computation.
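The reusable variant described above could look roughly like the following sketch. The names memoize and calls are hypothetical, veryExpensiveFunction is a cheap stand-in for the real computation, and the wrapper is not safe for concurrent use.

```go
package main

import "fmt"

// memoize (hypothetical name) wraps fn in a closure that owns a private cache.
// It is not safe for concurrent use.
func memoize(fn func(int) int) func(int) int {
	cache := make(map[int]int)
	return func(p int) int {
		if v, ok := cache[p]; ok {
			return v // served from the cache
		}
		v := fn(p)
		cache[p] = v
		return v
	}
}

// calls counts how often the stand-in function actually runs.
var calls int

// veryExpensiveFunction is a stand-in for the real computation.
func veryExpensiveFunction(p int) int {
	calls++
	return p * 2
}

func main() {
	cached := memoize(veryExpensiveFunction)
	a, b := cached(2), cached(2)
	fmt.Println(a, b, calls) // prints: 4 4 1
}
```

Because each call to memoize creates its own map, the same wrapper can be applied to veryExpensiveFunction2 without the two caches interfering.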
You need to be careful about synchronization if this cache will be used in HTTP handlers. In the Go standard library, each HTTP request is handled in its own goroutine, which puts us in the domain of concurrency and race conditions. I would suggest an RWMutex to ensure data consistency.
As for the cache injection, you may inject it at a function where you create the http handler.
Here is a prototype:
type Cache struct {
store map[int]int
mux sync.RWMutex
}
func NewCache() *Cache {
return &Cache{make(map[int]int), sync.RWMutex{}}
}
func (c *Cache) Set(id, value int) {
c.mux.Lock()
c.store[id] = value // store the value, not the key
c.mux.Unlock()
}
func (c *Cache) Get(id int) (int, error) {
c.mux.RLock()
v, ok := c.store[id]
c.mux.RUnlock()
if !ok {
return -1, errors.New("a value with given key not found")
}
return v, nil
}
func handleComplexOperation(c *Cache) http.HandlerFunc {
return http.HandlerFunc(func(rw http.ResponseWriter, r *http.Request){
})
}
The Go standard library uses the following style to provide "static" functions that leverage underlying state (e.g. the flag package's top-level functions, which wrap flag.CommandLine):
// "static" function is just a wrapper
func Lookup(p int) int { return expCache.Lookup(p) }
var expCache = NewCache()
func NewCache() *CacheExpensive { return &CacheExpensive{cache: make(map[int]int)} }
type CacheExpensive struct {
l sync.RWMutex // lock for concurrent access
cache map[int]int
}
func (c *CacheExpensive) Lookup(p int) int { /*...*/ }
This design pattern not only allows for simple one-time use through the package-level function, but also allows for segregated usage with separate instances:
var (
userX = NewCache()
userY = NewCache()
)
userX.Lookup(12)
userY.Lookup(42)
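The body of Lookup is elided in the snippet above. A minimal sketch of how it might be filled in, assuming a hypothetical compute function as the expensive operation:

```go
package main

import (
	"fmt"
	"sync"
)

// CacheExpensive memoizes results of an expensive computation.
type CacheExpensive struct {
	l     sync.RWMutex // lock for concurrent access
	cache map[int]int
}

func NewCache() *CacheExpensive {
	return &CacheExpensive{cache: make(map[int]int)}
}

// compute is a hypothetical stand-in for the expensive function.
func compute(p int) int { return p * p }

// Lookup returns the cached result for p, computing and storing it on a miss.
func (c *CacheExpensive) Lookup(p int) int {
	c.l.RLock()
	v, ok := c.cache[p]
	c.l.RUnlock()
	if ok {
		return v
	}
	v = compute(p) // runs without the lock held; duplicate work is possible
	c.l.Lock()
	c.cache[p] = v
	c.l.Unlock()
	return v
}

// expCache backs the package-level "static" wrapper.
var expCache = NewCache()

// Lookup is the "static" function: a thin wrapper around expCache.
func Lookup(p int) int { return expCache.Lookup(p) }

func main() {
	userX := NewCache() // a segregated cache, independent of expCache
	fmt.Println(Lookup(12), userX.Lookup(3)) // prints: 144 9
}
```

The read lock covers the common hit path, and the write lock is taken only to store a new result.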
A race condition occurs when I run my code. It is a simple implementation of a concurrency-safe storage. The race condition disappears when I change the receiver of the get() method to (p *storageType). I'm confused. Could someone explain this behavior to me?
package main
type storageType struct {
fc chan func()
value int
}
func newStorage() *storageType {
p := storageType{
fc: make(chan func()),
}
go p.run()
return &p
}
func (p storageType) run() {
for {
(<-p.fc)()
}
}
func (p *storageType) set(s int) {
p.fc <- func() {
p.value = s
}
}
func (p storageType) get() int {
res := make(chan int)
p.fc <- func() {
res <- p.value
}
return <-res
}
func main() {
storage := newStorage()
for i := 0; i < 1000; i++ {
go storage.set(i)
go storage.get()
}
}
In main() the storage variable is of type *storageType. If storageType.get() has a value receiver, then storage.get() means (*storage).get().
The get() call has storageType as the receiver, so the storage pointer variable has to be dereferenced to make a copy (which will be used as the receiver value). This copying means the value of the pointed-to storageType struct must be read. But this read is not synchronized with the functions executed by the run() method, which read and write the struct (its value field).
If you change the receiver of get() to be a pointer (of type *storageType), then the receiver again will be a copy, but this time it will be a copy of the pointer, not the pointed struct. So no unsynchronized read of the struct happens.
See possible duplicate: Why does the method of a struct that does not read/write its contents still cause a race case?
First one: your main function doesn't wait for all goroutines to finish. All goroutines are forced to return when main does.
Look into using a sync.WaitGroup to wait for them.
Suppose I have the following struct:
package manager
type Manager struct {
strings []string
}
func (m *Manager) AddString(s string) {
m.strings = append(m.strings, s)
}
func (m *Manager) RemoveString(s string) {
for i, str := range m.strings {
if str == s {
m.strings = append(m.strings[:i], m.strings[i+1:]...)
}
}
}
This pattern is not thread safe, so the following test fails due to some race condition (array index out of bounds):
func TestManagerConcurrently(t *testing.T) {
m := &manager.Manager{}
wg := sync.WaitGroup{}
for i:=0; i<100; i++ {
wg.Add(1)
go func () {
m.AddString("a")
m.AddString("b")
m.AddString("c")
m.RemoveString("b")
wg.Done()
} ()
}
wg.Wait()
fmt.Println(m)
}
I'm new to Go, and from googling around I suppose I should use channels (?). So one way to make this concurrent would be like this:
type ManagerA struct {
Manager
addStringChan chan string
removeStringChan chan string
}
func NewManagerA() *ManagerA {
ma := &ManagerA{
addStringChan: make(chan string),
removeStringChan: make(chan string),
}
go func () {
for {
select {
case msg := <-ma.addStringChan:
ma.AddString(msg)
case msg := <-ma.removeStringChan:
ma.RemoveString(msg)
}
}
}()
return ma
}
func (m *ManagerA) AddStringA(s string) {
m.addStringChan <- s
}
func (m *ManagerA) RemoveStringA(s string) {
m.removeStringChan <- s
}
I would like to expose an API similar to the non-concurrent example, hence AddStringA, RemoveStringA.
This seems to work as expected concurrently (although I guess the inner goroutine should also exit at some point). My problem with this is that there is a lot of extra boilerplate:
need to define & initialize channels
define inner goroutine loop with select
map functions to channel calls
It seems a bit much to me. Is there a way to simplify this (refactor / syntax / library)?
I think the best way to implement this would be to use a Mutex instead? But is it still possible to simplify this sort of boilerplate?
Using a mutex would be perfectly idiomatic like this:
type Manager struct {
mu sync.Mutex
strings []string
}
func (m *Manager) AddString(s string) {
m.mu.Lock()
m.strings = append(m.strings, s)
m.mu.Unlock()
}
func (m *Manager) RemoveString(s string) {
m.mu.Lock()
for i, str := range m.strings {
if str == s {
m.strings = append(m.strings[:i], m.strings[i+1:]...)
}
}
m.mu.Unlock()
}
You could do this with channels, but as you noted it is a lot of extra work for not much gain. Just use a mutex is my advice!
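One caveat about RemoveString as written: it shrinks m.strings while ranging over it, which can panic with a slice bounds error when the string occurs more than once. A sketch of a mutex-guarded variant that filters in place instead (the Len helper is hypothetical, added only so the result can be inspected):

```go
package main

import (
	"fmt"
	"sync"
)

type Manager struct {
	mu      sync.Mutex
	strings []string
}

func (m *Manager) AddString(s string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.strings = append(m.strings, s)
}

// RemoveString drops every occurrence of s by filtering in place,
// so the slice being ranged over never changes length mid-loop.
func (m *Manager) RemoveString(s string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	kept := m.strings[:0] // reuses the same backing array
	for _, str := range m.strings {
		if str != s {
			kept = append(kept, str)
		}
	}
	m.strings = kept
}

// Len is a hypothetical helper that reports how many strings are stored.
func (m *Manager) Len() int {
	m.mu.Lock()
	defer m.mu.Unlock()
	return len(m.strings)
}

func main() {
	m := &Manager{}
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			m.AddString("a")
			m.AddString("b")
			m.AddString("c")
			m.RemoveString("b")
		}()
	}
	wg.Wait()
	fmt.Println(m.Len()) // prints: 200
}
```

Using defer for the unlock keeps the lock from being held forever if the body ever panics or gains an early return.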
If you simply need to make access to the struct thread-safe, use a mutex:
type Manager struct {
sync.Mutex
strings []string
}
func (m *Manager) AddString(s string) {
m.Lock()
m.strings = append(m.strings, s)
m.Unlock()
}
I would like to pass a pointer to a function that returns "anything" into another function.
It's easy to print something that gets passed in from just about anything (as in https://play.golang.org/p/gmOy6JWxGm0):
func printStuff(stuff interface{}) {
fmt.Printf("Testing : %v", stuff)
}
Let's say, though, that I want to do this:
Have multiple structs
Have data loaded from various functions
Have a generic print that calls the function for me
I tried this in a Play (https://play.golang.org/p/l3-OkL6tsMW) and I get the following errors:
./prog.go:35:12: cannot use getStuff1 (type func() SomeObject) as type FuncType in argument to printStuff
./prog.go:36:12: cannot use getStuff2 (type func() SomeOtherObject) as type FuncType in argument to printStuff
In case the Play stuff gets deleted, here's the code I'm trying to figure out how to get to work:
package main
import (
"fmt"
)
type SomeObject struct {
Value string
}
type SomeOtherObject struct {
Value string
}
type FuncType func() interface{}
func getStuff1() SomeObject {
return SomeObject{
Value: "Hello, world!",
}
}
func getStuff2() SomeOtherObject {
return SomeOtherObject{
Value: "Another, hello!",
}
}
func printStuff(toCall FuncType) {
stuff := toCall()
fmt.Printf("Testing : %v", stuff)
}
func main() {
printStuff(getStuff1)
printStuff(getStuff2)
}
What is the secret sauce to get this stuff passed in properly?
Larger Goal Explanation
So what I am trying to accomplish here is reduction of boilerplate code that lives inside a gigantic file. Unfortunately I cannot refactor it further at this point due to other restrictions and I was wondering if this were possible at all considering the error messages and what I had read seemed to dictate otherwise.
There's a large amount of copy-and-paste code that looks like this:
func resendContraDevice(trap *TrapLapse, operation *TrapOperation) {
loaded := contra.Load()
err := trap.SnapBack(operation).send(loaded);
// default error handling
// logging
// boilerplate post-process
}
func resendPolicyDevice(trap *TrapLapse, operation *TrapOperation) {
loaded := policy.Load()
err := trap.SnapBack(operation).send(loaded);
// default error handling
// logging
// boilerplate post-process
}
// etc.
In these, the Load() functions all return a different struct type and they are used elsewhere throughout the application.
I was hoping to get something where I could have:
loaded := fn()
err := trap.SnapBack(operation).send(loaded);
// default error handling
// logging
// boilerplate post-process
The signature for send, which accepts an interface{} argument, is:
func (s SnapBack) send(data interface{}) error
I don't know if you have control over the return values of contra.Load() and policy.Load(), for instance, so there may be a better approach, but assuming those cannot be modified, this would allow you to eliminate a lot of boilerplate, without any fancy manipulation:
func boilerplate(trap *TrapLapse, operation *TrapOperation, loader func() interface{}) {
loaded := loader()
err := trap.SnapBack(operation).send(loaded)
// default error handling
// logging
// boilerplate post-process
}
func resendContraDevice(trap *TrapLapse, operation *TrapOperation) {
boilerplate(trap, operation, func() interface{} { return contra.Load() })
}
func resendPolicyDevice(trap *TrapLapse, operation *TrapOperation) {
boilerplate(trap, operation, func() interface{} { return policy.Load() })
}
If there's nothing more complex, you can also simplify this even further:
func boilerplate(trap *TrapLapse, operation *TrapOperation, loaded interface{}) {
err := trap.SnapBack(operation).send(loaded)
// default error handling
// logging
// boilerplate post-process
}
func resendContraDevice(trap *TrapLapse, operation *TrapOperation) {
boilerplate(trap, operation, contra.Load())
}
func resendPolicyDevice(trap *TrapLapse, operation *TrapOperation) {
boilerplate(trap, operation, policy.Load())
}
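The same closure trick applies to the original Play example: func() SomeObject is a different type from func() interface{}, so each getter has to be wrapped. A minimal sketch, where describe is a hypothetical variant of printStuff that returns the formatted string instead of printing it:

```go
package main

import "fmt"

type SomeObject struct{ Value string }
type SomeOtherObject struct{ Value string }

type FuncType func() interface{}

func getStuff1() SomeObject      { return SomeObject{Value: "Hello, world!"} }
func getStuff2() SomeOtherObject { return SomeOtherObject{Value: "Another, hello!"} }

// describe (hypothetical) calls toCall and formats whatever it returns.
func describe(toCall FuncType) string {
	return fmt.Sprintf("Testing : %v", toCall())
}

func main() {
	// Function types must match exactly, so each getter is wrapped in a
	// closure whose signature is func() interface{}.
	fmt.Println(describe(func() interface{} { return getStuff1() }))
	fmt.Println(describe(func() interface{} { return getStuff2() }))
	// prints:
	// Testing : {Hello, world!}
	// Testing : {Another, hello!}
}
```

The conversion to interface{} happens at the return statement inside each closure, which is why no change to getStuff1 or getStuff2 is needed.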
I'm trying to solve WARNING: DATA RACE
here is the code:
package models
import (
"sync"
"time"
)
type Stats struct {
sync.Mutex
request map[int64]int
}
func (s *Stats) PutRequest() {
s.Lock()
s.request[time.Now().Unix()]++
s.Unlock()
}
func (s *Stats) GetRequests() map[int64]int {
s.Lock()
m := s.request
s.Unlock()
return m
}
var Requests = Stats{
sync.Mutex{},
make(map[int64]int),
}
If I change the Stats field request to an integer then everything works fine, but not with a map. How do I correctly lock a map in Go?
Use sync.RWMutex (embed it in Stats in place of sync.Mutex) and return a copy of the map from GetRequests:
func (s *Stats) PutRequest(ut int64) {
s.Lock()
defer s.Unlock()
s.request[ut]++
}
func (s *Stats) GetRequests() map[int64]int {
s.RLock()
defer s.RUnlock()
m := make(map[int64]int, len(s.request))
for k, v := range s.request {
m[k] = v
}
return m
}
The following channel example can be interesting in this case. go by example - stateful goroutines
Anyway you need to copy the map before returning it.
GetRequests returns a reference to the map, so if other code calls the function and reads or writes the returned map without acquiring the lock, a data race is introduced.
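For completeness, here is a self-contained sketch of the copy-on-read approach with the RWMutex embedded in the struct. The NewStats constructor is a hypothetical addition, and the timestamp is passed in as in the answer above.

```go
package main

import (
	"fmt"
	"sync"
)

type Stats struct {
	sync.RWMutex
	request map[int64]int
}

// NewStats (hypothetical constructor) returns a Stats with its map initialized.
func NewStats() *Stats { return &Stats{request: make(map[int64]int)} }

// PutRequest counts one request for the given Unix timestamp.
func (s *Stats) PutRequest(ut int64) {
	s.Lock()
	defer s.Unlock()
	s.request[ut]++
}

// GetRequests returns a snapshot, so callers never touch the guarded map.
func (s *Stats) GetRequests() map[int64]int {
	s.RLock()
	defer s.RUnlock()
	m := make(map[int64]int, len(s.request))
	for k, v := range s.request {
		m[k] = v
	}
	return m
}

func main() {
	stats := NewStats()
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			stats.PutRequest(42)
			_ = stats.GetRequests()
		}()
	}
	wg.Wait()
	snap := stats.GetRequests()
	snap[42] = 0 // changes only the copy
	fmt.Println(stats.GetRequests()[42]) // prints: 100
}
```

The snapshot costs one allocation and a full copy per call, which is usually acceptable for stats that are read far less often than they are written.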