Race condition. Cannot figure out why - go

A race condition occurs when I run my code. It is a simple implementation of a concurrency-safe storage. The race condition disappears when I change the receiver of the get() method to (p *storageType). I'm confused. Can someone explain this behavior?
package main
type storageType struct {
fc chan func()
value int
}
func newStorage() *storageType {
p := storageType{
fc: make(chan func()),
}
go p.run()
return &p
}
func (p storageType) run() {
for {
(<-p.fc)()
}
}
func (p *storageType) set(s int) {
p.fc <- func() {
p.value = s
}
}
func (p storageType) get() int {
res := make(chan int)
p.fc <- func() {
res <- p.value
}
return <-res
}
func main() {
storage := newStorage()
for i := 0; i < 1000; i++ {
go storage.set(i)
go storage.get()
}
}

In main() the storage variable is of type *storageType. If storageType.get() has a value receiver, then storage.get() means (*storage).get().
The get() call has storageType as the receiver, so the storage pointer has to be dereferenced to make a copy (which will be used as the receiver value). This copying means the value of the pointed-to storageType struct must be read. But this read is not synchronized with the goroutine running run(), which executes the queued closures that read and write the struct (its value field).
If you change the receiver of get() to be a pointer (of type *storageType), then the receiver again will be a copy, but this time it will be a copy of the pointer, not the pointed struct. So no unsynchronized read of the struct happens.
See possible duplicate: Why does the method of a struct that does not read/write its contents still cause a race case?
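To make the explanation above concrete, here is a minimal sketch of the same storage with pointer receivers on every method. It assumes nothing beyond the question's code; main is simplified to sequential calls so the output is deterministic.

```go
package main

import "fmt"

type storageType struct {
	fc    chan func()
	value int
}

func newStorage() *storageType {
	p := &storageType{fc: make(chan func())}
	go p.run()
	return p
}

// run executes queued closures one at a time, so every access to value
// happens on this single goroutine.
func (p *storageType) run() {
	for f := range p.fc {
		f()
	}
}

func (p *storageType) set(s int) {
	p.fc <- func() { p.value = s }
}

// Pointer receiver: calling get copies only the pointer, never the struct,
// so the caller's goroutine does not read value directly.
func (p *storageType) get() int {
	res := make(chan int)
	p.fc <- func() { res <- p.value }
	return <-res
}

func main() {
	storage := newStorage()
	storage.set(42)
	fmt.Println(storage.get()) // 42
}
```

With a value receiver on get(), the call itself would copy the whole struct (including value) on the caller's goroutine, which is the unsynchronized read the race detector reports.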

First, your main function doesn't wait for the goroutines to finish. When main returns, the program exits and any goroutines still running are terminated.
Look into using a sync.WaitGroup.

Related

Add a cache to a go function as if it were a static member

Say I have an expensive function
func veryExpensiveFunction(int) int
and this function gets called a lot for the same number.
Is there a good way to allow this function to store previous results to use if the function gets called again that is perhaps even reusable for veryExpensiveFunction2?
Obviously, it would be possible to add an argument
func veryExpensiveFunctionCached(p int, cache map[int]int) int {
if val, ok := cache[p]; ok {
return val
}
result := veryExpensiveFunction(p)
cache[p] = result
return result
}
But now I have to create the cache somewhere, where I don't care about it. I would rather have it as a "static function member" if this were possible.
What is a good way to simulate a static member cache in go?
You can use closures and let the closure manage the cache.
func InitExpensiveFuncWithCache() func(p int) int {
var cache = make(map[int]int)
return func(p int) int {
if ret, ok := cache[p]; ok {
fmt.Println("from cache")
return ret
}
// expensive computation
time.Sleep(1 * time.Second)
r := p * 2
cache[p] = r
return r
}
}
func main() {
ExpensiveFuncWithCache := InitExpensiveFuncWithCache()
fmt.Println(ExpensiveFuncWithCache(2))
fmt.Println(ExpensiveFuncWithCache(2))
}
output:
4
from cache
4
veryExpensiveFunctionCached := InitExpensiveFuncWithCache()
and use the wrapped function with your code.
If you want it to be reusable, change the signature to InitExpensiveFuncWithCache(func(int) int) so it accepts a function as a parameter. Wrap that function in the closure, replacing the expensive computation part with a call to it.
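The reusable variant described above might look like the following sketch. The name withCache and the stand-in function double are assumptions made for the example; like the closure above, it is not safe for concurrent use because the map is unguarded.

```go
package main

import "fmt"

// withCache (hypothetical name) wraps any func(int) int in a closure
// that remembers previous results in a private map.
func withCache(f func(int) int) func(int) int {
	cache := make(map[int]int)
	return func(p int) int {
		if v, ok := cache[p]; ok {
			return v // served from the cache, f is not called
		}
		v := f(p)
		cache[p] = v
		return v
	}
}

func main() {
	calls := 0
	double := func(p int) int { // stand-in for the expensive function
		calls++
		return p * 2
	}
	cached := withCache(double)
	a := cached(2)
	b := cached(2)
	fmt.Println(a, b, calls) // 4 4 1
}
```

The same wrapper can be applied to veryExpensiveFunction2 to give it its own independent cache.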
You need to be careful about synchronization if this cache will be used in HTTP handlers. In the Go standard library, each HTTP request is processed in its own goroutine, so the cache is exposed to concurrent access and race conditions. I would suggest a sync.RWMutex to ensure data consistency.
As for the cache injection, you can inject it in the function where you create the HTTP handler.
Here is a prototype:
type Cache struct {
store map[int]int
mux sync.RWMutex
}
func NewCache() *Cache {
return &Cache{make(map[int]int), sync.RWMutex{}}
}
func (c *Cache) Set(id, value int) {
c.mux.Lock()
c.store[id] = value
c.mux.Unlock()
}
func (c *Cache) Get(id int) (int, error) {
c.mux.RLock()
v, ok := c.store[id]
c.mux.RUnlock()
if !ok {
return -1, errors.New("a value with given key not found")
}
return v, nil
}
func handleComplexOperation(c *Cache) http.HandlerFunc {
return http.HandlerFunc(func(rw http.ResponseWriter, r *http.Request){
})
}
The Go standard library uses the following style to provide "static" functions that are backed by underlying state (e.g. the flag package's top-level functions, which wrap flag.CommandLine):
// "static" function is just a wrapper
func Lookup(p int) int { return expCache.Lookup(p) }
var expCache = NewCache()
func NewCache() *CacheExpensive { return &CacheExpensive{cache: make(map[int]int)} }
type CacheExpensive struct {
l sync.RWMutex // lock for concurrent access
cache map[int]int
}
func (c *CacheExpensive) Lookup(p int) int { /*...*/ }
This design pattern not only allows for simple one-time use, but also allows for segregated usage:
var (
userX = NewCache()
userY = NewCache()
)
userX.Lookup(12)
userY.Lookup(42)
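The body of Lookup is elided above. One possible shape for it, as a sketch only: the compute field and the argument to NewCache are assumptions added here so the example is self-contained, and the doubling function is just a stand-in for the expensive work.

```go
package main

import (
	"fmt"
	"sync"
)

// CacheExpensive memoizes compute. The compute field is an assumption
// added for this sketch; it is not part of the answer above.
type CacheExpensive struct {
	l       sync.RWMutex // lock for concurrent access
	cache   map[int]int
	compute func(int) int
}

func NewCache(compute func(int) int) *CacheExpensive {
	return &CacheExpensive{cache: make(map[int]int), compute: compute}
}

func (c *CacheExpensive) Lookup(p int) int {
	c.l.RLock()
	v, ok := c.cache[p]
	c.l.RUnlock()
	if ok {
		return v
	}
	// Computed outside the lock, so two goroutines may both compute the
	// same key; the later write simply overwrites the earlier one.
	v = c.compute(p)
	c.l.Lock()
	c.cache[p] = v
	c.l.Unlock()
	return v
}

var expCache = NewCache(func(p int) int { return p * 2 })

// Lookup is the package-level "static" wrapper around the shared cache.
func Lookup(p int) int { return expCache.Lookup(p) }

func main() {
	fmt.Println(Lookup(21)) // 42
}
```

Computing outside the lock keeps slow work from blocking readers, at the cost of occasionally computing a key twice; holding the write lock for the whole computation is the simpler alternative if duplicates must be avoided.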

Making a struct thread safe using go channels

Suppose I have the following struct:
package manager
type Manager struct {
strings []string
}
func (m *Manager) AddString(s string) {
m.strings = append(m.strings, s)
}
func (m *Manager) RemoveString(s string) {
for i, str := range m.strings {
if str == s {
m.strings = append(m.strings[:i], m.strings[i+1:]...)
}
}
}
This pattern is not thread safe, so the following test fails due to some race condition (array index out of bounds):
func TestManagerConcurrently(t *testing.T) {
m := &manager.Manager{}
wg := sync.WaitGroup{}
for i:=0; i<100; i++ {
wg.Add(1)
go func () {
m.AddString("a")
m.AddString("b")
m.AddString("c")
m.RemoveString("b")
wg.Done()
} ()
}
wg.Wait()
fmt.Println(m)
}
I'm new to Go, and from googling around I suppose I should use channels (?). So one way to make this concurrent would be like this:
type ManagerA struct {
Manager
addStringChan chan string
removeStringChan chan string
}
func NewManagerA() *ManagerA {
ma := &ManagerA{
addStringChan: make(chan string),
removeStringChan: make(chan string),
}
go func () {
for {
select {
case msg := <-ma.addStringChan:
ma.AddString(msg)
case msg := <-ma.removeStringChan:
ma.RemoveString(msg)
}
}
}()
return ma
}
func (m *ManagerA) AddStringA(s string) {
m.addStringChan <- s
}
func (m *ManagerA) RemoveStringA(s string) {
m.removeStringChan <- s
}
I would like to expose an API similar to the non-concurrent example, hence AddStringA, RemoveStringA.
This seems to work as expected concurrently (although I guess the inner goroutine should also exit at some point). My problem with this is that there is a lot of extra boilerplate:
need to define & initialize channels
define inner goroutine loop with select
map functions to channel calls
It seems a bit much to me. Is there a way to simplify this (refactor / syntax / library)?
I think the best way to implement this would be to use a Mutex instead? But is it still possible to simplify this sort of boilerplate?
Using a mutex would be perfectly idiomatic like this:
type Manager struct {
mu sync.Mutex
strings []string
}
func (m *Manager) AddString(s string) {
m.mu.Lock()
m.strings = append(m.strings, s)
m.mu.Unlock()
}
func (m *Manager) RemoveString(s string) {
m.mu.Lock()
for i, str := range m.strings {
if str == s {
m.strings = append(m.strings[:i], m.strings[i+1:]...)
}
}
m.mu.Unlock()
}
You could do this with channels, but as you noted it is a lot of extra work for not much gain. Just use a mutex is my advice!
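One caveat about the removal loop: it shrinks m.strings while ranging over the original length, so it can still panic when the value occurs more than once near the end (for example with ["b", "b"]), mutex or not. A hedged sketch of the mutex version that filters in place instead; the Len method is a helper added only for this example.

```go
package main

import (
	"fmt"
	"sync"
)

type Manager struct {
	mu      sync.Mutex
	strings []string
}

func (m *Manager) AddString(s string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.strings = append(m.strings, s)
}

// RemoveString drops every occurrence of s by filtering in place, which
// avoids re-slicing m.strings while ranging over it.
func (m *Manager) RemoveString(s string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	kept := m.strings[:0] // reuses the same backing array
	for _, str := range m.strings {
		if str != s {
			kept = append(kept, str)
		}
	}
	m.strings = kept
}

// Len is a helper added for this sketch.
func (m *Manager) Len() int {
	m.mu.Lock()
	defer m.mu.Unlock()
	return len(m.strings)
}

func main() {
	m := &Manager{}
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			m.AddString("a")
			m.AddString("b")
			m.AddString("c")
			m.RemoveString("b")
		}()
	}
	wg.Wait()
	fmt.Println(m.Len()) // 200: every "b" is removed, 100 "a" and 100 "c" remain
}
```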
If you simply need to make access to the struct thread-safe, use a mutex:
type Manager struct {
	sync.Mutex
	data []string
}
func (m *Manager) AddString(s string) {
	m.Lock()
	m.data = append(m.data, s)
	m.Unlock()
}

Golang race with sync.Mutex on map[string]int

I have a simple package I am using to log stats during a program run and I found that go run -race says there is a race condition in it. Looking at the program I'm not sure how I can have a race condition when every read and write is protected by a mutex. Can someone explain this?
package counters
import "sync"
type single struct {
mu sync.Mutex
values map[string]int64
}
// Global counters object
var counters = single{
values: make(map[string]int64),
}
// Get the value of the given counter
func Get(key string) int64 {
counters.mu.Lock()
defer counters.mu.Unlock()
return counters.values[key]
}
// Incr the value of the given counter name
func Incr(key string) int64 {
counters.mu.Lock()
defer counters.mu.Unlock()
counters.values[key]++ // Race condition
return counters.values[key]
}
// All the counters
func All() map[string]int64 {
counters.mu.Lock()
defer counters.mu.Unlock()
return counters.values // running during write above
}
I use the package like so:
counters.Incr("foo")
counters.Get("foo")
A Minimal Complete Verifiable Example would be useful here, but I think your problem is in All():
// All the counters
func All() map[string]int64 {
counters.mu.Lock()
defer counters.mu.Unlock()
return counters.values // running during write above
}
This returns the map itself rather than a copy, so it can be accessed outside the protection of the mutex.
All returns the underlying map and then releases the lock, so the code using the map will have a data race.
You should return a copy of the map:
func All() map[string]int64 {
counters.mu.Lock()
defer counters.mu.Unlock()
m := make(map[string]int64)
for k, v := range counters.values {
m[k] = v
}
return m
}
Or not have an All method.

Range over an []interface{} and get the channel field of each type

I'll try to make it as clear as possible, in my head first.
I have an interface and a couple of Types that inherit it by declaring a method. Pretty nice and clever way of inheritance.
I have then a "super" Type Thing, which all the other Types embed.
The Thing struct has a Size int field and an Out chan int field.
What I'm trying to understand is why I can get the value of size with .GetSize() from both the child structs, but I don't have the same success with the channel field via .GetChannel() (N.B. I'm using the channel to communicate between goroutines and their caller).
...here I get t.GetChannel undefined (type Measurable has no field or method GetChannel)
A demo of the logic might help:
package main
import (
"fmt"
)
type Measurable interface {
GetSize() int
}
type Thing struct {
Size int
Out chan int
}
type Something struct{ *Thing }
type Otherthing struct{ *Thing }
func newThing(size int) *Thing {
return &Thing{
Size: size,
Out: make(chan int),
}
}
func NewSomething(size int) *Something { return &Something{Thing: newThing(size)} }
func NewOtherthing(size int) *Otherthing { return &Otherthing{Thing: newThing(size)} }
func (s *Thing) GetSize() int { return s.Size }
func (s *Thing) GetChannel() chan int { return s.Out }
func main() {
things := []Measurable{}
pen := NewSomething(7)
paper := NewOtherthing(5)
things = append(things, pen, paper)
for _, t := range things {
fmt.Printf("%T %d \n", t, t.GetSize())
}
for _, t := range things {
fmt.Printf("%+v \n", t.GetChannel())
}
// for _, t := range things {
// fmt.Printf("%+v", t.Thing.Size)
// }
}
The commented code is another thing I'm trying to learn. I can get a value by using a method declared on the super Type, but not by accessing the field directly through the child one. Sure, I could resolve the type with t.(*bothTheThingTypes).Size, but then I lose the dynamism. I'm not fully getting this...
What I'm trying to understand is why I can get the value of size
.GetSize() from both the child structs, but I don't have the same
success with the channel field .GetChannel()
type Measurable interface {
GetSize() int
}
...
things := []Measurable{}
for _, t := range things {
fmt.Printf("%+v \n", t.GetChannel())
}
I may be missing the point but this seems to be caused strictly by the fact that your Measurable interface doesn't have a GetChannel method.
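Two ways out follow from that: add GetChannel() to Measurable, or keep Measurable small and ask at run time whether a value also has GetChannel. Below is a sketch of the second option; the Channeler interface and the channelOf helper are names invented for this example.

```go
package main

import "fmt"

type Measurable interface {
	GetSize() int
}

// Channeler is a hypothetical second interface for types exposing a channel.
type Channeler interface {
	GetChannel() chan int
}

type Thing struct {
	Size int
	Out  chan int
}

func (t *Thing) GetSize() int         { return t.Size }
func (t *Thing) GetChannel() chan int { return t.Out }

// Something embeds *Thing, so both methods are promoted to it.
type Something struct{ *Thing }

// channelOf returns the channel of m if its dynamic type has GetChannel.
func channelOf(m Measurable) (chan int, bool) {
	c, ok := m.(Channeler) // interface-to-interface type assertion
	if !ok {
		return nil, false
	}
	return c.GetChannel(), true
}

func main() {
	things := []Measurable{
		&Something{&Thing{Size: 7, Out: make(chan int, 1)}},
	}
	for _, t := range things {
		if ch, ok := channelOf(t); ok {
			ch <- t.GetSize()
			fmt.Println(<-ch) // 7
		}
	}
}
```

The static type of t is Measurable, so only GetSize is callable on it directly; the assertion to Channeler is what recovers access to the promoted GetChannel method without naming the concrete types.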

async reply in registry pattern

I'm learning go, and I would like to explore some patterns.
I would like to build a Registry component which maintains a map of some stuff, and I want to provide a serialized access to it:
Currently I ended up with something like this:
type JobRegistry struct {
submission chan JobRegistrySubmitRequest
listing chan JobRegistryListRequest
}
type JobRegistrySubmitRequest struct {
request JobSubmissionRequest
response chan Job
}
type JobRegistryListRequest struct {
response chan []Job
}
func NewJobRegistry() (this *JobRegistry) {
this = &JobRegistry{make(chan JobRegistrySubmitRequest, 10), make(chan JobRegistryListRequest, 10)}
go func() {
jobMap := make(map[string] Job)
for {
select {
case sub := <- this.submission:
job := MakeJob(sub.request) // ....
jobMap[job.Id] = job
sub.response <- job
case list := <- this.listing:
res := make([]Job, 0, 100)
for _, v := range jobMap {
res = append(res, v)
}
list.response <- res
}
/// case somechannel....
}
}()
return
}
Basically, I encapsulate each operation inside a struct, which carries
the parameters and a response channel.
Then I created helper methods for end users:
func (this *JobRegistry) List() ([]Job, error) {
res := make(chan []Job, 1)
req := JobRegistryListRequest{res}
this.listing <- req
return <-res, nil // todo: handle errors like timeouts
}
I decided to use a channel for each type of request in order to be type safe.
The problems I see with this approach are:
A lot of boilerplate code and a lot of places to modify when some param/return type changes
Have to do weird things like create yet another wrapper struct in order to return errors from within the handler goroutine. (If I understood correctly there are no tuples, and no way to send multiple values in a channel, like multi-valued returns)
So, I'm wondering whether all this makes sense, or rather just get back to good old locks.
I'm sure that somebody will find some clever way out using channels.
I'm not entirely sure I understand you, but I'll try answering nevertheless.
You want a generic service that executes jobs sent to it. You also might want the jobs to be serializable.
What we need is an interface that would define a generic job.
type Job interface {
	Run()
	Serialize(io.Writer)
}
func ReadJob(r io.Reader) {...}
type JobManager struct {
	jobs   map[int]Job
	jobs_c chan Job
}
func NewJobManager() (mgr *JobManager) {
	mgr = &JobManager{make(map[int]Job), make(chan Job, JOB_QUEUE_SIZE)}
	// Run queued jobs in the background until jobs_c is closed.
	go func() {
		for j := range mgr.jobs_c {
			go j.Run()
		}
	}()
	return
}
type IntJob struct{...}
func (job *IntJob) GetOutChan() chan int {...}
func (job *IntJob) Run() {...}
func (job *IntJob) Serialize(o io.Writer) {...}
Much less code, and roughly as useful.
As for signaling errors over an auxiliary channel, you can always use a helper function.
type IntChanWithErr struct {
	c    chan int
	errc chan error
}
func (ch *IntChanWithErr) Next() (v int, err error) {
	select {
	case v = <-ch.c: // not handling closed channel
	case err = <-ch.errc:
	}
	return
}
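On the question's point about sending multiple values in one channel message: a small struct carrying both the value and the error is a common alternative to a second error channel. The sketch below uses made-up names (result, halve) and a trivial stand-in worker.

```go
package main

import (
	"errors"
	"fmt"
)

// result bundles a value with an error so both travel in one channel send.
type result struct {
	val int
	err error
}

// halve is a stand-in worker: it replies asynchronously on the returned channel.
func halve(n int) <-chan result {
	out := make(chan result, 1) // buffered so the goroutine never blocks
	go func() {
		if n%2 != 0 {
			out <- result{err: errors.New("odd input")}
			return
		}
		out <- result{val: n / 2}
	}()
	return out
}

func main() {
	r := <-halve(10)
	fmt.Println(r.val, r.err) // 5 <nil>
	r = <-halve(3)
	fmt.Println(r.val, r.err) // 0 odd input
}
```

Compared with two channels and a select, this keeps each reply atomic: the receiver always gets the value and its error together.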
