How can I construct a slice out of all of the elements consumed from a channel (like Python's list() does with an iterator)? I can use this helper function:
func ToSlice(c chan int) []int {
    s := make([]int, 0)
    for i := range c {
        s = append(s, i)
    }
    return s
}
but due to the lack of generics, I'll have to write that for every type, won't I? Is there a built-in function that implements this? If not, how can I avoid copying and pasting the above code for every single type I'm using?
If there's only a few instances in your code where the conversion is needed, then there's absolutely nothing wrong with copying the 7 lines of code a few times (or even inlining it where it's used, which reduces it to 4 lines of code and is probably the most readable solution).
If you've really got conversions between lots and lots of types of channels and slices and want something generic, then you can do this with reflection, at the cost of ugliness and the loss of static typing at the call site of ChanToSlice.
Here's complete example code for how you can use reflect to solve this problem with a demonstration of it working for an int channel.
package main

import (
    "fmt"
    "reflect"
)

// ChanToSlice reads all data from ch (which must be a chan), returning a
// slice of the data. If ch is a 'chan T' then the return value is of type
// []T inside the returned interface.
// A typical call would be sl := ChanToSlice(ch).([]int)
func ChanToSlice(ch interface{}) interface{} {
    chv := reflect.ValueOf(ch)
    slv := reflect.MakeSlice(reflect.SliceOf(reflect.TypeOf(ch).Elem()), 0, 0)
    for {
        v, ok := chv.Recv()
        if !ok {
            return slv.Interface()
        }
        slv = reflect.Append(slv, v)
    }
}
func main() {
    ch := make(chan int)
    go func() {
        for i := 0; i < 10; i++ {
            ch <- i
        }
        close(ch)
    }()
    sl := ChanToSlice(ch).([]int)
    fmt.Println(sl)
}
You could make ToSlice() just work on interface{} values, but the amount of code you save here will likely cost you in complexity elsewhere.
func ToSlice(c chan interface{}) []interface{} {
    s := make([]interface{}, 0)
    for i := range c {
        s = append(s, i)
    }
    return s
}
Full example at http://play.golang.org/p/wxx-Yf5ESN
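(If you're on Go 1.18 or newer, type parameters remove the need for reflection here entirely; a minimal generic sketch:)

func ToSlice[T any](c chan T) []T {
    var s []T
    for v := range c {
        s = append(s, v)
    }
    return s
}

A call like ints := ToSlice(intChan) then infers T from the channel's element type.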
That being said: as @Volker said in the comments, from the slice (haha) of code you showed, it seems like it'd be saner to either process the results in a streaming fashion or "buffer them up" at the generator and just send the whole slice down the channel.
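A sketch of that last pattern (the names are illustrative): the generator accumulates the slice itself and sends it once, so the receiver never needs a conversion.

func produce() <-chan []int {
    out := make(chan []int, 1)
    go func() {
        var s []int
        for i := 0; i < 10; i++ {
            s = append(s, i)
        }
        out <- s // one send of the whole batch
        close(out)
    }()
    return out
}

The consumer then receives the whole batch at once with sl := <-produce().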
Related
Like if I have a struct with an array and I want to do something like this:
type Paxos struct {
    peers []string
}

for _, peer := range px.peers {
    // do stuff
}
My routines/threads will never modify the peers array, just read from it. peers is an array of server addresses, and servers may fail, but that wouldn't affect the peers array (later RPC calls would just fail).
If no writes are involved, concurrent reads are always safe, regardless of the data structure. However, as soon as even a single concurrency-unsafe write to a variable is involved, you need to serialise concurrent access (both writes and reads) to the variable.
Moreover, you can safely write to elements of a slice or an array under the condition that no more than one goroutine writes to a given element.
For instance, if you run the following programme with the race detector on, it's likely to report a race condition, because multiple goroutines concurrently modify the variable results without precautions:
package main

import (
    "fmt"
    "sync"
)

func main() {
    const n = 8
    var results []int
    var wg sync.WaitGroup
    wg.Add(n)
    for i := 0; i < n; i++ {
        i := i
        go func() {
            defer wg.Done()
            results = append(results, square(i))
        }()
    }
    wg.Wait()
    fmt.Println(results)
}

func square(i int) int {
    return i * i
}
However, the following programme contains no such synchronization bug, because each element of the slice is modified by exactly one goroutine:
package main

import (
    "fmt"
    "sync"
)

func main() {
    const n = 8
    results := make([]int, n)
    var wg sync.WaitGroup
    wg.Add(n)
    for i := 0; i < n; i++ {
        i := i
        go func() {
            defer wg.Done()
            results[i] = square(i)
        }()
    }
    wg.Wait()
    fmt.Println(results)
}

func square(i int) int {
    return i * i
}
Yes, reads are thread-safe in Go and virtually all other languages. You're just looking up an address in memory and seeing what is there. If nothing is attempting to modify that memory, then you can have as many concurrent reads as you'd like.
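When writes do enter the picture, the access has to be serialised, as the answer above notes. A minimal sketch of one common way to do that (the SafeSlice type and its methods are illustrative, not from the original code): wrap the slice together with a sync.RWMutex.

package main

import (
    "fmt"
    "sync"
)

// SafeSlice serialises access to an int slice.
type SafeSlice struct {
    mu   sync.RWMutex
    data []int
}

// Append takes the exclusive lock, since it writes.
func (s *SafeSlice) Append(v int) {
    s.mu.Lock()
    defer s.mu.Unlock()
    s.data = append(s.data, v)
}

// Snapshot takes the shared lock; any number of readers may hold it at once.
func (s *SafeSlice) Snapshot() []int {
    s.mu.RLock()
    defer s.mu.RUnlock()
    out := make([]int, len(s.data))
    copy(out, s.data)
    return out
}

func main() {
    var s SafeSlice
    var wg sync.WaitGroup
    for i := 0; i < 8; i++ {
        wg.Add(1)
        go func(i int) {
            defer wg.Done()
            s.Append(i * i)
        }(i)
    }
    wg.Wait()
    fmt.Println(s.Snapshot())
}

Readers can hold the RLock concurrently; a writer taking Lock excludes both readers and other writers.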
I'm writing a concurrency-safe memo:
package mu

import (
    "sync"
)

// Func represents a memoizable function, operating on a string key, to use with a Mu
type Func func(key string) interface{}

// Mu is a cache that memoizes results of an expensive computation
//
// It has a traditional implementation using mutexes.
type Mu struct {
    // guards done
    mu   sync.RWMutex
    done map[string]chan bool
    memo map[string]interface{}
    f    Func
}
// Get returns the value for a string key if it exists; otherwise it computes
// the value and caches it.
//
// Returns the value and whether or not the key already existed.
func (c *Mu) Get(key string) (interface{}, bool) {
    c.mu.RLock()
    _, ok := c.done[key]
    c.mu.RUnlock()
    if ok {
        return c.get(key), true
    }

    c.mu.Lock()
    _, ok = c.done[key]
    if ok {
        c.mu.Unlock()
    } else {
        c.done[key] = make(chan bool)
        c.mu.Unlock()

        v := c.f(key)
        c.memo[key] = v
        close(c.done[key])
    }
    return c.get(key), ok
}

// get returns the value of key, blocking on an existing computation
func (c *Mu) get(key string) interface{} {
    <-c.done[key]
    v, _ := c.memo[key]
    return v
}
As you can see, there's a mutex guarding the done field, which is used to signal to other goroutines that a computation for a key is pending or done. This avoids duplicate computations (calls to c.f(key)) for the same key.
My question is around the guarantees of this code; by ensuring that the computing goroutine closes the channel after it writes to c.memo, does this guarantee that other goroutines that access c.memo[key] after a blocking call to <-c.done[key] are guaranteed to see the result of the computation?
The short answer is yes.
We can simplify some of the code to get to the essence of why. Consider your Mu struct:
type Mu struct {
    memo int
    done chan bool
}
We can now define two functions, compute and read:
func compute(r *Mu) {
    time.Sleep(2 * time.Second)
    r.memo = 42
    close(r.done)
}

func read(r *Mu) {
    <-r.done
    fmt.Println("Read value: ", r.memo)
}
Here, compute is a computationally heavy task (which we simulate by sleeping for some time).
Now, in the main function, we start a compute goroutine, along with some read goroutines at regular intervals:
func main() {
    r := &Mu{}
    r.done = make(chan bool)
    go compute(r)

    // this one starts immediately
    go read(r)
    time.Sleep(time.Second)

    // this one starts in the middle of the computation
    go read(r)
    time.Sleep(2 * time.Second)

    // this one starts after the computation is complete
    go read(r)

    // This is to prevent the program from terminating immediately
    time.Sleep(3 * time.Second)
}
In all three cases, we print out the result of the compute task.
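Each read blocks on <-r.done until compute closes the channel, so all three goroutines should print the fully computed value:

Read value:  42
Read value:  42
Read value:  42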
Working code here
When you "close" a channel in go, all statements which wait for the result of the channel (including statements that are executed after it's closed) will block. So provided that the only place that the channel is being closed from is the place where the memo value is computed, you will have that guarantee.
The only thing to be careful about is making sure that this channel isn't closed anywhere else in your code, since closing an already-closed channel panics.
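If a close could ever be reached from more than one code path, one defensive pattern (illustrative, not part of the code above) is to funnel the close through a sync.Once, which makes it safe to signal completion from several goroutines:

type signal struct {
    once sync.Once
    ch   chan bool
}

// markDone closes ch exactly once, no matter how many goroutines call it.
func (s *signal) markDone() {
    s.once.Do(func() { close(s.ch) })
}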
Here is a simple concurrent map that I wrote for learning purposes:
package concurrent_hashmap

import (
    "hash/fnv"
    "sync"
)

type ConcurrentMap struct {
    buckets     []ThreadSafeMap
    bucketCount uint32
}

type ThreadSafeMap struct {
    mapLock sync.RWMutex
    hashMap map[string]interface{}
}

func NewConcurrentMap(bucketSize uint32) *ConcurrentMap {
    var threadSafeMapInstance ThreadSafeMap
    var bucketOfThreadSafeMap []ThreadSafeMap
    for i := 0; i <= int(bucketSize); i++ {
        threadSafeMapInstance = ThreadSafeMap{sync.RWMutex{}, make(map[string]interface{})}
        bucketOfThreadSafeMap = append(bucketOfThreadSafeMap, threadSafeMapInstance)
    }
    return &ConcurrentMap{bucketOfThreadSafeMap, bucketSize}
}

func (cMap *ConcurrentMap) Put(key string, val interface{}) {
    bucketIndex := hash(key) % cMap.bucketCount
    bucket := cMap.buckets[bucketIndex]
    bucket.mapLock.Lock()
    bucket.hashMap[key] = val
    bucket.mapLock.Unlock()
}

// Helper
func hash(s string) uint32 {
    h := fnv.New32a()
    h.Write([]byte(s))
    return h.Sum32()
}
I am trying to write a simple benchmark, and I find that synchronized access works correctly but concurrent access gets
fatal error: concurrent map writes
Here is my benchmark, run with go test -bench=. -race:
package concurrent_hashmap

import (
    "math/rand"
    "runtime"
    "strconv"
    "sync"
    "testing"
)

// Concurrent does not work
func BenchmarkMyFunc(b *testing.B) {
    var wg sync.WaitGroup
    runtime.GOMAXPROCS(runtime.NumCPU())
    my_map := NewConcurrentMap(uint32(4))
    for n := 0; n < b.N; n++ {
        go insert(my_map, wg)
    }
    wg.Wait()
}

func insert(my_map *ConcurrentMap, wg sync.WaitGroup) {
    wg.Add(1)
    var rand_int int
    for element_num := 0; element_num < 1000; element_num++ {
        rand_int = rand.Intn(100)
        my_map.Put(strconv.Itoa(rand_int), rand_int)
    }
    defer wg.Done()
}

// This works
func BenchmarkMyFuncSynchronize(b *testing.B) {
    my_map := NewConcurrentMap(uint32(4))
    for n := 0; n < b.N; n++ {
        my_map.Put(strconv.Itoa(123), 123)
    }
}
The WARNING: DATA RACE output says that bucket.hashMap[key] = val is causing the problem, but I am confused about why that is possible, since I lock that logic whenever a write happens.
I think I am missing something basic; can someone point out my mistake?
Thanks
Edit 1:
Not sure if this helps, but here is what my mutex looks like if I don't lock anything:
{{0 0} 0 0 0 0}
Here is what it looks like if I lock the write:
{{1 0} 0 0 -1073741824 0}
I'm not sure why readerCount is such a large negative number.
Edit 2:
I think I found where the issue is, but I'm not sure why I have to code it that way.
The issue is:
type ThreadSafeMap struct {
    mapLock sync.RWMutex // this is causing the problem
    hashMap map[string]interface{}
}
it should be
type ThreadSafeMap struct {
    mapLock *sync.RWMutex
    hashMap map[string]interface{}
}
Another weird thing is that in Put, if I put print statements inside the lock:
bucket.mapLock.Lock()
fmt.Println("start")
fmt.Println(bucket)
fmt.Println(bucketIndex)
fmt.Println(bucket.mapLock)
fmt.Println(&bucket.mapLock)
bucket.hashMap[key] = val
defer bucket.mapLock.Unlock()
then the following output is possible:
start
start
{0x4212861c0 map[123:123]}
{0x4212241c0 map[123:123]}
That's weird, because each start printout should be followed by four lines of bucket info; you should never see start twice back to back, since that would mean multiple threads are executing the code inside the lock.
Also, for some reason, each bucket.mapLock has a different address, even if I make bucketIndex static, which indicates that I am not even accessing the same lock.
Despite the above weirdness, changing the mutex to a pointer solves my problem.
I would love to find out why I need a pointer for the mutex, why the prints seem to show multiple threads inside the locked section, and why each lock has a different address.
The problem is with the statement
bucket := cMap.buckets[bucketIndex]
bucket now contains a copy of the ThreadSafeMap at that index. Since the sync.RWMutex is stored by value, a copy of it is made during the assignment. Maps, on the other hand, are references to an underlying data structure, so the copy shares the same underlying map as the original. The code therefore locks a copy of the lock while writing to the shared map, which causes the race.
That's why you don't face the problem when you change sync.RWMutex to *sync.RWMutex. It's better to store pointers to the structs in the slice, as shown below.
package concurrent_hashmap

import (
    "hash/fnv"
    "sync"
)

type ConcurrentMap struct {
    buckets     []*ThreadSafeMap
    bucketCount uint32
}

type ThreadSafeMap struct {
    mapLock sync.RWMutex
    hashMap map[string]interface{}
}

func NewConcurrentMap(bucketSize uint32) *ConcurrentMap {
    var threadSafeMapInstance *ThreadSafeMap
    var bucketOfThreadSafeMap []*ThreadSafeMap
    // allocate exactly bucketSize buckets (the original <= allocated one extra, unused bucket)
    for i := 0; i < int(bucketSize); i++ {
        threadSafeMapInstance = &ThreadSafeMap{sync.RWMutex{}, make(map[string]interface{})}
        bucketOfThreadSafeMap = append(bucketOfThreadSafeMap, threadSafeMapInstance)
    }
    return &ConcurrentMap{bucketOfThreadSafeMap, bucketSize}
}

func (cMap *ConcurrentMap) Put(key string, val interface{}) {
    bucketIndex := hash(key) % cMap.bucketCount
    bucket := cMap.buckets[bucketIndex] // copies only the pointer now
    bucket.mapLock.Lock()
    bucket.hashMap[key] = val
    bucket.mapLock.Unlock()
}

// Helper
func hash(s string) uint32 {
    h := fnv.New32a()
    h.Write([]byte(s))
    return h.Sum32()
}
It's possible to validate the scenario by modifying the function Put as follows:
func (cMap *ConcurrentMap) Put(key string, val interface{}) {
    //fmt.Println("index", key)
    bucketIndex := 1
    bucket := cMap.buckets[bucketIndex]
    fmt.Printf("%p %p\n", &(bucket.mapLock), bucket.hashMap)
}
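To see the copy semantics in isolation, here is a minimal standalone sketch (the box type is hypothetical). Note that go vet's copylocks check flags assignments like this, which copy a value containing a lock:

package main

import (
    "fmt"
    "sync"
)

type box struct {
    mu sync.Mutex
    n  int
}

func main() {
    arr := []box{{}}
    b := arr[0] // copies the struct, including its Mutex; `go vet` warns here
    b.mu.Lock() // locks the copy's mutex, not arr[0].mu
    b.n++       // increments the copy's field only
    b.mu.Unlock()
    fmt.Println(arr[0].n, b.n) // prints "0 1": the original is untouched
}

As an aside, the -1073741824 readerCount in the question's lock dump is expected: RWMutex.Lock subtracts a large constant (1<<30) from readerCount to hold off new readers while a writer is active, so a large negative value just means the write lock is held.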
I don't understand why the deadlock is occurring in this code. I've tried several different things to get the deadlock to stop (several different versions using WaitGroup). This is my first day in Go, and I am pretty disappointed so far with the complexity of fairly simple and straightforward operations. I feel like I'm missing something big and obvious, but all of the docs I have found on this are seemingly very different from what, to me, is a very basic mode of operation. All of the docs either use primitive types for channels (int, string), not more complex types, with very basic for loops, or they are on the other end of the spectrum, where the functions are fairly complicated orchestrations.
I guess I'm really looking for a middle of the road sample of "this is how it's usually done" with goroutines.
package main

import "fmt"

//import "sync"
import "time"

type Item struct {
    name string
}

type Truck struct {
    Cargo []Item
    name  string
}

func UnloadTrucks(c chan Truck) {
    for t := range c {
        fmt.Printf("%s has %d items in cargo: %s\n", t.name, len(t.Cargo), t.Cargo[0].name)
    }
}

func main() {
    trucks := make([]Truck, 2)
    ch := make(chan Truck)
    for i := range trucks {
        trucks[i].name = fmt.Sprintf("Truck %d", i+1)
        fmt.Printf("Building %s\n", trucks[i].name)
    }
    for t := range trucks {
        go func(tr Truck) {
            itm := Item{}
            itm.name = "Groceries"
            fmt.Printf("Loading %s\n", tr.name)
            tr.Cargo = append(tr.Cargo, itm)
            ch <- tr
        }(trucks[t])
    }
    time.Sleep(50 * time.Millisecond)
    fmt.Println("Unloading Trucks")
    UnloadTrucks(ch)
    fmt.Println("Done")
}
You never close the "truck" channel ch, so UnloadTrucks never returns.
You can close the channel after all workers are done, by using a WaitGroup:
go func() {
    wg.Wait()
    close(ch)
}()

UnloadTrucks(ch)
http://play.golang.org/p/1V7UbYpsQr
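Putting it together, the loading loop can be wired up like this (a sketch built from the code above; it needs the sync import that is commented out in the original, and wg.Add must run before each goroutine starts):

var wg sync.WaitGroup
for t := range trucks {
    wg.Add(1)
    go func(tr Truck) {
        defer wg.Done()
        itm := Item{name: "Groceries"}
        fmt.Printf("Loading %s\n", tr.name)
        tr.Cargo = append(tr.Cargo, itm)
        ch <- tr
    }(trucks[t])
}

go func() {
    wg.Wait() // wait for every sender to finish...
    close(ch) // ...then close, so the range in UnloadTrucks terminates
}()

UnloadTrucks(ch)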