Dealing with slices concurrently is not working as expected without mutexes - go

Functions WithMutex and WithoutMutex are giving different results.
WithoutMutex implementation is losing values even though I have Waitgroup set up.
What could be wrong?
Do not run on Playground
P.S. I am on Windows 10 and Go 1.8.1
package main
import (
"fmt"
"sync"
)
var p = fmt.Println
type MuType struct {
list []int
*sync.RWMutex
}
var muData *MuType
var data *NonMuType
type NonMuType struct {
list []int
}
func (data *MuType) add(i int, wg *sync.WaitGroup) {
data.Lock()
defer data.Unlock()
data.list = append(data.list, i)
wg.Done()
}
func (data *MuType) read() []int {
data.RLock()
defer data.RUnlock()
return data.list
}
func (nonmu *NonMuType) add(i int, wg *sync.WaitGroup) {
nonmu.list = append(nonmu.list, i)
wg.Done()
}
func (nonmu *NonMuType) read() []int {
return nonmu.list
}
func WithoutMutex() {
nonmu := &NonMuType{}
nonmu.list = make([]int, 0)
var wg = sync.WaitGroup{}
for i := 0; i < 10; i++ {
wg.Add(1)
go nonmu.add(i, &wg)
}
wg.Wait()
data = nonmu
p(data.read())
}
func WithMutex() {
mtx := &sync.RWMutex{}
withMu := &MuType{list: make([]int, 0)}
withMu.RWMutex = mtx
var wg = sync.WaitGroup{}
for i := 0; i < 10; i++ {
wg.Add(1)
go withMu.add(i, &wg)
}
wg.Wait()
muData = withMu
p(muData.read())
}
func stressTestWOMU(max int) {
p("Without Mutex")
for ii := 0; ii < max; ii++ {
WithoutMutex()
}
}
func stressTest(max int) {
p("With Mutex")
for ii := 0; ii < max; ii++ {
WithMutex()
}
}
func main() {
stressTestWOMU(20)
stressTest(20)
}

Slices are not safe for concurrent writes, so I am in no way surprised that WithoutMutex does not appear to be consistent at all, and has dropped items.
The WithMutex version consistently has 10 items, but in jumbled orders. This is also to be expected, since the mutex protects it so that only one can append at a time. There is no guarantee as to which goroutine will run in which order though, so it is a race to see which of the rapidly spawned goroutines will get to append first.
The waitgroup does not do anything to control access or enforce ordering. It merely provides a signal at the end that everything is done.

Related

How to know the number of specific goroutine?

I'm making a goroutine worker pool that constantly increases and decreases according to the situation.
But never falldown under specific count.
To do this, I want to know the number of specific goroutines. Not use global variable.
package main
import (
"fmt"
"time"
)
func Adaptive_Worker_Pool(value_input chan int) {
kill_sig := make(chan bool)
make_sig := make(chan bool)
for i := 0; i < 5; i++ {
go Do(kill_sig, value_input)
}
go Make_Routine(make_sig, kill_sig, value_input)
go Judge(kill_sig, make_sig, value_input)
}
func Make_Routine(make_sig chan bool, kill_sig chan bool, value_input chan int) {
for {
<-make_sig
go Do(kill_sig, value_input)
}
}
func Do(kill_sig chan bool, value_input chan int) {
outer:
for {
select {
case value := <-value_input:
fmt.Println(value)
case <-kill_sig:
break outer
}
}
}
func Judge(make_sig chan bool, kill_sig chan bool, value_input chan int) {
for {
time.Sleep(time.Millisecond * 500)
count_value_in_channel := len(value_input)
if count_value_in_channel > 5 {
make_sig <- true
} else {
if { // if Count(Do( )) > 5 { continue }
continue // else {kill_sig <- true}
} else { // like this
kill_sig <- true
}
}
}
}
func main() {
value_input := make(chan int, 10)
Adaptive_Worker_Pool(value_input)
a := 0
for {
value_input <- a
a++
}
}
Some value input to a value_input channel and five goroutines that receive and output the value are created by default.
However, if the number of variables in the value_input channel is 5 or more, Do( ) goroutine will made.
I want to make the Judge( ) function decide whether to increment or decrement the Do( ) goroutine, If number of Do( ) goroutine.
How can I do it?
I think you should define a struct like:
type MyChannel struct {
value int
index int
}
and in for loop
value_input := make(chan MyChannel, 10)
a := 0
for {
value_input <- MyChannel{
index: a,
value: a,
}
a++
}
and in some check, you can check the index of MyChannel

How to implement efficient locks on Map of Map

I'm trying to modify values of a map of maps.
Following code shows fatal error: concurrent map read and map write unless I lock whole Map (using defer in method Map.Add in line 17).
My question is, how can I implement it correctly to effectively make use of goroutines concurrency and do not turn my asynchronous code to some synchronous code?
package main
import (
"fmt"
"sync"
"time"
)
type Map struct {
mutex sync.Mutex
item map[int]*Item
}
func (m *Map) Add(key int) {
m.mutex.Lock()
m.item[key] = &Item{value: make(map[int]int)}
defer m.mutex.Unlock()
for j := 0; j < 100; j++ {
go m.item[key].Inc(j)
}
}
type Item struct {
mutex sync.Mutex
value map[int]int
}
func (m *Item) Inc(key int) {
m.mutex.Lock()
m.value[key]++
m.mutex.Unlock()
}
func (m *Item) Value(key int) int {
m.mutex.Lock()
defer m.mutex.Unlock()
return m.value[key]
}
func main() {
m := Map{item: make(map[int]*Item)}
for i := 0; i < 100; i++ {
go m.Add(i)
}
time.Sleep(time.Second)
fmt.Println(m.item[1])
}

Mutex write locking of a channel value

I have a channel of thousands of IDs that need to be processed in parallel inside goroutines. How could I implement a lock so that goroutines cannot process the same id at the same time, should it be repeated in the channel?
package main
import (
"fmt"
"sync"
"strconv"
"time"
)
var wg sync.WaitGroup
func main() {
var data []string
for d := 0; d < 30; d++ {
data = append(data, "id1")
data = append(data, "id2")
data = append(data, "id3")
}
chanData := createChan(data)
for i := 0; i < 10; i++ {
wg.Add(1)
process(chanData, i)
}
wg.Wait()
}
func createChan(data []string) <-chan string {
var out = make(chan string)
go func() {
for _, val := range data {
out <- val
}
close(out)
}()
return out
}
func process(ids <-chan string, i int) {
go func() {
defer wg.Done()
for id := range ids {
fmt.Println(id + " (goroutine " + strconv.Itoa(i) + ")")
time.Sleep(1 * time.Second)
}
}()
}
--edit:
All values need to be processed in any order, but "id1, "id2" & "id3" need to block so they cannot be processed by more than one goroutine at the same time.
The simplest solution here is to not send the duplicate values at all, and then no synchronization is required.
func createChan(data []string) <-chan string {
seen := make(map[string]bool)
var out = make(chan string)
go func() {
for _, val := range data {
if seen[val] {
continue
}
seen[val] = true
out <- val
}
close(out)
}()
return out
}
I've found a solution. Someone has written a package (github.com/EagleChen/mapmutex) to do exactly what I needed:
package main
import (
"fmt"
"github.com/EagleChen/mapmutex"
"strconv"
"sync"
"time"
)
var wg sync.WaitGroup
var mutex *mapmutex.Mutex
func main() {
mutex = mapmutex.NewMapMutex()
var data []string
for d := 0; d < 30; d++ {
data = append(data, "id1")
data = append(data, "id2")
data = append(data, "id3")
}
chanData := createChan(data)
for i := 0; i < 10; i++ {
wg.Add(1)
process(chanData, i)
}
wg.Wait()
}
func createChan(data []string) <-chan string {
var out = make(chan string)
go func() {
for _, val := range data {
out <- val
}
close(out)
}()
return out
}
func process(ids <-chan string, i int) {
go func() {
defer wg.Done()
for id := range ids {
if mutex.TryLock(id) {
fmt.Println(id + " (goroutine " + strconv.Itoa(i) + ")")
time.Sleep(1 * time.Second)
mutex.Unlock(id)
}
}
}()
}
Your problem as stated is difficult by definition and my first choice would be to re-architect the application to avoid it, but if that's not an option:
First, I assume that if a given ID is repeated you still want it processed twice, but not in parallel (if that's not the case and the 2nd instance must be ignored, it becomes even more difficult, because you have to remember every ID you have processed forever, so you don't run the task over it twice).
To achieve your goal, you must keep track of every ID that is being acted upon in a goroutine - a go map is your best option here (note that its size will grow up to as many goroutines as you spin in parallel!). The map itself must be protected by a lock, as it is modified from multiple goroutines.
Another simplification that I'd take is that it is OK for an ID removed from the channel to be added back to it if found to be currently processed by another gorotuine. Then, we need map[string]bool as the tracking device, plus a sync.Mutex to guard it. For simplicity, I assume the map, mutex and the channel are global variables; but that may not be convenient for you - arrange access to those as you see fit (arguments to the goroutine, closure, etc.).
import "sync"
var idmap map[string]bool
var mtx sync.Mutex
var queue chan string
func process_one_id(id string) {
busy := false
mtx.Lock()
if idmap[id] {
busy = true
} else {
idmap[id] = true
}
mtx.Unlock()
if busy { // put the ID back on the queue and exit
queue <- id
return
}
// ensure the 'busy' mark is cleared at the end:
defer func() { mtx.Lock(); delete(idmap, id); mtx.Unlock() }()
// do your processing here
// ....
}

How to properly prevent a data race?

According to the docs doing the following prevents a data race:
var wg sync.WaitGroup
wg.Add(5)
for i := 0; i < 5; i++ {
go func(j int) {
fmt.Println(j) // Good. Read local copy of the loop counter.
wg.Done()
}(i)
}
wg.Wait()
however if I have code such as:
type Job struct {
Work []string
}
var Queue chan Job
func work(id string, type int32) {
job := Job{Work: []string{"A","B","C"}}
go func(j Job, ty int32) {
if ty == 1 {
// do some other task
}
Queue <- j
} (job, type) // RACE ON THIS LINE
if type == 1 {
// do something
}
return
}
work is being called via. gRPC.
the data race detector tells me that the go routine is producing a data race. I am thinking that there is something I am missing here but can't quite seem to find what that is.

How to handle the buffered channel properly in Golang?

I have a channel which stores received data, I want to process it when one of following conditions is met:
1, the channel reaches its capacity.
2, the timer is fired since last process.
I saw the post
Golang - How to know a buffered channel is full
Update:
I inspired from that post and OneOfOne's advice, here is the play :
package main
import (
"fmt"
"math/rand"
"time"
)
var c chan int
var timer *time.Timer
const (
capacity = 5
timerDration = 3
)
func main() {
c = make(chan int, capacity)
timer = time.NewTimer(time.Second * timerDration)
go checkTimer()
go sendRecords("A")
go sendRecords("B")
go sendRecords("C")
time.Sleep(time.Second * 20)
}
func sendRecords(name string) {
for i := 0; i < 20; i++ {
fmt.Println(name+" sending record....", i)
sendOneRecord(i)
interval := time.Duration(rand.Intn(500))
time.Sleep(time.Millisecond * interval)
}
}
func sendOneRecord(record int) {
select {
case c <- record:
default:
fmt.Println("channel is full !!!")
process()
c <- record
timer.Reset(time.Second * timerDration)
}
}
func checkTimer() {
for {
select {
case <-timer.C:
fmt.Println("3s timer ----------")
process()
timer.Reset(time.Second * timerDration)
}
}
}
func process() {
for i := 0; i < capacity; i++ {
fmt.Println("process......", <-c)
}
}
This seems to work fine, but I have a concern, I want to block the channel writing from other goroutine when process() is called, is the code above capable to do so? Or should I add a mutex at the beginning of the process method?
Any elegant solution?
As was mentioned by #OneOfOne, select is really the only way to check if a channel is full.
If you are using the channel to effect batch processing, you could always create an unbuffered channel and have a goroutine pull items and append to a slice.
When the slice reaches a specific size, process the items.
Here's an example on play
package main
import (
"fmt"
"sync"
"time"
)
const BATCH_SIZE = 10
func batchProcessor(ch <-chan int) {
batch := make([]int, 0, BATCH_SIZE)
for i := range ch {
batch = append(batch, i)
if len(batch) == BATCH_SIZE {
fmt.Println("Process batch:", batch)
time.Sleep(time.Second)
batch = batch[:0] // trim back to zero size
}
}
fmt.Println("Process last batch:", batch)
}
func main() {
var wg sync.WaitGroup
ch := make(chan int)
wg.Add(1)
go func() {
batchProcessor(ch)
wg.Done()
}()
fmt.Println("Submitting tasks")
for i := 0; i < 55; i++ {
ch <- i
}
close(ch)
wg.Wait()
}
No, select is the only way to do it:
func (t *T) Send(v *Val) {
select {
case t.ch <- v:
default:
// handle v directly
}
}

Resources