package main
import (
"fmt"
"sync"
"time"
)
var m *sync.RWMutex
func main() {
m = new(sync.RWMutex)
n := 100
go func() {
for i := 0; i < n; i++ {
write("WA", i)
}
}()
go func() {
for i := 0; i < n; i++ {
write("WB", i)
}
}()
select {}
}
func write(tag string, i int) {
m.Lock()
fmt.Printf("[%s][%s%d]write start \n", tag, tag, i)
time.Sleep(100 * time.Millisecond)
fmt.Printf("[%s][%s%d]write end \n", tag, tag, i)
m.Unlock()
// time.Sleep(1 * time.Millisecond)
}
Result in console:
go run mutex.go
[WB][WB0]write start
[WB][WB0]write end
[WB][WB1]write start
[WB][WB1]write end
[WB][WB2]write start
[WB][WB2]write end
[WB][WB3]write start
[WB][WB3]write end
[WB][WB4]write start
[WB][WB4]write end
[WB][WB5]write start
[WB][WB5]write end
[WB][WB6]write start
[WB][WB6]write end
[WB][WB7]write start
[WB][WB7]write end
[WB][WB8]write start
[WB][WB8]write end
[WB][WB9]write start
[WB][WB9]write end
...
> go version
go version go1.5.2 windows/amd64
The question is:
Why is there no chance for the "[WA]" goroutine to run?
Why does the mutex code stop the other goroutine entirely?
I know there must be a story or a theory behind it.
Please give me a URL to read and study.
Go uses cooperative multitasking; it doesn't use preemptive multitasking (see Computer multitasking). You need to give the scheduler an opportunity to run between locks, for example by a call to runtime.Gosched():
package main
import (
"fmt"
"runtime"
"sync"
"time"
)
var m *sync.RWMutex
func main() {
m = new(sync.RWMutex)
n := 100
go func() {
for i := 0; i < n; i++ {
write("WA", i)
}
}()
go func() {
for i := 0; i < n; i++ {
write("WB", i)
}
}()
select {}
}
func write(tag string, i int) {
m.Lock()
fmt.Printf("[%s][%s%d]write start \n", tag, tag, i)
time.Sleep(100 * time.Millisecond)
fmt.Printf("[%s][%s%d]write end \n", tag, tag, i)
m.Unlock()
runtime.Gosched()
}
Output:
[WB][WB0]write start
[WB][WB0]write end
[WA][WA0]write start
[WA][WA0]write end
[WB][WB1]write start
[WB][WB1]write end
[WA][WA1]write start
[WA][WA1]write end
[WB][WB2]write start
[WB][WB2]write end
[WA][WA2]write start
[WA][WA2]write end
[WB][WB3]write start
[WB][WB3]write end
[WA][WA3]write start
[WA][WA3]write end
This situation is called a livelock.
When you call m.Unlock(), even though two goroutines (A and B) are waiting for the lock to be released, the scheduler is free to wake up either of them to proceed.
It looks like the current implementation of the scheduler in Go doesn't switch to goroutine A fast enough for it to acquire the mutex; before that happens, goroutine B re-acquires the mutex.
As you probably found out, if you move the time.Sleep call after the m.Unlock call, both the A and B goroutines will run concurrently.
Hopefully this makes sense.
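As a sketch of that fix (keeping the tags and loop shape from the question), here the sleep happens after m.Unlock(), so the other goroutine gets a window in which the mutex is free:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

var m sync.Mutex

// write holds the lock only for the prints, then sleeps *outside*
// the critical section, leaving the mutex free for the other goroutine.
func write(tag string, i int) {
	m.Lock()
	fmt.Printf("[%s][%s%d]write start\n", tag, tag, i)
	fmt.Printf("[%s][%s%d]write end\n", tag, tag, i)
	m.Unlock()
	time.Sleep(10 * time.Millisecond) // sleep after Unlock, not while holding it
}

func main() {
	var wg sync.WaitGroup
	for _, tag := range []string{"WA", "WB"} {
		wg.Add(1)
		go func(tag string) {
			defer wg.Done()
			for i := 0; i < 3; i++ {
				write(tag, i)
			}
		}(tag)
	}
	wg.Wait() // with the sleep outside the lock, WA and WB interleave
}
```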
#peterSO's answer is correct. Just to elaborate a bit on scheduling: that for loop is a sequential tight loop, which means the compiled instructions from the loop occupy a whole thread until it finishes. Other goroutines get no chance to run unless the loop yields some scheduling cycles in the middle, via runtime.Gosched() or a sleep. That's why they never get a chance to acquire the sync.Mutex (by the way, it should be sync.Mutex at both declaration and instantiation):
go func() {
for i := 0; i < n; i++ {
runtime.Gosched()
write("WA", i)
}
}()
go func() {
for i := 0; i < n; i++ {
runtime.Gosched()
write("WB", i)
}
}()
And the Go scheduler is not preemptive at the instruction level (unlike Erlang's). That's why it's better to orchestrate the execution path using channels.
Note: I've learnt this the hard way (I'm not a low-level Go compiler expert). Channels provide that orchestration of goroutines (and those extra scheduling cycles) in a cleaner manner. In other words, sync.Mutex should be used just for supervising access to shared state, not for orchestration.
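To sketch that channel-based orchestration (the token-passing scheme here is illustrative, not from the original post), two goroutines can hand a turn back and forth over channels, so WA and WB strictly alternate without any mutex:

```go
package main

import "fmt"

func main() {
	aTurn := make(chan struct{}, 1) // buffered so the final hand-off never blocks
	bTurn := make(chan struct{})
	done := make(chan struct{})

	// work waits for its turn on `my`, does one step, then passes the turn on `next`.
	work := func(tag string, my <-chan struct{}, next chan<- struct{}, n int) {
		for i := 0; i < n; i++ {
			<-my
			fmt.Printf("[%s][%s%d]write\n", tag, tag, i)
			next <- struct{}{}
		}
	}

	go work("WA", aTurn, bTurn, 3)
	go func() {
		work("WB", bTurn, aTurn, 3)
		close(done)
	}()

	aTurn <- struct{}{} // WA goes first
	<-done              // output strictly alternates: WA0 WB0 WA1 WB1 WA2 WB2
}
```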
Related
This is my Snippet Code to run the whole worker
for w := 1; w <= *totalworker; w++ {
wg.Add(1)
go worker(w, jobs, results, dir, &wg)
}
This was my Worker
func worker(id int, jobs <-chan [][]string, results chan<- [][]string, dir *string, wg *sync.WaitGroup) { // signature inferred from the call above
defer wg.Done()
for j := range jobs {
filename := j[0][4] + ".csv"
fmt.Printf("\nWorker %d starting a job\n", id)
//results <- j //to show The result of jobs, unnecessary
fmt.Printf("\nWorker %d Creating %s.csv\n", id, j[0][4])
CreateFile(dir, &filename, j)
fmt.Printf("\nWorker %d Finished creating %s.csv on %s\n", id, j[0][4], *dir)
fmt.Printf("\nWorker %d finished a job\n", id)
}
}
When I run it without the WaitGroup, it only creates a few of the files I need, but it shows the progress as it goes: worker 1 does a job, worker 2 does a job, and so on, right up to the end of the program.
With the WaitGroup, on the other hand, it creates all the files I need, but it does everything without showing the progress: the run just ends, and only then does it print all the "worker N did a job" messages at once.
Is there anything I can do with this WaitGroup so that it shows each print as it happens?
You need to create channels that listen for the previous step to complete, like this. In my example I have 20 goroutines; they process some logic at the same time and report back in the original order:
package channel
import (
"fmt"
"sync"
)
func Tests() {
c := make(map[int]chan bool)
var wg sync.WaitGroup
// total go routine
loop := 20
// stop in step
stop := 11
for i := 1; i <= loop; i++ {
// init channel
c[i] = make(chan bool)
}
for i := 1; i <= loop; i++ {
wg.Add(1)
go func(c map[int]chan bool, i int) {
defer wg.Done()
// logic step
fmt.Println("Process Logic step ", i)
if i == 1 {
fmt.Println("Sending message first ", i)
c[i] <- true // send now notify to next step
} else {
select {
case channel := <-c[i-1]:
defer close(c[i-1])
if channel == true {
// send now
fmt.Println("Sending message ", i)
// not sent
if i < loop { // fix deadlock when channel doesnt write
if i == stop && stop > 0 {
c[i] <- false // stop in step i
} else {
c[i] <- true // send now notify to next step
}
}
} else {
// not send
if i < loop { // fix deadlock when channel doesnt write
c[i] <- false
}
}
}
}
}(c, i)
}
wg.Wait()
fmt.Println("go here ")
//time.Sleep(3 * time.Second)
fmt.Println("End")
}
Like most of you, I'm familiar with the fact that Go reuses the iterator variable in a for loop, so closures in each goroutine capture the same variable. A typical example of this is the following code, which will always print the final value of the loop from each goroutine:
func main() {
for i := 0; i < 5; i++ {
go func() {
fmt.Println(i) // prints 5 each time
}()
}
time.Sleep(100 * time.Millisecond)
}
What I haven't been able to find an explanation of is why the goroutines do not execute until after the loop has completed. Even the following code prints the value of ii as 10, which is assigned after the goroutine is started:
func main() {
for i := 0; i < 5; i++ {
ii := i
go func() {
fmt.Println(ii) // prints 10 each time...!
}()
ii = 10
}
time.Sleep(100 * time.Millisecond)
}
The closest thing to an explanation I've read is that for loops typically execute faster than goroutines start. This raises two questions for me: 1. Is that true? 2. Why?
Never assume the order of execution when dealing with more than one goroutine - no matter how tempting it may be. Don't put in sleeps; don't call runtime.Gosched; don't assume anything.
The only guaranteed way to ensure order of execution is via synchronization methods such as channels, sync.Mutex, sync.WaitGroups etc.
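For example, a minimal sketch with sync.WaitGroup: main blocks until every worker has finished, with no sleep and no assumptions about scheduling order (the loop variable is passed as an argument to avoid the capture issue):

```go
package main

import (
	"fmt"
	"sync"
)

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 5; i++ {
		wg.Add(1)
		go func(i int) { // i is passed by value: each goroutine gets its own copy
			defer wg.Done()
			fmt.Println(i) // prints 0..4, in some order
		}(i)
	}
	wg.Wait() // blocks until every goroutine has called Done
}
```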
Anything is possible because the program has a data race.
Setting aside the data race, there's no evidence that the goroutines execute after the for loop is completed. The value 10 is assigned to ii in the statement after the goroutine is started, not after the for loop.
The memory model allows for the compiler to optimize the two ii assignments to a single assignment at the start of the for loop body. It could be that the goroutines execute immediately, but the goroutines see the value of 10 from the optimization.
You should pass the loop variable to the goroutine as an argument, like this:
package main
import "fmt"
import "time"
func main() {
for i := 0; i < 5; i++ {
ii := i
go func(k int) {
fmt.Println(k) // prints 0 through 4, in some order
}(ii)
ii = 10
}
time.Sleep(100 * time.Millisecond)
}
I'm learning Go. I have a goroutine that prints a variable i, and after starting it I wrote a busy loop. But when i gets up to 491519 (or some other value), there is no more output on the terminal. It looks like the goroutine that prints i is no longer scheduled; the CPU executes the busy loop exclusively after printing 491519. Can anyone tell me the reason?
Thanks.
My code:
package main
import (
"fmt"
"runtime"
)
func main() {
go func() {
i := 1
for {
fmt.Println(i)
i = i + 1
}
}()
runtime.GOMAXPROCS(4)
for {
}
}
I'd like to add that:
When I add fmt.Println("ABC") inside the final busy loop, ABC and i alternate on the terminal forever.
my go version: go version go1.9.1 darwin/amd64
Goroutines are scheduled by the Go runtime, so there are some limitations compared to process scheduling done by an operating system. An operating system can preempt processes using timer interrupts; the Go runtime cannot preempt goroutines.
The Go runtime schedules another goroutine when the current goroutine:
performs a channel operation (this includes a select statement, even an empty one)
makes a system call
explicitly calls runtime.Gosched
Setting GOMAXPROCS does not help much. Take a look at the following program; it will use all processors and stay stuck in tight busy loops.
package main
import (
"runtime"
"time"
)
func runForever() {
for {
}
}
func main() {
for i := 0; i < runtime.GOMAXPROCS(-1); i++ {
go runForever()
}
time.Sleep(time.Second)
}
There are a few ways of fixing your program:
go func() {
for i:= 1; true; i++ {
fmt.Println(i)
}
}()
for {
runtime.Gosched()
}
Or
go func() {
for i:= 1; true; i++ {
fmt.Println(i)
}
}()
select {
}
The work on improving the case of tight loops is ongoing.
That busy loop will use a ton of CPU and possibly cause scheduler issues. If you want to block a goroutine, it's much more efficient to read from a channel that's never written to:
ch := make(chan struct{})
<-ch
Or better still, set up a channel to wait for a signal to close the application:
stop := make(chan os.Signal, 1)
signal.Notify(stop, os.Interrupt)
<-stop
Also there should be no need to set GOMAXPROCS.
How do I deal with a situation where an undetected deadlock occurs while reading the results of an unknown number of tasks from a channel in a complex program, e.g. a web server?
package main
import (
"fmt"
"math/rand"
"time"
)
func main() {
rand.Seed(time.Now().UTC().UnixNano())
results := make(chan int, 100)
// we can't know how many tasks there will be
for i := 0; i < rand.Intn(1<<8)+1<<8; i++ {
go func(i int) {
time.Sleep(time.Second)
results <- i
}(i)
}
// can't close channel here
// because it is still written in
//close(results)
// something else is going on other threads (think web server)
// therefore a deadlock won't be detected
go func() {
for {
time.Sleep(time.Second)
}
}()
for j := range results {
fmt.Println(j)
// we just stuck in here
}
}
In simpler programs Go detects the deadlock and fails properly. Most examples either fetch a known number of results or write to the channel sequentially.
The trick is to use sync.WaitGroup and wait for the tasks to finish in a non-blocking way.
var wg sync.WaitGroup
// we can't know how many tasks there will be
for i := 0; i < rand.Intn(1<<8)+1<<8; i++ {
wg.Add(1)
go func(i int) {
time.Sleep(time.Second)
results <- i
wg.Done()
}(i)
}
// wait for all tasks to finish in other thread
go func() {
wg.Wait()
close(results)
}()
// execution continues here so you can print results
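Putting the pieces together, a complete runnable sketch of the pattern (the sleep and task count are arbitrary stand-ins for real work):

```go
package main

import (
	"fmt"
	"math/rand"
	"sync"
	"time"
)

func main() {
	results := make(chan int, 100)
	var wg sync.WaitGroup

	// we still can't know how many tasks there will be
	n := rand.Intn(1<<8) + 1<<8
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			time.Sleep(10 * time.Millisecond)
			results <- i
		}(i)
	}

	// close the channel in another goroutine once all tasks are done,
	// so the range below terminates instead of blocking forever
	go func() {
		wg.Wait()
		close(results)
	}()

	count := 0
	for j := range results {
		_ = j // handle each result here
		count++
	}
	fmt.Println("received", count, "results") // count == n, no deadlock
}
```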
See also: Go Concurrency Patterns: Pipelines and cancellation - The Go Blog
I see lots of tutorials and examples on how to make Go wait for x number of goroutines to finish, but what I'm trying to do is ensure there are always x number running, so a new goroutine is launched as soon as one ends.
Specifically, I have a few hundred thousand 'things to do': processing some stuff coming out of MySQL. So it works like this:
db, err := sql.Open("mysql", connection_string)
checkErr(err)
defer db.Close()
rows,err := db.Query(`SELECT id FROM table`)
checkErr(err)
defer rows.Close()
var id uint
for rows.Next() {
err := rows.Scan(&id)
checkErr(err)
go processTheThing(id)
}
checkErr(err)
rows.Close()
Currently that will launch several hundred thousand processTheThing() goroutines. What I need is for a maximum of x of them (we'll call it 20) to run at once. So it starts by launching 20 for the first 20 rows, and from then on it launches a new goroutine for the next id the moment one of the current goroutines finishes. So at any point in time there are always 20 running.
I'm sure this is quite simple/standard, but I can't seem to find a good explanation in any of the tutorials or examples of how this is done.
You may find Go Concurrency Patterns article interesting, especially Bounded parallelism section, it explains the exact pattern you need.
You can use channel of empty structs as a limiting guard to control number of concurrent worker goroutines:
package main
import "fmt"
func main() {
maxGoroutines := 10
guard := make(chan struct{}, maxGoroutines)
for i := 0; i < 30; i++ {
guard <- struct{}{} // would block if guard channel is already filled
go func(n int) {
worker(n)
<-guard
}(i)
}
}
func worker(i int) { fmt.Println("doing work on", i) }
I think something simple like this will work:
package main
import "fmt"
const MAX = 20
func main() {
sem := make(chan int, MAX)
for {
sem <- 1 // will block if there is MAX ints in sem
go func() {
fmt.Println("hello again, world")
<-sem // removes an int from sem, allowing another to proceed
}()
}
}
Thanks to everyone for helping me out with this. However, I don't feel that anyone really provided something that both worked and was simple/understandable, although you did all help me understand the technique.
What I have done in the end is I think much more understandable and practical as an answer to my specific question, so I will post it here in case anyone else has the same question.
Somehow this ended up looking a lot like what OneOfOne posted, which is great, because now I understand that approach. But I found OneOfOne's code very difficult to understand at first, because passing functions to functions made it confusing to work out which bit was for what. I think this way makes a lot more sense:
package main
import (
"fmt"
"sync"
)
const xthreads = 5 // Total number of threads to use, excluding the main() thread
func doSomething(a int) {
fmt.Println("My job is",a)
return
}
func main() {
var ch = make(chan int, 50) // This number 50 can be anything as long as it's larger than xthreads
var wg sync.WaitGroup
// This starts xthreads number of goroutines that wait for something to do
wg.Add(xthreads)
for i:=0; i<xthreads; i++ {
go func() {
for {
a, ok := <-ch
if !ok { // if there is nothing to do and the channel has been closed then end the goroutine
wg.Done()
return
}
doSomething(a) // do the thing
}
}()
}
// Now the jobs can be added to the channel, which is used as a queue
for i:=0; i<50; i++ {
ch <- i // add i to the queue
}
close(ch) // This tells the goroutines there's nothing else to do
wg.Wait() // Wait for the threads to finish
}
Create channel for passing data to goroutines.
Start 20 goroutines that processes the data from channel in a loop.
Send the data to the channel instead of starting a new goroutine.
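Those three steps can be sketched like this (the channel's element type and the work done per id are placeholders):

```go
package main

import (
	"fmt"
	"sync"
)

// processTheThing stands in for the real per-row work.
func processTheThing(id uint) { fmt.Println("processing", id) }

func main() {
	ids := make(chan uint)
	var wg sync.WaitGroup

	// step 2: start 20 goroutines that process data from the channel in a loop
	for w := 0; w < 20; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for id := range ids {
				processTheThing(id)
			}
		}()
	}

	// step 3: send the data to the channel instead of starting a goroutine per row
	for i := uint(0); i < 100; i++ {
		ids <- i
	}
	close(ids) // no more work; the range loops above exit
	wg.Wait()
}
```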
Grzegorz Żur's answer is the most efficient way to do it, but for a newcomer it could be hard to implement without reading code, so here's a very simple implementation:
package main
import (
"fmt"
"runtime"
"sync"
)
type idProcessor func(id uint)
func SpawnStuff(limit uint, proc idProcessor) chan<- uint {
ch := make(chan uint)
for i := uint(0); i < limit; i++ {
go func() {
for {
id, ok := <-ch
if !ok {
return
}
proc(id)
}
}()
}
return ch
}
func main() {
runtime.GOMAXPROCS(4)
var wg sync.WaitGroup //this is just for the demo, otherwise main will return
fn := func(id uint) {
fmt.Println(id)
wg.Done()
}
wg.Add(1000)
ch := SpawnStuff(10, fn)
for i := uint(0); i < 1000; i++ {
ch <- i
}
close(ch) //should do this to make all the goroutines exit gracefully
wg.Wait()
}
playground
This is a simple producer-consumer problem, which in Go can easily be solved using channels to buffer the tasks.
To put it simply: create a channel that accepts your IDs. Run a number of goroutines that read from the channel in a loop and process each ID. Then run your loop to feed IDs into the channel.
Example:
func producer() {
var buffer = make(chan uint)
for i := 0; i < 20; i++ {
go consumer(buffer)
}
for _, id := range IDs {
buffer <- id
}
}
func consumer(buffer chan uint) {
for {
id := <- buffer
// Do your things here
}
}
Things to know:
Unbuffered channels are blocking: if the item written into the channel isn't accepted, the routine feeding it will block until it is
My example lacks a closing mechanism: you must find a way to make the producer wait for all consumers to end their loops before returning. The simplest way to do this is with another channel. I'll let you think about it.
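One way to fill in such a closing mechanism, sketched here with a sync.WaitGroup rather than a bare channel (either works): close the channel after feeding it, and wait for the consumers to drain it:

```go
package main

import (
	"fmt"
	"sync"
)

func main() {
	buffer := make(chan uint)
	var wg sync.WaitGroup

	// consumers: range exits once the channel is closed and drained
	for i := 0; i < 20; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for id := range buffer {
				fmt.Println("consumed", id)
			}
		}()
	}

	// producer
	for id := uint(0); id < 50; id++ {
		buffer <- id
	}
	close(buffer) // signals the consumers there is nothing left
	wg.Wait()     // wait for all consumers to finish their loops
}
```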
I've written a simple package to handle concurrency in Go. This package helps you limit the number of goroutines that are allowed to run concurrently:
https://github.com/zenthangplus/goccm
Example:
package main
import (
"fmt"
"github.com/zenthangplus/goccm"
"time"
)
func main() {
// Limit 3 goroutines to run concurrently.
c := goccm.New(3)
for i := 1; i <= 10; i++ {
// This function has to be called before starting any goroutine
c.Wait()
go func(i int) {
fmt.Printf("Job %d is running\n", i)
time.Sleep(2 * time.Second)
// This function has to be called when a goroutine has finished.
// Or you can use `defer c.Done()` at the top of the goroutine.
c.Done()
}(i)
}
// This function has to be called to ensure all goroutines have finished
// before the main program closes.
c.WaitAllDone()
}
You can also take a look at https://github.com/LiangfengChen/goutil/blob/main/concurrent.go
The test case below shows how the example is used.
func TestParallelCall(t *testing.T) {
format := "test:%d"
data := make(map[int]bool)
mutex := sync.Mutex{}
val, err := ParallelCall(1000, 10, func(pos int) (interface{}, error) {
mutex.Lock()
defer mutex.Unlock()
data[pos] = true
return pos, errors.New(fmt.Sprintf(format, pos))
})
for i := 0; i < 1000; i++ {
if _, ok := data[i]; !ok {
t.Errorf("TestParallelCall pos not found: %d", i)
}
if val[i] != i {
t.Errorf("TestParallelCall return value is not right (%d,%v)", i, val[i])
}
if err[i].Error() != fmt.Sprintf(format, i) {
t.Errorf("TestParallelCall error msg is not correct (%d,%v)", i, err[i])
}
}
}