goroutines always execute "last in first out" - go

In the interest of learning more about Go, I have been playing with goroutines and have noticed something, but I am not sure exactly what I'm seeing, and I hope someone out there might be able to explain the following behaviour.
The following code does exactly what you'd expect:
package main

import (
    "fmt"
)

type Test struct {
    me int
}

type Tests []Test

func (test *Test) show() {
    fmt.Println(test.me)
}

func main() {
    var tests Tests
    for i := 0; i < 10; i++ {
        test := Test{
            me: i,
        }
        tests = append(tests, test)
    }
    for _, test := range tests {
        test.show()
    }
}
and prints 0 through 9, in order.
Now, when the code is changed as shown below, it always returns the last one first; it doesn't matter which numbers I use:
package main

import (
    "fmt"
    "sync"
)

type Test struct {
    me int
}

type Tests []Test

func (test *Test) show(wg *sync.WaitGroup) {
    fmt.Println(test.me)
    wg.Done()
}

func main() {
    var tests Tests
    for i := 0; i < 10; i++ {
        test := Test{
            me: i,
        }
        tests = append(tests, test)
    }

    var wg sync.WaitGroup
    wg.Add(10)

    for _, test := range tests {
        go func(t Test) {
            t.show(&wg)
        }(test)
    }

    wg.Wait()
}
This will return:
9
0
1
2
3
4
5
6
7
8
The order of iteration of the loop isn't changing, so I guess it has something to do with the goroutines...
Basically, I am trying to understand why it behaves like this. I understand that goroutines can run in a different order than the order in which they're spawned, but my question is why this always runs like this, as if there's something really obvious I'm missing...

As expected, the output is pseudo-random:
package main

import (
    "fmt"
    "runtime"
    "sync"
)

type Test struct {
    me int
}

type Tests []Test

func (test *Test) show(wg *sync.WaitGroup) {
    fmt.Println(test.me)
    wg.Done()
}

func main() {
    fmt.Println("GOMAXPROCS", runtime.GOMAXPROCS(0))

    var tests Tests
    for i := 0; i < 10; i++ {
        test := Test{
            me: i,
        }
        tests = append(tests, test)
    }

    var wg sync.WaitGroup
    wg.Add(10)

    for _, test := range tests {
        go func(t Test) {
            t.show(&wg)
        }(test)
    }

    wg.Wait()
}
Output:
$ go version
go version devel +af15bee Fri Jan 29 18:29:10 2016 +0000 linux/amd64
$ go run goroutine.go
GOMAXPROCS 4
9
4
5
6
7
8
1
2
3
0
$ go run goroutine.go
GOMAXPROCS 4
9
3
0
1
2
7
4
8
5
6
$ go run goroutine.go
GOMAXPROCS 4
1
9
6
8
4
3
0
5
7
2
$
Are you running in the Go playground? The Go playground, by design, is deterministic, which makes it easier to cache programs.
Or, are you running with runtime.GOMAXPROCS = 1? This runs one thing at a time, sequentially. This is what the Go playground does.
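If you want to reproduce that single-threaded behaviour locally, you can pin the scheduler to one processor before spawning the goroutines. Here is a minimal sketch (my own example, not the question's code); the language still makes no guarantee about the order, but with a single processor the runs tend to be far more repeatable:
package main

import (
    "fmt"
    "runtime"
    "sync"
)

func main() {
    // Restrict the scheduler to a single processor, as the
    // playground effectively does.
    runtime.GOMAXPROCS(1)

    var wg sync.WaitGroup
    wg.Add(10)
    for i := 0; i < 10; i++ {
        go func(n int) {
            defer wg.Done()
            fmt.Println(n)
        }(i)
    }
    wg.Wait()
}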

Goroutines have been scheduled randomly since Go 1.5, so even if the order looks consistent, don't rely on it.
See the Go 1.5 release notes:
In Go 1.5, the order in which goroutines are scheduled has been changed. The properties of the scheduler were never defined by the language, but programs that depend on the scheduling order may be broken by this change. We have seen a few (erroneous) programs affected by this change. If you have programs that implicitly depend on the scheduling order, you will need to update them.
Another potentially breaking change is that the runtime now sets the default number of threads to run simultaneously, defined by GOMAXPROCS, to the number of cores available on the CPU. In prior releases the default was 1. Programs that do not expect to run with multiple cores may break inadvertently. They can be updated by removing the restriction or by setting GOMAXPROCS explicitly. For a more detailed discussion of this change, see the design document.

Related

fmt.Println is not executing in order when used without sleep

I'm trying to understand why my code doesn't behave as I expect. The problem is that I would expect my code to behave like this:
Define channel
Run goroutine and start looping
Put value into channel, print "finished"
Starting second iteration, blocking call (there is already a value in the channel), move to main goroutine
Printing 1, trying to run second iteration, blocking call for main goroutine, coming back to second goroutine
Cycle repeats
It works like that, but only with time.Sleep; for some reason, when time.Sleep is commented out, it behaves totally differently. What's even more interesting is that sometimes, for really small sleep values like nanoseconds, this code returns even more different results. Could someone explain to me why it works like that? My guess is that maybe Println is too slow to display, but that sounds weird to me.
Thanks
As expected:
finished
1
finished
2
finished
3
finished
6
finished
4
finished
8
finished all
Not expected:
finished
1
2
finished
finished
3
finished
6
4
finished
finished
8
finished all
func main() {
    var c chan int = make(chan int)
    go sendingThrowingResults(c)
    for val := range c {
        fmt.Println(val)
    }
    fmt.Println("finished all")
}

func sendingThrowingResults(c chan int) {
    var results []int = []int{1, 2, 3, 6, 4, 8}
    for _, val := range results {
        //time.Sleep(100 * time.Millisecond)
        c <- val
        fmt.Println("finished")
    }
    defer close(c)
}
A channel operation needs both sides to participate. A write only happens when a reader is ready. Once that happens, there is no guarantee on which goroutine will run first.
Thus, once the channel write happens, either of the two Println calls may run first, in no guaranteed order.
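If you actually need the "finished" line to come out before the receiver prints the value, the send alone is not enough; you need an explicit acknowledgement from the receiver. A minimal sketch of that idea follows; the done channel and the reordering of the Println are my additions, not part of the original code:
package main

import "fmt"

func main() {
    c := make(chan int)
    done := make(chan struct{})

    go func() {
        defer close(c)
        for _, val := range []int{1, 2, 3, 6, 4, 8} {
            fmt.Println("finished") // announce before handing the value over
            c <- val
            <-done // wait until the receiver has actually printed it
        }
    }()

    for val := range c {
        fmt.Println(val)
        done <- struct{}{} // acknowledge so the sender may continue
    }
    fmt.Println("finished all")
}
With the acknowledgement in place, the two goroutines take strict turns, so the output is always finished, 1, finished, 2, and so on.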

go testing outputs wrong case names in json format under parallel mode

go version: 1.18.1
Suppose I wrote this test file, parallel_test.go:
package parallel_json_output

import (
    "fmt"
    "testing"
    "time"
)

func TestP(t *testing.T) {
    t.Run("a", func(t *testing.T) {
        t.Parallel()
        for i := 0; i < 5; i++ {
            time.Sleep(time.Second)
            fmt.Println("a", i)
        }
    })
    t.Run("b", func(t *testing.T) {
        t.Parallel()
        for i := 0; i < 5; i++ {
            time.Sleep(time.Second)
            fmt.Println("b", i)
        }
    })
}
After running go test parallel_test.go -v -json, I got:
{"Time":"2022-06-11T02:48:10.3262833+08:00","Action":"run","Package":"command-line-arguments","Test":"TestP"}
{"Time":"2022-06-11T02:48:10.3672856+08:00","Action":"output","Package":"command-line-arguments","Test":"TestP","Output":"=== RUN TestP\n"}
{"Time":"2022-06-11T02:48:10.3682857+08:00","Action":"run","Package":"command-line-arguments","Test":"TestP/a"}
{"Time":"2022-06-11T02:48:10.3682857+08:00","Action":"output","Package":"command-line-arguments","Test":"TestP/a","Output":"=== RUN TestP/a\n"}
{"Time":"2022-06-11T02:48:10.3692857+08:00","Action":"output","Package":"command-line-arguments","Test":"TestP/a","Output":"=== PAUSE TestP/a\n"}
{"Time":"2022-06-11T02:48:10.3702858+08:00","Action":"pause","Package":"command-line-arguments","Test":"TestP/a"}
{"Time":"2022-06-11T02:48:10.3702858+08:00","Action":"run","Package":"command-line-arguments","Test":"TestP/b"}
{"Time":"2022-06-11T02:48:10.3712858+08:00","Action":"output","Package":"command-line-arguments","Test":"TestP/b","Output":"=== RUN TestP/b\n"}
{"Time":"2022-06-11T02:48:10.3712858+08:00","Action":"output","Package":"command-line-arguments","Test":"TestP/b","Output":"=== PAUSE TestP/b\n"}
{"Time":"2022-06-11T02:48:10.3722859+08:00","Action":"pause","Package":"command-line-arguments","Test":"TestP/b"}
{"Time":"2022-06-11T02:48:10.373286+08:00","Action":"cont","Package":"command-line-arguments","Test":"TestP/a"}
{"Time":"2022-06-11T02:48:10.373286+08:00","Action":"output","Package":"command-line-arguments","Test":"TestP/a","Output":"=== CONT TestP/a\n"}
{"Time":"2022-06-11T02:48:10.374286+08:00","Action":"cont","Package":"command-line-arguments","Test":"TestP/b"}
{"Time":"2022-06-11T02:48:10.374286+08:00","Action":"output","Package":"command-line-arguments","Test":"TestP/b","Output":"=== CONT TestP/b\n"}
{"Time":"2022-06-11T02:48:11.3352891+08:00","Action":"output","Package":"command-line-arguments","Test":"TestP/b","Output":"b 0\n"}
{"Time":"2022-06-11T02:48:11.3352891+08:00","Action":"output","Package":"command-line-arguments","Test":"TestP/b","Output":"a 0\n"}
...
Look at this line: {"Time":"2022-06-11T02:48:11.3352891+08:00","Action":"output","Package":"command-line-arguments","Test":"TestP/b","Output":"a 0\n"}. This output should be attributed to case TestP/a instead of b, but the output is tagged with the wrong case name in parallel tests.
This problem makes reporting tools generate wrong HTML reports; IDEs (like GoLand) are affected too and cannot sort parallel output correctly.
I found an issue for it on GitHub, but it seems to have been fixed in Go 1.14.6; however, it still appears in Go 1.18.
I wonder what happened and how to deal with it. Many thanks.
It makes sense that the generic fmt package has no knowledge of which test is currently executing in a concurrent environment.
The testing package has its own Log method that correctly attributes output to the current test:
t.Log("a", i)
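For example, the question's test could be rewritten roughly like this (a sketch reusing the question's package and imports), so that go test -v -json attributes each line to the right subtest:
func TestP(t *testing.T) {
    t.Run("a", func(t *testing.T) {
        t.Parallel()
        for i := 0; i < 5; i++ {
            time.Sleep(time.Second)
            t.Log("a", i) // attributed to TestP/a in the JSON stream
        }
    })
    t.Run("b", func(t *testing.T) {
        t.Parallel()
        for i := 0; i < 5; i++ {
            time.Sleep(time.Second)
            t.Log("b", i) // attributed to TestP/b in the JSON stream
        }
    })
}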

Synchronize value (counter) across goroutines

I have a Golang application that goes through pages of a website and is supposed to download every link on the website. It looks a little like this (I don't know the number of pages beforehand, so that part is done synchronously):
page := 0
results := getPage(page)
c := make(chan *http.Response)
for len(results) > 0 {
    for result := range results {
        go myProxySwitcher.downloadChan(result.URL, c)
        fmt.Println(myProxySwitcher.counter)
    }
    page++
    results = getPage(page)
    myProxySwitcher.counter++
}
The twist is, every 10 requests, I want to change the proxy I use to connect to the website. To do this, I made a struct with a counter member:
type ProxySwitcher struct {
    proxies []string
    client  *http.Client
    counter int
}
And then I increment the counter each time a request is made from downloadChan:
func (p *ProxySwitcher) downloadChan(url string, c chan *http.Response) {
    p.counter++
    proxy := p.proxies[int(p.counter/10)%len(p.proxies)]
    res := p.client.Get(url, proxy)
    c <- res
}
When it does the downloads, it doesn't appear that the counter is synchronized between goroutines. How can I sync the value of the counter between goroutines?
The result I get from those Printlns is:
1
1
1
1
1
1
2
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
5
5
5
And I am expecting
1
2
3
4
5
...
You have a race condition in your code.
In the first snippet, you're modifying the counter field from the "main" goroutine:
// ...
myProxySwitcher.counter++
In the third snippet, you're also modifying that counter from a different goroutine:
// ...
p.counter++
This is illegal code in Go. By definition, the results are undefined. To understand why, you'll have to go over the Go Memory Model. Hint: it likely won't be an easy read.
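As a side note, you can confirm the race before fixing anything by running the program with the race detector enabled (assuming the program lives in main.go); it should report a data race on the counter field:
$ go run -race main.go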
To fix it, you need to ensure synchronization. There are many ways to do it.
One way to do it, as suggested in a comment on your question, is to use a mutex. Below is an example; it's kinda messy, as it would need some refactoring of the main loop, but this is how you would synchronize access to the counter:
type ProxySwitcher struct {
    proxies []string
    client  *http.Client
    mu      sync.Mutex
    counter int
}

func (p *ProxySwitcher) downloadChan(url string, c chan *http.Response) {
    p.mu.Lock()
    p.counter++
    // gotta read it from p while holding
    // the lock to use it below
    counter := p.counter
    p.mu.Unlock()

    // here you use counter rather than p.counter,
    // since you don't hold the lock anymore
    proxy := p.proxies[int(counter/10)%len(p.proxies)]
    res := p.client.Get(url, proxy)
    c <- res
}
// ... the loop ...
for len(results) > 0 {
    for result := range results {
        go myProxySwitcher.downloadChan(result.URL, c)
        // this is kinda messy, would need some heavier
        // refactoring, but this should fix the race:
        myProxySwitcher.mu.Lock()
        fmt.Println(myProxySwitcher.counter)
        myProxySwitcher.mu.Unlock()
    }
    page++
    results = getPage(page)
    // same... it's messy, needs refactoring
    myProxySwitcher.mu.Lock()
    myProxySwitcher.counter++
    myProxySwitcher.mu.Unlock()
}
Alternatively, you could change that counter to e.g. uint64, and then use the sync/atomic package to perform goroutine-safe operations:
type ProxySwitcher struct {
    proxies []string
    client  *http.Client
    counter uint64
}

func (p *ProxySwitcher) downloadChan(url string, c chan *http.Response) {
    counter := atomic.AddUint64(&p.counter, 1)
    // here you use counter rather than p.counter, since that's your local copy
    proxy := p.proxies[int(counter/10)%len(p.proxies)]
    res := p.client.Get(url, proxy)
    c <- res
}
// ... the loop ...
for len(results) > 0 {
    for result := range results {
        go myProxySwitcher.downloadChan(result.URL, c)
        counter := atomic.LoadUint64(&myProxySwitcher.counter)
        fmt.Println(counter)
    }
    page++
    results = getPage(page)
    atomic.AddUint64(&myProxySwitcher.counter, 1)
}
I'd probably use this last version, as it's cleaner and we don't really need a mutex.
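If you are on Go 1.19 or newer, a further optional cleanup is to use the atomic.Uint64 type from sync/atomic instead of the package-level functions; the idea is exactly the same, it is just harder to misuse. A sketch under that assumption, keeping the question's client.Get call as-is:
type ProxySwitcher struct {
    proxies []string
    client  *http.Client
    counter atomic.Uint64 // requires Go 1.19+
}

func (p *ProxySwitcher) downloadChan(url string, c chan *http.Response) {
    counter := p.counter.Add(1) // atomic increment; returns the new value
    proxy := p.proxies[int(counter/10)%len(p.proxies)]
    res := p.client.Get(url, proxy)
    c <- res
}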

It's concurrent, but what makes it run in parallel?

So I'm trying to understand how parallel computing works while also learning Go. I understand the difference between concurrency and parallelism; however, what I'm a little stuck on is how Go (or the OS) determines that something should be executed in parallel...
Is there something I have to do when writing my code, or is it all handled by the schedulers?
In the example below, I have two functions that are run in separate goroutines using the go keyword. Because the default GOMAXPROCS is the number of processors available on your machine (and I'm also explicitly setting it), I would expect that these two functions run at the same time and thus the output would be a mix of numbers in no particular order, and furthermore that each time it is run the output would be different. However, this is not the case. Instead, they are running one after the other and, to make matters more confusing, function two is running before function one.
Code:
func main() {
    runtime.GOMAXPROCS(6)

    var wg sync.WaitGroup
    wg.Add(2)

    fmt.Println("Starting")

    go func() {
        defer wg.Done()
        for smallNum := 0; smallNum < 20; smallNum++ {
            fmt.Printf("%v ", smallNum)
        }
    }()

    go func() {
        defer wg.Done()
        for bigNum := 100; bigNum > 80; bigNum-- {
            fmt.Printf("%v ", bigNum)
        }
    }()

    fmt.Println("Waiting to finish")
    wg.Wait()

    fmt.Println("\nFinished, Now terminating")
}
Output:
go run main.go
Starting
Waiting to finish
100 99 98 97 96 95 94 93 92 91 90 89 88 87 86 85 84 83 82 81 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
Finished, Now terminating
I am following this article, although just about every example I've come across does something similar.
Concurrency, Goroutines and GOMAXPROCS
Is this working the way it should and I'm not understanding something correctly, or is my code not right?
Is there something I have to do when writing my code,
No.
or is it all handled by the schedulers?
Yes.
In the example below, I have two functions that are run in separate goroutines using the go keyword. Because the default GOMAXPROCS is the number of processors available on your machine (and I'm also explicitly setting it), I would expect that these two functions run at the same time
They might or might not, you have no control here.
and thus the output would be a mix of numbers in no particular order, and furthermore that each time it is run the output would be different. However, this is not the case. Instead, they are running one after the other and, to make matters more confusing, function two is running before function one.
Yes. Again you cannot force parallel computation.
Your test is flawed: you just don't do much in each goroutine. In your example, goroutine 2 might be scheduled to run, start running, and complete before goroutine 1 has started running. "Starting" a goroutine with go doesn't force it to start executing right away; all that is done is creating a new goroutine which can run. From all the goroutines which can run, some are scheduled onto your processors. All this scheduling cannot be controlled; it is fully automatic. As you seem to know, this is the difference between concurrent and parallel. You have control over concurrency in Go but not (much) over what is actually done in parallel on two or more cores.
More realistic examples with actual, long-running goroutines which do actual work will show interleaved output.
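For instance, a variant of the question's program where each goroutine does a non-trivial amount of work per iteration will typically show the two sequences interleaved. This is only a sketch; the inner busy loop is an arbitrary stand-in for real work:
package main

import (
    "fmt"
    "sync"
)

func main() {
    var wg sync.WaitGroup
    wg.Add(2)

    work := func(label string, from, to, step int) {
        defer wg.Done()
        for n := from; n != to; n += step {
            // Burn some CPU so the goroutine doesn't finish
            // within a single scheduling slot.
            sum := 0
            for i := 0; i < 5_000_000; i++ {
                sum += i
            }
            _ = sum
            fmt.Printf("%s:%d ", label, n)
        }
    }

    go work("small", 0, 20, 1)
    go work("big", 100, 80, -1)

    wg.Wait()
    fmt.Println("\ndone")
}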
It's all handled by the scheduler.
With only two loops of 20 short instructions, you will be hard pressed to see the effects of concurrency or parallelism.
Here is another toy example : https://play.golang.org/p/xPKITzKACZp
package main

import (
    "fmt"
    "runtime"
    "sync"
    "sync/atomic"
    "time"
)

const (
    ConstMaxProcs  = 2
    ConstRunners   = 4
    ConstLoopcount = 1_000_000
)

func runner(id int, wg *sync.WaitGroup, cptr *int64) {
    var times int
    for i := 0; i < ConstLoopcount; i++ {
        val := atomic.AddInt64(cptr, 1)
        if val > 1 {
            times++
        }
        atomic.AddInt64(cptr, -1)
    }
    fmt.Printf("[runner %d] cptr was > 1 on %d occasions\n", id, times)
    wg.Done()
}

func main() {
    runtime.GOMAXPROCS(ConstMaxProcs)

    var cptr int64
    wg := &sync.WaitGroup{}
    wg.Add(ConstRunners)

    start := time.Now()
    for id := 1; id <= ConstRunners; id++ {
        go runner(id, wg, &cptr)
    }
    wg.Wait()

    fmt.Printf("completed in %s\n", time.Now().Sub(start))
}
As with your example: you don't have control over the scheduler; this example just has more "surface" to witness some effects of concurrency.
It's hard to witness the actual difference between concurrency and parallelism from within the program; you can view your processor's activity while it runs, or check the global execution time.
The playground does not give sub-second precision on its clock; if you want to see the actual timing, copy/paste the code into a local file and tune the constants to see the various effects.
Note that some other effects (probably branch prediction on the if val > 1 {...} check and/or memory invalidation around the shared cptr variable) make the execution very volatile on my machine, so don't expect a straight "running with ConstMaxProcs = 4 is 4 times quicker than ConstMaxProcs = 1".

Using testing.Benchmark does not produce any output

I'm using testing.Benchmark to manually run a couple benchmarks but the result object is always empty.
Am I missing something here?
Here's an example:
package main

import "testing"

func main() {
    result := testing.Benchmark(func(parentB *testing.B) {
        parentB.Run("example", func(b *testing.B) {
            for n := 0; n < b.N; n++ {
                println("ok")
            }
        })
    })
    println(result.String())
}
This will print ok a couple of times and then 0 0 ns/op, but the benchmark clearly did run something.
I think you are doing everything right. The doc of testing.Benchmark() says:
Benchmark benchmarks a single function. Useful for creating custom benchmarks that do not use the "go test" command.
If f calls Run, the result will be an estimate of running all its subbenchmarks that don't call Run in sequence in a single benchmark.
Looking into the implementation (Go 1.7.4):
func Benchmark(f func(b *B)) BenchmarkResult {
    b := &B{
        common: common{
            signal: make(chan bool),
            w:      discard{},
        },
        benchFunc: f,
        benchTime: *benchTime,
    }
    if !b.run1() {
        return BenchmarkResult{}
    }
    return b.run()
}
This line:
if !b.run1() {
    return BenchmarkResult{}
}
b.run1() is supposed to run your passed function once and detect whether it has sub-benchmarks. Yours has. It returns a bool indicating whether more runs are needed. Inside run1():
if b.hasSub || b.finished {
    // ...
    return true
}
It properly tells that it has sub-benchmarks, and Benchmark(), with noble simplicity, just returns an empty BenchmarkResult:
if !b.run1() {
    return BenchmarkResult{}
}
I do believe that either this is a bug (or rather an "incomplete" feature), or the doc is incorrect. I suggest filing an issue here: https://github.com/golang/go/issues
Edited the answer to clarify:
My guess is that you are using go run to run the test. That will not produce any result. In order to run the code exactly as it is written, you need to use
go test -bench=. and I think it should work.
The file must be named xxx_test.go, where xxx is whatever you want.
If you restructure your code a little bit, it can be run as a single-function benchmark:
package main

import "testing"

func main() {
    myTest()
}

func myTest() {
    fn := func(b *testing.B) {
        for n := 0; n < b.N; n++ {
            println("ok")
        }
    }
    result := testing.Benchmark(fn)
    println(result.String())
}
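A possible workaround, given the behaviour described above, is to benchmark each case separately and collect the results yourself rather than relying on sub-benchmarks. A sketch; the case map and its names are made up for illustration:
package main

import (
    "fmt"
    "testing"
)

func main() {
    cases := map[string]func(b *testing.B){
        "example": func(b *testing.B) {
            for n := 0; n < b.N; n++ {
                println("ok")
            }
        },
    }
    for name, fn := range cases {
        // Each call is its own top-level benchmark, so a real
        // BenchmarkResult is returned instead of an empty one.
        result := testing.Benchmark(fn)
        fmt.Println(name, result.String())
    }
}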
