goroutines affect each other when printing timecost - go

I'm a freshman for Golang. I know goroutine is an abstract group of cpu and memory to run a piece of code.
So When I run some computing funcs(like sort) in goroutines, I'm hoping they run parallel. But the printed result seems weird, the "paralell" codes print nearly the same timecost.
Why? Is there something I missed about goroutine, or it's because of the func printTime() ?
codes: https://play.golang.org/p/n9DLn57ftM
P.S. codes should be copied to local go file and run. Those run in play.golang has some limitation.
the result is:
MaxProcs: 8
Source : 2.0001ms
Quick sort : 3.0002ms
Merge sort : 8.0004ms
Insertion sort : 929.0532ms
Goroutine num: 1
Source : 2.0001ms
Goroutine num: 4
Insertion sort : 927.0531ms
Quick sort : 930.0532ms
Merge sort : 934.0535ms

You should measure total time cost instead of individual time cost required by each sorting algorithm. The individual time cost might be longer when the task is distributed into several goroutines since it need additional time to setup the goroutine. Depending on the nature of your program, additional time might be needed for communication between goroutine and/or with main process. There're some resources related to goroutine, e.g.
Is a Go goroutine a coroutine?
http://divan.github.io/posts/go_concurrency_visualize/
If you change main function to:
func main() {
fmt.Println("MaxProcs:", runtime.GOMAXPROCS(0)) // 8
start := time.Now()
sequentialTest()
seq := time.Now()
concurrentTest()
con := time.Now()
fmt.Printf("\n\nCalculation time, sequential: %v, concurrent: %v\n",
seq.Sub(start), con.Sub(seq))
}
the output will look like:
MaxProcs: 4
Source : 3.0037ms
Quick sort : 5.0034ms
Merge sort : 13.0069ms
Insertion sort : 1.2590941s
Goroutine num: 1
Source : 3.0015ms
Goroutine num: 4
Insertion sort : 1.2399076s
Quick sort : 1.2459121s
Merge sort : 1.2519156s
Calculation time, sequential: 1.2831112s, concurrent: 1.2559194s
After removing printTime, it looks like:
MaxProcs: 4
Goroutine num: 1
Goroutine num: 4
Calculation time, sequential: 1.3154314s, concurrent: 1.244112s
The time cost value might change slightly, but most of the time the result will be sequential > concurrent. In summary, distributing the task into several goroutines, may increase the overall performance (time cost) but not the individual task.

Sorry, I don't understand what you want to test. At the first, you code doesn't work for quickSort because you run quickSort with go quickSort(...).
wg.Add(1)
go func(){
go quickSort(s1, nil)
wg.Done()
}()
This goroutine will quit immediately.
Now, I tested your code. And I notice that you must get total time between start and all of finish.
https://play.golang.org/p/O8Gj-OYIdR

wg.Add(1)
go func(){
go quickSort(s1, nil)
wg.Done()
}()
You don't need that second go
It should be like
wg.Add(1)
go func(){
quickSort(s1, nil)
wg.Done()
}()
What happened is that go func() started one goroutine, and go quickSort(s1, nil) starts another. As a result, wg.Done() (and, as a result, wg.Done()) gets executed practically right away (without waiting for quickSort).

Related

fmt.Println is not executing in order when used without sleep

I'm trying to understand, why my code doesn't behave as I expect it. The problem is that I would expect, that my code would behave like that:
Define channel
Run goroutine and start looping
Put value into channel, print "finished"
Starting second iteration, blocking call(there is already value in the channel), move to main goroutine
Printing 1, trying to run second iteration, blocking call for main goroutine, coming back to second goroutine
Cycle repeats
It works like that, but only with time.Sleep, but for some reason when commenting out time.Sleep it behaves totally different. What's even more interesting that sometimes for really small values of time like Nanos etc this code returns even more different results. Could someone explain me, why it works like that? My guess is that maybe Println is too slow on display, but it sounds weird to me..
Thanks
** As expected: **
finished
1
finished
2
finished
3
finished
6
finished
4
finished
8
finished all
** Not expected **
finished
1
2
finished
finished
3
finished
6
4
finished
finished
8
finished all
func main() {
var c chan int = make(chan int)
go sendingThrowingResults(c)
for val := range c {
fmt.Println(val)
}
fmt.Println("finished all")
}
func sendingThrowingResults(c chan int) {
var results []int = []int{1, 2, 3, 6, 4, 8}
for _, val := range results {
//time.Sleep(100 * time.Millisecond)
c <- val
fmt.Println("finished")
}
defer close(c)
}
A channel operation needs both sides to participate. A write only happens when a reader is ready. Once that happens, there is no guarantee on which goroutine will run first.
Thus, once the channel write happens, one of the two printlns will work, in some random order.

It's concurrent, but what makes it run in parallel?

So I'm trying to understand how parallel computing works while also learning Go. I understand the difference between concurrency and parallelism, however, what I'm a little stuck on is how Go (or the OS) determines that something should be executed in parallel...
Is there something I have to do when writing my code, or is it all handled by the schedulers?
In the example below, I have two functions that are run in separate Go routines using the go keyword. Because the default GOMAXPROCS is the number of processors available on your machine (and I'm also explicitly setting it) I would expect that these two functions run at the same time and thus the output would be a mix of number in particular order - And furthermore that each time it is run the output would be different. However, this is not the case. Instead, they are running one after the other and to make matters more confusing function two is running before function one.
Code:
func main() {
runtime.GOMAXPROCS(6)
var wg sync.WaitGroup
wg.Add(2)
fmt.Println("Starting")
go func() {
defer wg.Done()
for smallNum := 0; smallNum < 20; smallNum++ {
fmt.Printf("%v ", smallNum)
}
}()
go func() {
defer wg.Done()
for bigNum := 100; bigNum > 80; bigNum-- {
fmt.Printf("%v ", bigNum)
}
}()
fmt.Println("Waiting to finish")
wg.Wait()
fmt.Println("\nFinished, Now terminating")
}
Output:
go run main.go
Starting
Waiting to finish
100 99 98 97 96 95 94 93 92 91 90 89 88 87 86 85 84 83 82 81 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
Finished, Now terminating
I am following this article, although just about every example I've come across does something similar.
Concurrency, Goroutines and GOMAXPROCS
Is this working the way is should and I'm not understanding something correctly, or is my code not right?
Is there something I have to do when writing my code,
No.
or is it all handled by the schedulers?
Yes.
In the example below, I have two functions that are run in separate Go routines using the go keyword. Because the default GOMAXPROCS is the number of processors available on your machine (and I'm also explicitly setting it) I would expect that these two functions run at the same time
They might or might not, you have no control here.
and thus the output would be a mix of number in particular order - And furthermore that each time it is run the output would be different. However, this is not the case. Instead, they are running one after the other and to make matters more confusing function two is running before function one.
Yes. Again you cannot force parallel computation.
Your test is flawed: You just don't do much in each goroutine. In your example goroutine 2 might be scheduled to run, starts running and completes before goroutine 1 started running. "Starting" a goroutine with go doesn't force it to start executing right away, all there is done is creating a new goroutine which can run. From all goroutines which can run some are scheduled onto your processors. All this scheduling cannot be controlled, it is fully automatic. As you seem to know this is the difference between concurrent and parallel. You have control over concurrency in Go but not (much) on what is done actually in parallel on two or more cores.
More realistic examples with actual, long-running goroutines which do actual work will show interleaved output.
It's all handled by the scheduler.
With only two loops of 20 short instructions, you will be hard pressed to see the effects of concurrency or parallelism.
Here is another toy example : https://play.golang.org/p/xPKITzKACZp
package main
import (
"fmt"
"runtime"
"sync"
"sync/atomic"
"time"
)
const (
ConstMaxProcs = 2
ConstRunners = 4
ConstLoopcount = 1_000_000
)
func runner(id int, wg *sync.WaitGroup, cptr *int64) {
var times int
for i := 0; i < ConstLoopcount; i++ {
val := atomic.AddInt64(cptr, 1)
if val > 1 {
times++
}
atomic.AddInt64(cptr, -1)
}
fmt.Printf("[runner %d] cptr was > 1 on %d occasions\n", id, times)
wg.Done()
}
func main() {
runtime.GOMAXPROCS(ConstMaxProcs)
var cptr int64
wg := &sync.WaitGroup{}
wg.Add(ConstRunners)
start := time.Now()
for id := 1; id <= ConstRunners; id++ {
go runner(id, wg, &cptr)
}
wg.Wait()
fmt.Printf("completed in %s\n", time.Now().Sub(start))
}
As with your example : you don't have control on the scheduler, this example just has more "surface" to witness some effects of concurrency.
It's hard to witness the actual difference between concurrency and parallelism from within the program, you can view your processor's activity while it runs, or check the global execution time.
The playground does not give sub-second precision on its clock, if you want to see the actual timing, copy/paste the code in a local file and tune the constants to see various effects.
Note that some other effects (probably : branch prediction on the if val > 1 {...} check and/or memory invalidation around the shared cptr variable) make the execution very volatile on my machine, so don't expect a straight "running with ConstMaxProcs = 4 is 4 times quicker than ConstMaxProcs = 1".

Why doesn't this buffered channel block in my code?

I'm learning the Go language. Can someone please explain the output here?
package main
import "fmt"
var c = make(chan int, 1)
func f() {
c <- 1
fmt.Println("In f()")
}
func main() {
go f()
c <- 2
fmt.Println(<-c)
fmt.Println(<-c)
}
Output:
In f()
2
1
Process finished with exit code 0
Why did "In f()" occur before "2"? If "In f()" is printed before "2", the buffered channel should block. But this didn't happen, why?
The other outputs are reasonable.
Image of my confusing result
The order of events that cause this to run through are these:
Trigger goroutine.
Write 2 to the channel. Capacity of the channel is now exhausted.
Read from the channel and output result.
Write 1 to the channel.
Read from the channel and output result.
The order of events that cause this to deadlock are these:
Trigger goroutine.
Write 1 to the channel. Capacity of the channel is now exhausted.
Write 2 to the channel. Since the channel's buffer is full, this blocks.
The output you provide seems to indicate that the goroutine finishes first and the program doesn't deadlock, which contradicts above two scenarios as explanation. Here's what happens though:
Trigger goroutine.
Write 2 to the channel.
Read 2 from the channel.
Write 1 to the channel.
Output In f().
Output the 2 received from the channel.
Read 1 from the channel.
Output the 1 received from the channel.
Keep in mind that you don't have any guarantees concerning the scheduling of goroutines unless you programmatically enforce them. When you start a goroutine, it is undefined when the first code of that goroutine are actually executed and how much the starting code progresses until then. Note that since your code relies on a certain order of events, it is broken by that definition, just to say that explicitly.
Also, at any point in your program, the scheduler can decide to switch between different goroutines. Even the single line fmt.Printtln(<-c) consists of multiple steps and in between each step the switch can occur.
To reproduce block you have to run that code many times the easiest way to do it with test. You don't have block cause of luck. But it is not guaranteed:
var c = make(chan int, 1)
func f() {
c <- 1
fmt.Println("In f()")
}
func TestF(t *testing.T) {
go f()
c <- 2
fmt.Println(<-c)
fmt.Println(<-c)
}
and run with command:
go test -race -count=1000 -run=TestF -timeout=4s
where count number of tests. It reproduces blocking to me.
Provide timeout to not wait default 10 minutes

Run a benchmark in parallel, i.e. simulate simultaneous requests

When testing a database procedure invoked from an API, when it runs sequentially, it seems to run consistently within ~3s. However we've noticed that when several requests come in at the same time, this can take much longer, causing time outs. I am trying to reproduce the "several requests at one time" case as a go test.
I tried the -parallel 10 go test flag, but the timings were the same at ~28s.
Is there something wrong with my benchmark function?
func Benchmark_RealCreate(b *testing.B) {
b.ResetTimer()
for n := 0; n < b.N; n++ {
name := randomdata.SillyName()
r := gofight.New()
u := []unit{unit{MefeUnitID: name, MefeCreatorUserID: "user", BzfeCreatorUserID: 55, ClassificationID: 2, UnitName: name, UnitDescriptionDetails: "Up on the hills and testing"}}
uJSON, _ := json.Marshal(u)
r.POST("/create").
SetBody(string(uJSON)).
Run(h.BasicEngine(), func(r gofight.HTTPResponse, rq gofight.HTTPRequest) {
assert.Contains(b, r.Body.String(), name)
assert.Equal(b, http.StatusOK, r.Code)
})
}
}
Else how I can achieve what I am after?
The -parallel flag is not for running the same test or benchmark parallel, in multiple instances.
Quoting from Command go: Testing flags:
-parallel n
Allow parallel execution of test functions that call t.Parallel.
The value of this flag is the maximum number of tests to run
simultaneously; by default, it is set to the value of GOMAXPROCS.
Note that -parallel only applies within a single test binary.
The 'go test' command may run tests for different packages
in parallel as well, according to the setting of the -p flag
(see 'go help build').
So basically if your tests allow, you can use -parallel to run multiple distinct testing or benchmark functions parallel, but not the same one in multiple instances.
In general, running multiple benchmark functions parallel defeats the purpose of benchmarking a function, because running it parallel in multiple instances usually distorts the benchmarking.
However, in your case code efficiency is not what you want to measure, you want to measure an external service. So go's built-in testing and benchmarking facilities are not really suitable.
Of course we could still use the convenience of having this "benchmark" run automatically when our other tests and benchmarks run, but you should not force this into the conventional benchmarking framework.
First thing that comes to mind is to use a for loop to launch n goroutines which all attempt to call the testable service. One problem with this is that this only ensures n concurrent goroutines at the start, because as the calls start to complete, there will be less and less concurrency for the remaining ones.
To overcome this and truly test n concurrent calls, you should have a worker pool with n workers, and continously feed jobs to this worker pool, making sure there will be n concurrent service calls at all times. For a worker pool implementation, see Is this an idiomatic worker thread pool in Go?
So all in all, fire up a worker pool with n workers, have a goroutine send jobs to it for an arbitrary time (e.g. for 30 seconds or 1 minute), and measure (count) the completed jobs. The benchmark result will be a simple division.
Also note that for solely testing purposes, a worker pool might not even be needed. You can just use a loop to launch n goroutines, but make sure each started goroutine keeps calling the service and not return after a single call.
I'm new to go, but why don't you try to make a function and run it using the standard parallel test?
func Benchmark_YourFunc(b *testing.B) {
b.RunParralel(func(pb *testing.PB) {
for pb.Next() {
YourFunc(staff ...T)
}
})
}
Your example code mixes several things. Why are you using assert there? This is not a test it is a benchmark. If the assert methods are slow, your benchmark will be.
You also moved the parallel execution out of your code into the test command. You should try to make a parallel request by using concurrency. Here just a possibility how to start:
func executeRoutines(routines int) {
wg := &sync.WaitGroup{}
wg.Add(routines)
starter := make(chan struct{})
for i := 0; i < routines; i++ {
go func() {
<-starter
// your request here
wg.Done()
}()
}
close(starter)
wg.Wait()
}
https://play.golang.org/p/ZFjUodniDHr
We start some goroutines here, which are waiting until starter is closed. So you can set your request direct after that line. That the function waits until all the requests are done we are using a WaitGroup.
BUT IMPORTANT: Go just supports concurrency. So if your system has not 10 cores the 10 goroutines will not run parallel. So ensure that you have enough cores availiable.
With this start you can play a little bit. You could start to call this function inside your benchmark. You could also play around with the numbers of goroutines.
As the documentation indicates, the parallel flag is to allow multiple different tests to be run in parallel. You generally do not want to run benchmarks in parallel because that would run different benchmarks at the same time, throwing off the results for all of them. If you want to benchmark parallel traffic, you need to write parallel traffic generation into your test. You need to decide how this should work with b.N which is your work factor; I would probably use it as the total request count, and write a benchmark or multiple benchmarks testing different concurrent load levels, e.g.:
func Benchmark_RealCreate(b *testing.B) {
concurrencyLevels := []int{5, 10, 20, 50}
for _, clients := range concurrencyLevels {
b.Run(fmt.Sprintf("%d_clients", clients), func(b *testing.B) {
sem := make(chan struct{}, clients)
wg := sync.WaitGroup{}
for n := 0; n < b.N; n++ {
wg.Add(1)
go func() {
name := randomdata.SillyName()
r := gofight.New()
u := []unit{unit{MefeUnitID: name, MefeCreatorUserID: "user", BzfeCreatorUserID: 55, ClassificationID: 2, UnitName: name, UnitDescriptionDetails: "Up on the hills and testing"}}
uJSON, _ := json.Marshal(u)
sem <- struct{}{}
r.POST("/create").
SetBody(string(uJSON)).
Run(h.BasicEngine(), func(r gofight.HTTPResponse, rq gofight.HTTPRequest) {})
<-sem
wg.Done()
}()
}
wg.Wait()
})
}
}
Note here I removed the initial ResetTimer; the timer doesn't start until you benchmark function is called, so calling it as the first op in your function is pointless. It's intended for cases where you have time-consuming setup prior to the benchmark loop that you don't want included in the benchmark results. I've also removed the assertions, because this is a benchmark, not a test; assertions are for validity checking in tests and only serve to throw off timing results in benchmarks.
One thing is benchmarking (measuring time code takes to run) another one is load/stress testing.
The -parallel flag as stated above, is to allow a set of tests to execute in parallel, allowing the test set to execute faster, not to execute some test N times in parallel.
But is simple to achieve what you want (execution of same test N times). Bellow a very simple (really quick and dirty) example just to clarify/demonstrate the important points, that gets this very specific situation done:
You define a test and mark it to be executed in parallel => TestAverage with a call to t.Parallel
You then define another test and use RunParallel to execute the number of instances of the test (TestAverage) you want.
The class to test:
package math
import (
"fmt"
"time"
)
func Average(xs []float64) float64 {
total := float64(0)
for _, x := range xs {
total += x
}
fmt.Printf("Current Unix Time: %v\n", time.Now().Unix())
time.Sleep(10 * time.Second)
fmt.Printf("Current Unix Time: %v\n", time.Now().Unix())
return total / float64(len(xs))
}
The testing funcs:
package math
import "testing"
func TestAverage(t *testing.T) {
t.Parallel()
var v float64
v = Average([]float64{1,2})
if v != 1.5 {
t.Error("Expected 1.5, got ", v)
}
}
func TestTeardownParallel(t *testing.T) {
// This Run will not return until the parallel tests finish.
t.Run("group", func(t *testing.T) {
t.Run("Test1", TestAverage)
t.Run("Test2", TestAverage)
t.Run("Test3", TestAverage)
})
// <tear-down code>
}
Then just do a go test and you should see:
X:\>go test
Current Unix Time: 1556717363
Current Unix Time: 1556717363
Current Unix Time: 1556717363
And 10 secs after that
...
Current Unix Time: 1556717373
Current Unix Time: 1556717373
Current Unix Time: 1556717373
Current Unix Time: 1556717373
Current Unix Time: 1556717383
PASS
ok _/X_/y 20.259s
The two extra lines, in the end are because TestAverage is executed also.
The interesting point here: if you remove t.Parallel() from TestAverage, it will all be execute sequencially:
X:> go test
Current Unix Time: 1556717564
Current Unix Time: 1556717574
Current Unix Time: 1556717574
Current Unix Time: 1556717584
Current Unix Time: 1556717584
Current Unix Time: 1556717594
Current Unix Time: 1556717594
Current Unix Time: 1556717604
PASS
ok _/X_/y 40.270s
This can of course be made more complex and extensible...

why golang sched runs differently when the variable is odd or even? Is it coincidence or doomed?

I have a sample of golang code as follows(xx.go):
package main
import "runtime"
func main() {
c2 := make(chan int)
go func() {
for v := range c2 {
println("c2 =", v, "numof routines:", runtime.NumGoroutine())
}
}()
for i:=1;i<=10001;i++{
c2 <- i
//runtime.Gosched()
}
}
When the loop count is odd,say 10001,the code will output all the numbers.
when the loop count is even ,say 10000,the code will output all the numbers but the last!
Why is that?
I have tested numbers from as small as 2 to as large as 10000,they all obey the above rules!
the ENV is as follows:
uname -a : Linux hadoopnode25232 2.6.18-308.16.1.el5 #1 SMP Tue Oct 2 22:01:43 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux
go version : go version go1.1 linux/amd64
I think it has something to do with the go sched.
I assemble the code:
go tool 6g -S xx.go > xx.s
the only difference between 10000 and 10001 is :
33c33
< 0030 (xx.go:20) CMPQ AX,$10001
---
> 0030 (xx.go:20) CMPQ AX,$10000
And,last but not least, when I add runtime.Gosched(),everything runs well.
When main returns, the program terminates. It will not wait for any goroutines to finish. So whether all goroutines finish in time or not depends on the scheduler, and by extension on a fair bit of randomness, coincidence and external factors. The difference between 1e4 and 1e4+1 iterations might be one of those factors that affects the scheduling in just that one bit that makes the goroutine finish in time.
If you really require the goroutine to finish before exitting, wait for it to finish, by employing for example a sync.WaitGroup.
Unrelated to the actual issue, your code is overly complex and unidiomatic for what it's doing. You could rewrite the goroutine function as follows:
for v := range c2 {
println("c2 =", v,"numof routines:",runtime.NumGoroutine())
}
You need sleep a second or block a while wait the goroutine is finish!!
Go statements
A "go" statement starts the execution of a function call as an
independent concurrent thread of control, or goroutine, within the
same address space.
GoStmt = "go" Expression .
The function value and parameters are evaluated as usual in the
calling goroutine, but unlike with a regular call, program execution
does not wait for the invoked function to complete.
Once the final send, c2 <- i, is complete the program may terminate, it does not have to wait for the invoked function to complete. You are only guaranteed to complete executing the receive, v, ok := <-c2, c - 1 times, where c is your loop count, 10000 or 10001 in your tests.

Resources