Random drop in performance

Random drop in performance - windows

I'm kind of a newbie in Go and there is something that confused me recently.
I have a piece of code (simplified version posted below) and I was trying to measure performanc for it. I did this in two ways: 1) a bencmark with testing package 2) manually logging time
Running the benchmark outputs a result
30000 55603 ns/op
which is fine, BUT... When I do the 30k runs of the same function logging the time for each iteration I get an output like this:
test took 0 ns
test took 0 ns
... ~10 records all the same
test took 1000100 ns
test took 0 ns
test took 0 ns
... lots of zeroes again
test took 0 ns
test took 1000000 ns
test took 0 ns
...
Doing the math shows that the average is indeed 55603 ns/op just as the benchmark claims.
Ok, I said, I'm not that good in optimizing performance and not that into all the hardcore compiler stuff, but I guess that might be random garbage collection? So I turned on the gc log, made sure it shows some output, then turned off the gc for good aaand... no garbage collection, but I see the same picture - some iterations take a million times longer(?).
It is 99% that my understanding of all this is wrong somewhere, maybe someone can point me to the right direction or maybe someone knows for sure what the hell is going on? :)
P.S. Also, to me less that a nanosecond (0 ns) is somewhat surprising, that seems too fast, but the program does provide the result of computation, so I don't know what to think anymore. T_T
EDIT 1: Answering Kenny Grant's question: I was using goroutines to implement sort-of generator of values to have laziness, now I removed them and simplified the code. The issue is much less frequent now, but it is still reproducible.
Playground link: https://play.golang.org/p/UQMgtT4Jrf
Interesting thing is that does not happen on playground, but still happens on my machine.
EDIT 2: I'm running Go 1.9 on win7 x64
EDIT 3: Thanks to the responses I now know that this code cannot possible work properly on playground. I will repost the code snippet here so that we don't loose it. :)
type PrefType string
var types []PrefType = []PrefType{
"TYPE1", "TYPE2", "TYPE3", "TYPE4", "TYPE5", "TYPE6",
}
func GetKeys(key string) []string {
var result []string
for _, t := range types {
rr := doCalculations(t)
for _, k := range rr {
result = append(result, key + "." + k)
}
}
return result
}
func doCalculations(prefType PrefType) []string {
return []string{ string(prefType) + "something", string(prefType) + "else" }
}
func test() {
start := time.Now()
keysPrioritized := GetKeys("spec_key")
for _, k := range keysPrioritized {
_ = fmt.Sprint(k)
}
fmt.Printf("test took %v ns\n", time.Since(start).Nanoseconds())
}
func main() {
for i := 0; i < 30000; i++ {
test()
}
}
Here is the output on my machine:
EDIT 4: I have tried the same on my laptop with Ubuntu 17.04, the output is reasonable, no zeroes and millions. Seems like a Windows-specific issue in the compiler/runtime lib. Would be great if someone can verify this on their machine (Win 7/8/10).

On Windows, for such a tiny duration, you don't have precise enough time stamps. Linux has more precise time stamps. By design, Go benchmarks run for at least one second. Go1.9+ uses the monotonic (m) value to compute the duration.
On Windows:
timedur.go:
package main
import (
"fmt"
"os"
"time"
)
type PrefType string
var types []PrefType = []PrefType{
"TYPE1", "TYPE2", "TYPE3", "TYPE4", "TYPE5", "TYPE6",
}
func GetKeys(key string) []string {
var result []string
for _, t := range types {
rr := doCalculations(t)
for _, k := range rr {
result = append(result, key+"."+k)
}
}
return result
}
func doCalculations(prefType PrefType) []string {
return []string{string(prefType) + "something", string(prefType) + "else"}
}
func test() {
start := time.Now()
keysPrioritized := GetKeys("spec_key")
for _, k := range keysPrioritized {
_ = fmt.Sprint(k)
}
end := time.Now()
fmt.Printf("test took %v ns\n", time.Since(start).Nanoseconds())
fmt.Println(start)
fmt.Println(end)
if end.Sub(start) < time.Microsecond {
os.Exit(1)
}
}
func main() {
for i := 0; i < 30000; i++ {
test()
}
}
Output:
>go run timedur.go
test took 1026000 ns
2017-09-02 14:21:58.1488675 -0700 PDT m=+0.010003700
2017-09-02 14:21:58.1498935 -0700 PDT m=+0.011029700
test took 0 ns
2017-09-02 14:21:58.1538658 -0700 PDT m=+0.015002000
2017-09-02 14:21:58.1538658 -0700 PDT m=+0.015002000
exit status 1
>
On Linux:
Output:
$ go run timedur.go
test took 113641 ns
2017-09-02 14:52:02.917175333 +0000 UTC m=+0.001041249
2017-09-02 14:52:02.917287569 +0000 UTC m=+0.001153717
test took 23614 ns
2017-09-02 14:52:02.917600301 +0000 UTC m=+0.001466208
2017-09-02 14:52:02.917623585 +0000 UTC m=+0.001489354
test took 22814 ns
2017-09-02 14:52:02.917726364 +0000 UTC m=+0.001592236
2017-09-02 14:52:02.917748805 +0000 UTC m=+0.001614575
test took 21139 ns
2017-09-02 14:52:02.917818409 +0000 UTC m=+0.001684292
2017-09-02 14:52:02.917839184 +0000 UTC m=+0.001704954
test took 21478 ns
2017-09-02 14:52:02.917911899 +0000 UTC m=+0.001777712
2017-09-02 14:52:02.917932944 +0000 UTC m=+0.001798712
test took 31032 ns
<SNIP>
The results are comparable. They were run on the same machine, a dual-boot with Windows 10 and Ubuntu 16.04.

Best to eliminate GC as obviously logging it is going to interfere with timings. The time pkg on playground is fake, so this won't work there. Trying it locally, I get no times of 0 ns with your code as supplied, it look like it is working as intended.
You should of course expect some variation in times - when I try it the results are all within the same order of magnitude (very small times of 0.000003779 s), but there is an occasional blip even if you do 30 runs, sometimes up to double - but running timings at this resolution is unlikely to give you reliable results as it depends what else is running on the computer, on memory layout etc. Better to try to time long running operations this way rather than very short times like this one and to time lots of operations and average them - this is why the benchmark tool gives you an average over so many runs.
Since the timings are for operations taking very little time, and are not wildly different, I think this is normal behaviour with the code supplied. The 0ns results are wrong but probably the result of your previous use of goroutines, hard to judge that without code as the code you provided doesn't give that result.

Related

Run a benchmark in parallel, i.e. simulate simultaneous requests

When testing a database procedure invoked from an API, when it runs sequentially, it seems to run consistently within ~3s. However we've noticed that when several requests come in at the same time, this can take much longer, causing time outs. I am trying to reproduce the "several requests at one time" case as a go test.
I tried the -parallel 10 go test flag, but the timings were the same at ~28s.
Is there something wrong with my benchmark function?
func Benchmark_RealCreate(b *testing.B) {
b.ResetTimer()
for n := 0; n < b.N; n++ {
name := randomdata.SillyName()
r := gofight.New()
u := []unit{unit{MefeUnitID: name, MefeCreatorUserID: "user", BzfeCreatorUserID: 55, ClassificationID: 2, UnitName: name, UnitDescriptionDetails: "Up on the hills and testing"}}
uJSON, _ := json.Marshal(u)
r.POST("/create").
SetBody(string(uJSON)).
Run(h.BasicEngine(), func(r gofight.HTTPResponse, rq gofight.HTTPRequest) {
assert.Contains(b, r.Body.String(), name)
assert.Equal(b, http.StatusOK, r.Code)
})
}
}
Else how I can achieve what I am after?

The -parallel flag is not for running the same test or benchmark parallel, in multiple instances.
Quoting from Command go: Testing flags:
-parallel n
Allow parallel execution of test functions that call t.Parallel.
The value of this flag is the maximum number of tests to run
simultaneously; by default, it is set to the value of GOMAXPROCS.
Note that -parallel only applies within a single test binary.
The 'go test' command may run tests for different packages
in parallel as well, according to the setting of the -p flag
(see 'go help build').
So basically if your tests allow, you can use -parallel to run multiple distinct testing or benchmark functions parallel, but not the same one in multiple instances.
In general, running multiple benchmark functions parallel defeats the purpose of benchmarking a function, because running it parallel in multiple instances usually distorts the benchmarking.
However, in your case code efficiency is not what you want to measure, you want to measure an external service. So go's built-in testing and benchmarking facilities are not really suitable.
Of course we could still use the convenience of having this "benchmark" run automatically when our other tests and benchmarks run, but you should not force this into the conventional benchmarking framework.
First thing that comes to mind is to use a for loop to launch n goroutines which all attempt to call the testable service. One problem with this is that this only ensures n concurrent goroutines at the start, because as the calls start to complete, there will be less and less concurrency for the remaining ones.
To overcome this and truly test n concurrent calls, you should have a worker pool with n workers, and continously feed jobs to this worker pool, making sure there will be n concurrent service calls at all times. For a worker pool implementation, see Is this an idiomatic worker thread pool in Go?
So all in all, fire up a worker pool with n workers, have a goroutine send jobs to it for an arbitrary time (e.g. for 30 seconds or 1 minute), and measure (count) the completed jobs. The benchmark result will be a simple division.
Also note that for solely testing purposes, a worker pool might not even be needed. You can just use a loop to launch n goroutines, but make sure each started goroutine keeps calling the service and not return after a single call.

I'm new to go, but why don't you try to make a function and run it using the standard parallel test?
func Benchmark_YourFunc(b *testing.B) {
b.RunParralel(func(pb *testing.PB) {
for pb.Next() {
YourFunc(staff ...T)
}
})
}

Your example code mixes several things. Why are you using assert there? This is not a test it is a benchmark. If the assert methods are slow, your benchmark will be.
You also moved the parallel execution out of your code into the test command. You should try to make a parallel request by using concurrency. Here just a possibility how to start:
func executeRoutines(routines int) {
wg := &sync.WaitGroup{}
wg.Add(routines)
starter := make(chan struct{})
for i := 0; i < routines; i++ {
go func() {
<-starter
// your request here
wg.Done()
}()
}
close(starter)
wg.Wait()
}
https://play.golang.org/p/ZFjUodniDHr
We start some goroutines here, which are waiting until starter is closed. So you can set your request direct after that line. That the function waits until all the requests are done we are using a WaitGroup.
BUT IMPORTANT: Go just supports concurrency. So if your system has not 10 cores the 10 goroutines will not run parallel. So ensure that you have enough cores availiable.
With this start you can play a little bit. You could start to call this function inside your benchmark. You could also play around with the numbers of goroutines.

As the documentation indicates, the parallel flag is to allow multiple different tests to be run in parallel. You generally do not want to run benchmarks in parallel because that would run different benchmarks at the same time, throwing off the results for all of them. If you want to benchmark parallel traffic, you need to write parallel traffic generation into your test. You need to decide how this should work with b.N which is your work factor; I would probably use it as the total request count, and write a benchmark or multiple benchmarks testing different concurrent load levels, e.g.:
func Benchmark_RealCreate(b *testing.B) {
concurrencyLevels := []int{5, 10, 20, 50}
for _, clients := range concurrencyLevels {
b.Run(fmt.Sprintf("%d_clients", clients), func(b *testing.B) {
sem := make(chan struct{}, clients)
wg := sync.WaitGroup{}
for n := 0; n < b.N; n++ {
wg.Add(1)
go func() {
name := randomdata.SillyName()
r := gofight.New()
u := []unit{unit{MefeUnitID: name, MefeCreatorUserID: "user", BzfeCreatorUserID: 55, ClassificationID: 2, UnitName: name, UnitDescriptionDetails: "Up on the hills and testing"}}
uJSON, _ := json.Marshal(u)
sem <- struct{}{}
r.POST("/create").
SetBody(string(uJSON)).
Run(h.BasicEngine(), func(r gofight.HTTPResponse, rq gofight.HTTPRequest) {})
<-sem
wg.Done()
}()
}
wg.Wait()
})
}
}
Note here I removed the initial ResetTimer; the timer doesn't start until you benchmark function is called, so calling it as the first op in your function is pointless. It's intended for cases where you have time-consuming setup prior to the benchmark loop that you don't want included in the benchmark results. I've also removed the assertions, because this is a benchmark, not a test; assertions are for validity checking in tests and only serve to throw off timing results in benchmarks.

One thing is benchmarking (measuring time code takes to run) another one is load/stress testing.
The -parallel flag as stated above, is to allow a set of tests to execute in parallel, allowing the test set to execute faster, not to execute some test N times in parallel.
But is simple to achieve what you want (execution of same test N times). Bellow a very simple (really quick and dirty) example just to clarify/demonstrate the important points, that gets this very specific situation done:
You define a test and mark it to be executed in parallel => TestAverage with a call to t.Parallel
You then define another test and use RunParallel to execute the number of instances of the test (TestAverage) you want.
The class to test:
package math
import (
"fmt"
"time"
)
func Average(xs []float64) float64 {
total := float64(0)
for _, x := range xs {
total += x
}
fmt.Printf("Current Unix Time: %v\n", time.Now().Unix())
time.Sleep(10 * time.Second)
fmt.Printf("Current Unix Time: %v\n", time.Now().Unix())
return total / float64(len(xs))
}
The testing funcs:
package math
import "testing"
func TestAverage(t *testing.T) {
t.Parallel()
var v float64
v = Average([]float64{1,2})
if v != 1.5 {
t.Error("Expected 1.5, got ", v)
}
}
func TestTeardownParallel(t *testing.T) {
// This Run will not return until the parallel tests finish.
t.Run("group", func(t *testing.T) {
t.Run("Test1", TestAverage)
t.Run("Test2", TestAverage)
t.Run("Test3", TestAverage)
})
// <tear-down code>
}
Then just do a go test and you should see:
X:\>go test
Current Unix Time: 1556717363
Current Unix Time: 1556717363
Current Unix Time: 1556717363
And 10 secs after that
...
Current Unix Time: 1556717373
Current Unix Time: 1556717373
Current Unix Time: 1556717373
Current Unix Time: 1556717373
Current Unix Time: 1556717383
PASS
ok _/X_/y 20.259s
The two extra lines, in the end are because TestAverage is executed also.
The interesting point here: if you remove t.Parallel() from TestAverage, it will all be execute sequencially:
X:> go test
Current Unix Time: 1556717564
Current Unix Time: 1556717574
Current Unix Time: 1556717574
Current Unix Time: 1556717584
Current Unix Time: 1556717584
Current Unix Time: 1556717594
Current Unix Time: 1556717594
Current Unix Time: 1556717604
PASS
ok _/X_/y 40.270s
This can of course be made more complex and extensible...

Why is time.Since returning negative durations on Windows?

I have been trying to work with some go, and have found some weird behavior on windows. If I construct a time object from parsing a time string in a particular format, and then use functions like time.Since(), I get negative durations.
Code sample:
package main
import (
"fmt"
"time"
"strconv"
)
func convertToTimeObject(dateStr string) time.Time {
layout := "2006-01-02T15:04:05.000Z"
t, _:= time.Parse(layout, dateStr)
return t
}
func main() {
timeOlder := convertToTimeObject(time.Now().Add(-30*time.Second).Format("2006-01-02T15:04:05.000Z"))
duration := time.Since(timeOlder)
fmt.Println("Duration in seconds: " + strconv.Itoa(int(duration.Seconds())))
}
If you run it on Linux or the Go Playground link, you get the result as Duration in seconds: 30 which is expected.
However, on Windows, running the same piece of code with Go 1.10.3 gives Duration in seconds: -19769.
I've banged my head on this for hours. Any help on what I might be missing?
The only leads I've had since now are that when go's time package goes to calculate the seconds for both time objects (time.Now() and my parsed time object), one of them has the property hasMonotonic and one doesn't, which results in go calculating vastly different seconds for both.
I'm not the expert in time, so would appreciate some help. I was going to file a bug for Go, but thought to ask here from the experts if there's something obvious I might be missing.

I think I figured out what the reason for the weird behavior of your code snippet is and can provide a solution. The relevant docs read as follows:
since returns the time elapsed since t. It is shorthand for time.Now().Sub(t).
But:
now returns the current local time.
That means you are formatting timeOlder and subtract it from an unformatted local time. That of course causes unexpected behavior. A simple solution is to parse the local time according to your format before subtracting timeOlder from it.
A solution that works on my machine (it probably does not make a lot of sense to give a playground example, though):
func convertToTimeObject(dateStr string) time.Time {
layout := "2006-01-02T15:04:05.000Z"
t, err := time.Parse(layout, dateStr)
// check the error!
if err != nil {
log.Fatalf("error while parsing time: %s\n", err)
}
return t
}
func main() {
timeOlder := convertToTimeObject(time.Now().Add(-30 * time.Second).Format("2006-01-02T15:04:05.000Z"))
duration := time.Since(timeOlder)
// replace time.Since() with a correctly parsed time.Now(), because
// time.Since() returns the time elapsed since the current LOCAL time.
t := time.Now().Format("2006-01-02T15:04:05.000Z")
timeNow := convertToTimeObject(t)
// print the different results
fmt.Println("duration in seconds:", strconv.Itoa(int(duration.Seconds())))
fmt.Printf("duration: %v\n", timeNow.Sub(timeOlder))
}
Outputs:
duration in seconds: 14430
duration: 30s

Strange behavior of time.Since()

I am running a GO (1.9.2) program and I have code similar to:
startTime := time.Now()
...
...
fmt.Printf("%v (1) %v \n", user.uid, int64(time.Since(startTime)))
fmt.Printf("%v (F) %v \n", user.uid, int64(time.Since(startTime)))
(The two fmt statements are on consecutive lines)
I expected that the printout would be of similar time but here are some results of the print:
921 (1) 2000100
921 (F) 3040173800
(3 seconds)
360 (1) 2000100
360 (F) 1063060800
(1 second)
447 (1) 4000200
447 (F) 2564146700
(2.5 seconds)
The time difference is consistently high between the two printouts.
What could be the explanation of this phenomenon?
Extra info:
According to pprof there are ~15000 goroutines running at the time of the prints but most of them are waiting for incoming data on sockets.
I ran the code with GODEBUG=gctrace=1 but there aren't many GC printouts, not nearly as many as the number of printouts of my code.
EDIT:
It seems that storing the result of time.Since() into variables as suggested by #Verran solves the issue.
Changing to fmt to log didn't help but the prints are no longer synchronized.
It appears the "problem" is in the way fmt is handled in a high load environment. I hope someone could shed some light to what is going on here.

Don't think the problem is related to your posted code. To find the problem I suggest printing mem stats like number of GC calls and how much was spent in GC between two prints. I suggest to print startTime too to be 100% sure that two consecutive print belongs to same goroutine.

If you print the duration units (you are printing nanoseconds), you will see, as expected, it's just a fraction of a monotonic second later. For example,
package main
import (
"fmt"
"time"
)
func main() {
var user_uid string
startTime := time.Now()
since := time.Since(startTime)
fmt.Printf("%v (1) %v %v\n", user_uid, int64(since), since)
since = time.Since(startTime)
fmt.Printf("%v (F) %v %v\n", user_uid, int64(since), since)
}
Output:
(1) 142 142ns
(F) 22036 22.036µs
time.Since and fmt.Printf yield the processor, allowing other goroutines to run.

Go: why does time.Now().Hour/Minute/Second return a six digit number?

I am learning how to code in Go and trying to create a simple reminder function.
I want to display the current time as a regular 24 hour clock, XX.XX (hours, minutes).
I have saved the current time in a variable t and when I print it I find out that the time is 23.00 early November 2009. Fine, but when I print t.Hour and t.Minute the result is 132288.132480.
It is something similar when I print t.Seconds. I have not been able to figure out why this happens.
Roughly 2000 days have passed since but that is only 48k hours and 2880k minutes so the small difference between the hours and minutes in my result hints that the issue is something else.
I am running the code in the go playground.
My code:
package main
import (
"fmt"
"time"
)
func main() {
Remind("It's time to eat")
}
func Remind(text string) {
t := time.Now()
fmt.Println(t)
fmt.Printf("The time is %d.%d: ", t.Hour, t.Minute)
fmt.Printf(text)
fmt.Println()
}

You need to call t.Hour() instead of using it as a value.
Check out the source of time package here: https://golang.org/src/time/time.go?s=12994:13018#L390
399 // Hour returns the hour within the day specified by t, in the range [0, 23].
400 func (t Time) Hour() int {
401 return int(t.abs()%secondsPerDay) / secondsPerHour
402 }
403
When in doubt, you can quickly find an explanation by reading specific package source from official go packages page.

How do I find out which of 2 methods is faster?

I have 2 methods to trim the domain suffix from a subdomain and I'd like to find out which one is faster. How do I do that?
2 string trimming methods

You can use the builtin benchmark capabilities of go test.
For example (on play):
import (
"strings"
"testing"
)
func BenchmarkStrip1(b *testing.B) {
for br := 0; br < b.N; br++ {
host := "subdomain.domain.tld"
s := strings.Index(host, ".")
_ = host[:s]
}
}
func BenchmarkStrip2(b *testing.B) {
for br := 0; br < b.N; br++ {
host := "subdomain.domain.tld"
strings.TrimSuffix(host, ".domain.tld")
}
}
Store this code in somename_test.go and run go test -test.bench='.*'. For me this gives
the following output:
% go test -test.bench='.*'
testing: warning: no tests to run
PASS
BenchmarkStrip1 100000000 12.9 ns/op
BenchmarkStrip2 100000000 16.1 ns/op
ok 21614966 2.935s
The benchmark utility will attempt to do a certain number of runs until a meaningful time is
measured which is reflected in the output by the number 100000000. The code was run
100000000 times and each operation in the loop took 12.9 ns and 16.1 ns respectively.
So you can conclude that the code in BenchmarkStrip1 performed better.
Regardless of the outcome, it is often better to profile your program to see where the
real bottleneck is instead of wasting your time with micro benchmarks like these.
I would also not recommend writing your own benchmarking as there are some factors you might
not consider such as the garbage collector and running your samples long enough.

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio