Measuring time around a function is easy in Go.
But what if you need to measure it 5000 times per second in parallel?
I'm referring to "Correctly measure time duration in Go", which contains great answers about how to measure time in Go.
What is the cost of using time.Now() 5000 times per second or more?
While it may depend on the underlying OS, let's consider Linux.
Time measurement depends on the programming language and its implementation, the operating system and its implementation, the hardware architecture, implementation, and speed, and so on.
You need to focus on facts, not speculation. In Go, start with some benchmarks. For example,
since_test.go:
package main

import (
	"testing"
	"time"
)

var now time.Time

func BenchmarkNow(b *testing.B) {
	for N := 0; N < b.N; N++ {
		now = time.Now()
	}
}

var since time.Duration
var start time.Time

func BenchmarkSince(b *testing.B) {
	for N := 0; N < b.N; N++ {
		start = time.Now()
		since = time.Since(start)
	}
}
Output:
$ go test since_test.go -bench=. -benchtime=1s
goos: linux
goarch: amd64
BenchmarkNow-4 30000000 47.5 ns/op
BenchmarkSince-4 20000000 98.1 ns/op
PASS
ok command-line-arguments 3.536s
$ go version
go version devel +48c4eeeed7 Sun Mar 25 08:33:21 2018 +0000 linux/amd64
$ uname -srvio
Linux 4.13.0-37-generic #42-Ubuntu SMP Wed Mar 7 14:13:23 UTC 2018 x86_64 GNU/Linux
$ cat /proc/cpuinfo | grep 'model name' | uniq
model name : Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz
$
Now, ask yourself if 5,000 times per second is necessary, practical, and reasonable.
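As a rough reading of those numbers (my arithmetic, not part of the benchmark output): BenchmarkSince times a time.Now call plus a time.Since call, so time.Since alone costs about 98.1 - 47.5 ≈ 50 ns. At roughly 50 ns per call, 5,000 calls per second amount to about 250 µs of CPU time per second, i.e. around 0.025% of a single core.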
What are your benchmark results?
EDIT: It turns out this is not easy to reproduce. I think it may be a host OS or Linux distro issue. Trying it with different distros in Docker, but running on the same host, produces the same results.
EDIT 2: I altered the code in main.go based on the comments saying that the GC was causing this. Now it should definitely be rewriting the value over the previous one, so there shouldn't be any GC.
Basically, there are two things I am trying to understand.
The main thing is why these two functions, which do the same thing in the end, differ so much in speed. In one of them I make an array the size of the number of hashes I want (which uses more memory), and the other is just a for loop.
How could I speed up the regular for loop function to gain the speed benefit without using the extra memory? (if possible)
Results of Benchmark on my system:
The Array function completes in 3.6 seconds and the loop function takes 6.4 seconds.
go version go1.19.2 linux/amd64
goos: linux
goarch: amd64
pkg: github.com/gngenius02/shardedmapdb
cpu: AMD Ryzen 9 5900X 12-Core Processor
BenchmarkGetHashUsingArray10Million-24 1 3662003126 ns/op 2080037632 B/op 30000051 allocs/op
BenchmarkGetHashUsingLoop10Million-24 1 6462627155 ns/op 1920001352 B/op 30000022 allocs/op
PASS
The two functions in question are GetHashUsingArray and GetHashUsingLoop.
main.go:
package main

import (
	"crypto/sha256"
	"encoding/hex"
)

type HashArray []string

type HS struct {
	LastHash string
	HashList HashArray
}

func (h *HS) GetHashUsingArray() {
	hashit := func(s string) string {
		digest := sha256.Sum256([]byte(s))
		return hex.EncodeToString(digest[:])
	}
	hl := h.HashList
	for i := 1; i < len(hl); i++ {
		hl[i] = hashit(hl[i-1])
	}
	h.LastHash = hl[len(hl)-1]
}

func GetHashUsingLoop(s string, loops int) string {
	hashit := func(s *string) {
		digest := sha256.Sum256([]byte(*s))
		*s = hex.EncodeToString(digest[:])
	}
	hash := s
	for i := 0; i < loops; i++ {
		hashit(&hash)
	}
	return hash
}

func main() {}
main_test.go:
package main

import (
	"testing"
)

func BenchmarkGetHashUsingArray10Million(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		firstValue := "abc"
		hs := HS{"", make(HashArray, 10_000_001)}
		hs.HashList[0] = firstValue
		hs.GetHashUsingArray()
		if hs.LastHash != "bf34d93b4be2a313b06cdf9d805c5f3d140abd872c37199701fb1e43fe479923" {
			b.Error("Unexpected Result: " + hs.LastHash)
		}
	}
}

func BenchmarkGetHashUsingLoop10Million(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		firstValue := "abc"
		result := GetHashUsingLoop(firstValue, 10_000_000)
		if result != "bf34d93b4be2a313b06cdf9d805c5f3d140abd872c37199701fb1e43fe479923" {
			b.Error("Unexpected result: " + result)
		}
	}
}
I think it somehow relates either to your hardware or to your Go version. On my machine the results are completely different:
go version go1.18.1 darwin/arm64
% go test -bench=. -v
goos: darwin
goarch: arm64
pkg: github.com/foo/bar
BenchmarkGetHashUsingArray10Million
BenchmarkGetHashUsingArray10Million-8 1 1658941791 ns/op 2080012144 B/op 30000012 allocs/op
BenchmarkGetHashUsingLoop10Million
BenchmarkGetHashUsingLoop10Million-8 1 1391175042 ns/op 1920005816 B/op 30000063 allocs/op
PASS
Maybe it's worth checking with Go 1.18 to narrow the scope: if the same difference shows up with Go 1.18, it's hardware; if the difference is not that huge, then it's something that was introduced in Go 1.19.
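Not part of the answer above, but one hedged idea for the second question: GetHashUsingLoop pays for two allocations per iteration (the []byte conversion and the hex string). Below is a sketch that keeps the running value in reusable byte slices and converts to a string only once at the end; the function name is mine, and it is untested against the benchmark above.

package main

import (
	"crypto/sha256"
	"encoding/hex"
)

// GetHashUsingBuffer is a hypothetical variant of GetHashUsingLoop.
// After the first iteration the loop body should not allocate: the
// hex output buffer and the input buffer are both reused.
func GetHashUsingBuffer(s string, loops int) string {
	buf := []byte(s)                                 // current input, reused
	out := make([]byte, hex.EncodedLen(sha256.Size)) // reusable hex output
	for i := 0; i < loops; i++ {
		digest := sha256.Sum256(buf) // returns a [32]byte value
		hex.Encode(out, digest[:])
		buf = append(buf[:0], out...) // grows once to 64 bytes, then reuses the backing array
	}
	return string(buf) // single final conversion
}

func main() {}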
Is it good practice (or at least general practice) to have log.SetFlags(log.LstdFlags | log.Lshortfile) in production in Go? I wonder whether there is a performance or security issue in doing this in production, since it is not the default setting of the log package in Go. I still can't find any official reference, or even an opinion piece, on the matter.
As for the performance: yes, it has an impact; however, it is IMHO negligible, for various reasons.
Testing
Code
package main

import (
	"io/ioutil"
	"log"
	"testing"
)

func BenchmarkStdLog(b *testing.B) {
	// We do not want to benchmark the shell
	stdlog := log.New(ioutil.Discard, "", log.LstdFlags)
	for i := 0; i < b.N; i++ {
		stdlog.Println("foo")
	}
}

func BenchmarkShortfile(b *testing.B) {
	slog := log.New(ioutil.Discard, "", log.LstdFlags|log.Lshortfile)
	for i := 0; i < b.N; i++ {
		slog.Println("foo")
	}
}
Result
goos: darwin
goarch: amd64
pkg: stackoverflow.com/go/logbench
BenchmarkStdLog-4 3803840 277 ns/op 4 B/op 1 allocs/op
BenchmarkShortfile-4 1000000 1008 ns/op 224 B/op 3 allocs/op
Your mileage may vary, but the order of magnitude should be roughly equal.
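Where the extra cost comes from: with Lshortfile (or Llongfile) set, the log package has to look up the caller's file and line via runtime.Caller on every log call. A quick way to see that cost in isolation (this extra benchmark is my own, not part of the measurements above):

package main

import (
	"runtime"
	"testing"
)

var (
	file string
	line int
)

func BenchmarkCaller(b *testing.B) {
	for i := 0; i < b.N; i++ {
		// Report the file and line of this call site, which is the
		// same work Lshortfile adds to every log call.
		_, file, line, _ = runtime.Caller(0)
	}
}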
Why I think the impact is negligible
It is unlikely that logging will be the bottleneck of your application unless you write a shitload of logs; 99 times out of 100, the logging is not the bottleneck.
Get your application up and running, load test and profile it. You can still optimize then.
Hint: make sure you can scale out.
I ran the following Go code.
package main

import (
	"fmt"
	"strconv"
	"time"
)

func main() {
	i, err := strconv.ParseInt("1405544146", 10, 64)
	if err != nil {
		panic(err)
	}
	tm := time.Unix(i, 0).Format(time.RFC3339)
	fmt.Println(tm)
	fmt.Println(time.RFC3339)
}
Then the result on Linux is
2014-07-16T20:55:46Z
2006-01-02T15:04:05Z07:00
and on macOS is
2014-07-17T05:55:46+09:00
2006-01-02T15:04:05Z07:00
It's the same time, but the formatted results are different. Do you know the reason?
Don't jump to conclusions. Examine all the evidence. For instance, consider the local time zone.
Package time
import "time"
func Unix
func Unix(sec int64, nsec int64) Time
Unix returns the local Time corresponding to the given Unix time, sec
seconds and nsec nanoseconds since January 1, 1970 UTC.
For example,
package main

import (
	"fmt"
	"runtime"
	"strconv"
	"time"
)

func main() {
	i, err := strconv.ParseInt("1405544146", 10, 64)
	if err != nil {
		panic(err)
	}
	t := time.Unix(i, 0)
	fmt.Println(t)
	fmt.Println(t.Format(time.RFC3339))
	fmt.Println(time.RFC3339)
	fmt.Println(runtime.GOOS, runtime.GOARCH, runtime.Version())
}
Playground: https://play.golang.org/p/UH6o57YckiV
Output (Playground):
2014-07-16 20:55:46 +0000 UTC
2014-07-16T20:55:46Z
2006-01-02T15:04:05Z07:00
nacl amd64p32 go1.12
Output (Linux):
2014-07-16 16:55:46 -0400 EDT
2014-07-16T16:55:46-04:00
2006-01-02T15:04:05Z07:00
linux amd64 devel +5b68cb65d3 Thu Mar 28 23:49:52 2019 +0000
Different time zones (UTC versus EDT) so different formatted dates and times.
In your examples you have 2014-07-16T20:55:46Z and 2014-07-17T05:55:46+09:00, different time zones so different formatted dates and times.
2014-07-16T20:55:46Z and 2014-07-17T05:55:46+09:00 are the same instant in different time zones.
time.RFC3339 is a constant: https://golang.org/pkg/time/#RFC3339.
const (
	// ...
	RFC3339     = "2006-01-02T15:04:05Z07:00"
	RFC3339Nano = "2006-01-02T15:04:05.999999999Z07:00"
	// ...
)
Go uses numeric time zone offsets, and a Z in the layout triggers the ISO 8601 behavior.
Numeric time zone offsets format as follows:
-0700  ±hhmm
-07:00 ±hh:mm
-07    ±hh
Replacing the sign in the format with a Z triggers the ISO 8601 behavior of printing Z instead of an offset for the UTC zone:
Z0700  Z or ±hhmm
Z07:00 Z or ±hh:mm
Z07    Z or ±hh
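A quick illustration of the Z behavior (this snippet is mine, not from the documentation):

package main

import (
	"fmt"
	"time"
)

func main() {
	t := time.Date(2014, 7, 16, 20, 55, 46, 0, time.UTC)
	fmt.Println(t.Format("2006-01-02T15:04:05Z07:00")) // Z layout: 2014-07-16T20:55:46Z
	fmt.Println(t.Format("2006-01-02T15:04:05-07:00")) // numeric layout: 2014-07-16T20:55:46+00:00
}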
The reason two different OSes produce different outputs for the same input is the OS's time zone configuration; the time zone on the Linux box does not appear to be set.
Go attempts to pick up the time zone from the local OS. If it is not available, Go defaults to UTC. If you want consistent output from both macOS and Linux, ensure the time zone is the same. To do this explicitly, set the TZ environment variable; this works on Linux, macOS, etc.
$ go build -o gotime ./main.go
$ uname -s
Linux
$ TZ=CET ./gotime
2014-07-16T22:55:46+02:00
2006-01-02T15:04:05Z07:00
$ TZ="" ./gotime
2014-07-16T20:55:46Z
2006-01-02T15:04:05Z07:00
$ uname -s
Darwin
$ TZ=CET ./gotime
2014-07-16T22:55:46+02:00
2006-01-02T15:04:05Z07:00
$ TZ="" ./gotime
2014-07-16T20:55:46Z
2006-01-02T15:04:05Z07:00
This problem is due to the time location in Go.
On your Linux box the time location resolves to "UTC", while on macOS it is "Local"
("Local" uses the time location of the OS).
You can check the time location:
fmt.Println(time.Local) // on macOS this prints "Local"
If you want "UTC" results on macOS, use the In method:
lo, _ := time.LoadLocation("UTC")
tmUTC := time.Unix(1405544146, 0).In(lo).Format(time.RFC3339)
If you run this code on macOS, you get the same result as on Linux:
i, err := strconv.ParseInt("1405544146", 10, 64)
if err != nil {
	panic(err)
}
tm := time.Unix(i, 0).Format(time.RFC3339)
fmt.Println(tm)
lo, _ := time.LoadLocation("UTC")
tmUTC := time.Unix(i, 0).In(lo).Format(time.RFC3339)
fmt.Println(tmUTC)
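As a small aside (my note, not the answerer's): time.UTC is a predefined *time.Location, so you can avoid the ignored error from LoadLocation entirely:

tmUTC := time.Unix(i, 0).UTC().Format(time.RFC3339)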
I'm kind of a newbie in Go, and something confused me recently.
I have a piece of code (a simplified version is posted below) and I was trying to measure its performance. I did this in two ways: 1) a benchmark with the testing package, 2) manually logging the time.
Running the benchmark outputs a result
30000 55603 ns/op
which is fine, BUT... When I do 30k runs of the same function, logging the time for each iteration, I get output like this:
test took 0 ns
test took 0 ns
... ~10 records all the same
test took 1000100 ns
test took 0 ns
test took 0 ns
... lots of zeroes again
test took 0 ns
test took 1000000 ns
test took 0 ns
...
Doing the math shows that the average is indeed 55603 ns/op just as the benchmark claims.
OK, I said, I'm not that good at optimizing performance and not that into all the hardcore compiler stuff, but I guessed it might be random garbage collection. So I turned on the GC log, made sure it showed some output, then turned off the GC for good aaand... no garbage collection, but I see the same picture: some iterations take a million times longer(?).
It is 99% certain that my understanding of all this is wrong somewhere; maybe someone can point me in the right direction, or maybe someone knows for sure what the hell is going on? :)
P.S. Also, less than a nanosecond (0 ns) is somewhat surprising to me; that seems too fast, but the program does produce the result of the computation, so I don't know what to think anymore. T_T
EDIT 1: Answering Kenny Grant's question: I was using goroutines to implement a sort-of generator of values for laziness; I have now removed them and simplified the code. The issue is much less frequent now, but it is still reproducible.
Playground link: https://play.golang.org/p/UQMgtT4Jrf
The interesting thing is that this does not happen on the playground, but it still happens on my machine.
EDIT 2: I'm running Go 1.9 on win7 x64
EDIT 3: Thanks to the responses, I now know that this code cannot possibly work properly on the playground. I will repost the code snippet here so that we don't lose it. :)
type PrefType string

var types []PrefType = []PrefType{
	"TYPE1", "TYPE2", "TYPE3", "TYPE4", "TYPE5", "TYPE6",
}

func GetKeys(key string) []string {
	var result []string
	for _, t := range types {
		rr := doCalculations(t)
		for _, k := range rr {
			result = append(result, key+"."+k)
		}
	}
	return result
}

func doCalculations(prefType PrefType) []string {
	return []string{string(prefType) + "something", string(prefType) + "else"}
}

func test() {
	start := time.Now()
	keysPrioritized := GetKeys("spec_key")
	for _, k := range keysPrioritized {
		_ = fmt.Sprint(k)
	}
	fmt.Printf("test took %v ns\n", time.Since(start).Nanoseconds())
}

func main() {
	for i := 0; i < 30000; i++ {
		test()
	}
}
EDIT 4: I have tried the same on my laptop with Ubuntu 17.04; the output is reasonable, no zeroes and no millions. It seems like a Windows-specific issue in the compiler/runtime library. It would be great if someone could verify this on their machine (Win 7/8/10).
On Windows, for such a tiny duration, you don't have precise enough timestamps; Linux has more precise timestamps. By design, Go benchmarks run for at least one second. Go 1.9+ uses the monotonic (m) value to compute the duration.
On Windows:
timedur.go:
package main

import (
	"fmt"
	"os"
	"time"
)

type PrefType string

var types []PrefType = []PrefType{
	"TYPE1", "TYPE2", "TYPE3", "TYPE4", "TYPE5", "TYPE6",
}

func GetKeys(key string) []string {
	var result []string
	for _, t := range types {
		rr := doCalculations(t)
		for _, k := range rr {
			result = append(result, key+"."+k)
		}
	}
	return result
}

func doCalculations(prefType PrefType) []string {
	return []string{string(prefType) + "something", string(prefType) + "else"}
}

func test() {
	start := time.Now()
	keysPrioritized := GetKeys("spec_key")
	for _, k := range keysPrioritized {
		_ = fmt.Sprint(k)
	}
	end := time.Now()
	fmt.Printf("test took %v ns\n", time.Since(start).Nanoseconds())
	fmt.Println(start)
	fmt.Println(end)
	if end.Sub(start) < time.Microsecond {
		os.Exit(1)
	}
}

func main() {
	for i := 0; i < 30000; i++ {
		test()
	}
}
Output:
>go run timedur.go
test took 1026000 ns
2017-09-02 14:21:58.1488675 -0700 PDT m=+0.010003700
2017-09-02 14:21:58.1498935 -0700 PDT m=+0.011029700
test took 0 ns
2017-09-02 14:21:58.1538658 -0700 PDT m=+0.015002000
2017-09-02 14:21:58.1538658 -0700 PDT m=+0.015002000
exit status 1
>
On Linux:
Output:
$ go run timedur.go
test took 113641 ns
2017-09-02 14:52:02.917175333 +0000 UTC m=+0.001041249
2017-09-02 14:52:02.917287569 +0000 UTC m=+0.001153717
test took 23614 ns
2017-09-02 14:52:02.917600301 +0000 UTC m=+0.001466208
2017-09-02 14:52:02.917623585 +0000 UTC m=+0.001489354
test took 22814 ns
2017-09-02 14:52:02.917726364 +0000 UTC m=+0.001592236
2017-09-02 14:52:02.917748805 +0000 UTC m=+0.001614575
test took 21139 ns
2017-09-02 14:52:02.917818409 +0000 UTC m=+0.001684292
2017-09-02 14:52:02.917839184 +0000 UTC m=+0.001704954
test took 21478 ns
2017-09-02 14:52:02.917911899 +0000 UTC m=+0.001777712
2017-09-02 14:52:02.917932944 +0000 UTC m=+0.001798712
test took 31032 ns
<SNIP>
The results are comparable. They were run on the same machine, a dual-boot with Windows 10 and Ubuntu 16.04.
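As a side note, you can observe the clock's step size directly by spinning until time.Now reports a new value (this probe is my own sketch, not part of the answer):

package main

import (
	"fmt"
	"time"
)

func main() {
	// Print a few observed clock steps. On Linux the step is typically
	// very small; on Windows it can be a large fraction of a millisecond,
	// which explains the "0 ns" and "1000000 ns" readings above.
	for i := 0; i < 5; i++ {
		start := time.Now()
		t := start
		for t.Equal(start) {
			t = time.Now()
		}
		fmt.Println(t.Sub(start))
	}
}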
Best to eliminate GC, as logging it is obviously going to interfere with timings. The time package on the playground is fake, so this won't work there. Trying it locally, I get no times of 0 ns with your code as supplied; it looks like it is working as intended.
You should of course expect some variation in times. When I try it, the results are all within the same order of magnitude (very small times of about 0.000003779 s), but there is an occasional blip, sometimes up to double, even over 30 runs. Running timings at this resolution is unlikely to give you reliable results, as it depends on what else is running on the computer, on memory layout, etc. It is better to time long-running operations this way, rather than very short ones like this, and to time lots of operations and average them (see the sketch below); this is why the benchmark tool gives you an average over so many runs.
Since the timings are for operations taking very little time, and are not wildly different, I think this is normal behaviour with the code supplied. The 0 ns results are wrong, but they are probably a result of your previous use of goroutines; that is hard to judge without that code, as the code you provided doesn't give that result.
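To make the "average lots of operations" advice concrete, here is a minimal sketch (my code; work is a hypothetical stand-in for the question's test function): time the whole batch once, so a coarse clock tick is spread over all runs.

package main

import (
	"fmt"
	"time"
)

// work is a hypothetical stand-in for the question's test function.
func work() {
	s := ""
	for i := 0; i < 100; i++ {
		s += "x"
	}
	_ = s
}

func main() {
	const runs = 30000
	start := time.Now()
	for i := 0; i < runs; i++ {
		work()
	}
	// One duration measurement divided by the run count; individual
	// runs below the clock resolution no longer read as 0 ns.
	fmt.Println(time.Since(start) / runs)
}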
I have 2 methods to trim the domain suffix from a subdomain and I'd like to find out which one is faster. How do I do that?
2 string trimming methods
You can use the builtin benchmark capabilities of go test.
For example (on play):
package main

import (
	"strings"
	"testing"
)

func BenchmarkStrip1(b *testing.B) {
	for br := 0; br < b.N; br++ {
		host := "subdomain.domain.tld"
		s := strings.Index(host, ".")
		_ = host[:s]
	}
}

func BenchmarkStrip2(b *testing.B) {
	for br := 0; br < b.N; br++ {
		host := "subdomain.domain.tld"
		strings.TrimSuffix(host, ".domain.tld")
	}
}
Store this code in somename_test.go and run go test -test.bench='.*'. For me this gives
the following output:
% go test -test.bench='.*'
testing: warning: no tests to run
PASS
BenchmarkStrip1 100000000 12.9 ns/op
BenchmarkStrip2 100000000 16.1 ns/op
ok 21614966 2.935s
The benchmark utility will attempt to do a certain number of runs until a meaningful time is measured, which is reflected in the output by the number 100000000: the code was run 100000000 times, and each operation in the loop took 12.9 ns and 16.1 ns, respectively.
So you can conclude that the code in BenchmarkStrip1 performed better.
Regardless of the outcome, it is often better to profile your program to see where the
real bottleneck is instead of wasting your time with micro benchmarks like these.
I would also not recommend writing your own benchmarking code, as there are factors you might not consider, such as the garbage collector and running your samples long enough.
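For completeness, the standard tooling route for that profiling advice (these are stock go test and pprof flags, not something specific to this answer): record a CPU profile while benchmarking, then inspect it:

$ go test -test.bench='.*' -cpuprofile=cpu.out
$ go tool pprof cpu.out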