Update
Actually, it seems the benchmark was set up incorrectly. I followed the resource shared by user @Luke Joshua Park and now it works.
package main

import "testing"

func benchmarkBcrypt(i int, b *testing.B) {
	for n := 0; n < b.N; n++ {
		HashPassword("my pass", i)
	}
}

func BenchmarkBcrypt9(b *testing.B) {
	benchmarkBcrypt(9, b)
}

func BenchmarkBcrypt10(b *testing.B) {
	benchmarkBcrypt(10, b)
}

func BenchmarkBcrypt11(b *testing.B) {
	benchmarkBcrypt(11, b)
}

func BenchmarkBcrypt12(b *testing.B) {
	benchmarkBcrypt(12, b)
}

func BenchmarkBcrypt13(b *testing.B) {
	benchmarkBcrypt(13, b)
}

func BenchmarkBcrypt14(b *testing.B) {
	benchmarkBcrypt(14, b)
}
Output:
BenchmarkBcrypt9-4 30 39543095 ns/op
BenchmarkBcrypt10-4 20 79184657 ns/op
BenchmarkBcrypt11-4 10 158688315 ns/op
BenchmarkBcrypt12-4 5 316070133 ns/op
BenchmarkBcrypt13-4 2 631838101 ns/op
BenchmarkBcrypt14-4 1 1275047344 ns/op
PASS
ok go-playground 10.670s
Old incorrect benchmark
I have a small set of benchmark tests in Go and am curious what the recommended bcrypt cost is as of May 2018.
This is my benchmark file:
package main
import "testing"
func BenchmarkBcrypt10(b *testing.B){
HashPassword("my pass", 10)
}
func BenchmarkBcrypt12(b *testing.B){
HashPassword("my pass", 12)
}
func BenchmarkBcrypt13(b *testing.B){
HashPassword("my pass", 13)
}
func BenchmarkBcrypt14(b *testing.B){
HashPassword("my pass", 14)
}
func BenchmarkBcrypt15(b *testing.B){
HashPassword("my pass", 15)
}
and this is HashPassword() func inside main.go:
package main

import (
	"golang.org/x/crypto/bcrypt"
)

// HashPassword hashes password with bcrypt at the given cost.
func HashPassword(password string, cost int) (string, error) {
	bytes, err := bcrypt.GenerateFromPassword([]byte(password), cost)
	return string(bytes), err
}
The current output is:
go test -bench=.
BenchmarkBcrypt10-4 2000000000 0.04 ns/op
BenchmarkBcrypt12-4 2000000000 0.16 ns/op
BenchmarkBcrypt13-4 2000000000 0.32 ns/op
BenchmarkBcrypt14-4 1 1281338532 ns/op
BenchmarkBcrypt15-4 1 2558998327 ns/op
PASS
It seems that for bcrypt with a cost of 13 the time taken is 0.32 nanoseconds, while for a cost of 14 it is 1281338532 ns, or ~1.28 seconds,
which I believe is too much. What is the best bcrypt cost to use in the current year, 2018?
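A note on the sub-nanosecond figures above: they are what the testing harness reports when a benchmark function ignores b.N. The harness keeps raising b.N until a run lasts long enough, then divides the elapsed time by b.N, so a function that does its work only once ends up with a tiny time divided by a huge iteration count. Here is a minimal sketch of that effect. It uses only the standard library; slowWork is a hypothetical stand-in for HashPassword, and looped, unlooped, nsPerOp and compare are names made up for this illustration.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"testing"
)

// slowWork is a hypothetical stand-in for an expensive call such as
// HashPassword: it chains SHA-256 a fixed number of times.
func slowWork() [sha256.Size]byte {
	sum := sha256.Sum256([]byte("my pass"))
	for i := 0; i < 5000; i++ {
		sum = sha256.Sum256(sum[:])
	}
	return sum
}

// sink keeps the result observable so the work is not optimized away.
var sink [sha256.Size]byte

// looped does the work b.N times, as the testing package expects.
func looped(b *testing.B) {
	for n := 0; n < b.N; n++ {
		sink = slowWork()
	}
}

// unlooped ignores b.N and does the work once per invocation.
func unlooped(b *testing.B) {
	sink = slowWork()
}

// nsPerOp divides the total elapsed time by the iteration count the harness chose.
func nsPerOp(r testing.BenchmarkResult) float64 {
	return float64(r.T.Nanoseconds()) / float64(r.N)
}

// compare runs both benchmarks and returns their ns/op figures.
func compare() (loopedNs, unloopedNs float64) {
	testing.Init() // registers the testing flags; only needed outside "go test"
	return nsPerOp(testing.Benchmark(looped)), nsPerOp(testing.Benchmark(unlooped))
}

func main() {
	l, u := compare()
	fmt.Printf("looped:   %.2f ns/op\n", l)
	fmt.Printf("unlooped: %.6f ns/op\n", u)
}
```

The looped figure reflects the real cost of one call, while the unlooped figure collapses towards zero. That matches the output above: costs 10 to 13 finish a single call quickly, so the harness pushes the iteration count to its cap, whereas costs 14 and 15 already take over a second for one call and stay at 1 iteration.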
I'm not certain what's going on with Benchmark here. If you just time these, it works fine, and you can work out the right answer for you.
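package main

import (
	"fmt"
	"time"

	"golang.org/x/crypto/bcrypt"
)

func main() {
	cost := 10

	start := time.Now()
	// Surface any error instead of silently timing a failed call.
	if _, err := bcrypt.GenerateFromPassword([]byte("password"), cost); err != nil {
		panic(err)
	}
	// Elapsed wall-clock time in whole milliseconds.
	fmt.Println(time.Since(start).Milliseconds())
}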
For a work factor of 10, on my MacBook Pro I get 78ms. A work factor of 11 is 154ms, and 12 is 334ms. So we're seeing roughly doubling, as expected.
The goal is not a work factor; it's a time. You want as long as you can live with. In my experience (mostly working on client apps), 80-100ms is a nice target because compared to a network request it's undetectable to the user, while being massive in terms of brute-force attacks (so the default of 10 is ideal for my common use).
I generally avoid running password stretching on servers if I can help it, but this scale can be a reasonable trade-off between server impact and security. Remember that attackers may use something dramatically faster than a MacBook Pro, and may use multiple machines in parallel; I pick 80-100ms because of user experience trade-offs. (I perform password stretching on the client when I can get away with it, and then apply a cheap hash like SHA-256 on the server.)
But if you don't do this very often, or can spend more time on it, then longer is of course better, and on my MacBook Pro a work factor of 14 is about 1.2s, which I would certainly accept for some purposes.
But there's a reason that 10 is still the default. It's not an unreasonable value.
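To make "the goal is a time, not a work factor" concrete, here is a rough sketch of picking a cost from a time budget. Everything in it is an assumption for illustration: pickCost, modelDuration and costForMillis are hypothetical names, and modelDuration is a stand-in that simply doubles the 78ms I measured at cost 10 instead of actually calling bcrypt. In real use you would pass a measure function that times bcrypt.GenerateFromPassword on your own hardware.

```go
package main

import (
	"fmt"
	"time"
)

// pickCost returns the highest cost in [lo, hi] whose measured duration stays
// within target. It returns lo if even lo is too slow.
func pickCost(lo, hi int, target time.Duration, measure func(cost int) time.Duration) int {
	best := lo
	for c := lo; c <= hi; c++ {
		if measure(c) > target {
			break
		}
		best = c
	}
	return best
}

// modelDuration is a hypothetical stand-in for a real measurement: it assumes
// 78ms at cost 10 and a doubling for every further step. Valid for cost >= 10.
func modelDuration(cost int) time.Duration {
	return 78 * time.Millisecond * time.Duration(1<<uint(cost-10))
}

// costForMillis picks a cost between 10 and 15 for a budget in milliseconds.
func costForMillis(ms int) int {
	return pickCost(10, 15, time.Duration(ms)*time.Millisecond, modelDuration)
}

func main() {
	fmt.Println(costForMillis(100))  // prints 10
	fmt.Println(costForMillis(400))  // prints 12
	fmt.Println(costForMillis(1300)) // prints 14
}
```

Because each step roughly doubles the time, the loop stops at the first cost that overshoots the budget; there is no point measuring anything higher.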
EDIT: Turns out this is not easy to reproduce. I think it may be a Host OS or Linux Distro issue. Trying it with different distros in Docker, but running on the same host, produces the same results.
EDIT 2: Based on the comments saying that the GC was causing this, I altered the code in main.go. It should now definitely overwrite the previous value, so there shouldn't be any GC.
Basically, there are 2 things I am trying to understand.
The main thing I am trying to understand is why these two functions, which do the same thing in the end, differ so much in speed. In one of them I allocate an array sized to the number of hashes I want (which uses more memory); the other is just a for loop.
How could I speed up the regular for loop function to gain the speed benefit without using the extra memory? (if possible)
Results of Benchmark on my system:
The Array function completes in 3.6 seconds and the loop function takes 6.4 seconds.
go version go1.19.2 linux/amd64
goos: linux
goarch: amd64
pkg: github.com/gngenius02/shardedmapdb
cpu: AMD Ryzen 9 5900X 12-Core Processor
BenchmarkGetHashesArray10Million-24 1 3662003126 ns/op 2080037632 B/op 30000051 allocs/op
BenchmarkGetHashesLoop10Million-24 1 6462627155 ns/op 1920001352 B/op 30000022 allocs/op
PASS
The two functions in question are GetHashUsingArray and GetHashUsingLoop.
main.go:
package main
import (
"crypto/sha256"
"encoding/hex"
)
type HashArray []string
type HS struct {
LastHash string
HashList HashArray
}
func (h *HS) GetHashUsingArray() {
hashit := func(s string) string {
digest := sha256.Sum256([]byte(s))
return hex.EncodeToString(digest[:])
}
hl := h.HashList
for i := 1; i < len(hl); i++ {
(hl)[i] = hashit((hl)[i-1])
}
h.LastHash = hl[len(hl)-1]
}
func GetHashUsingLoop(s string, loops int) string {
hashit := func(s *string) {
digest := sha256.Sum256([]byte(*s))
*s = hex.EncodeToString(digest[:])
}
hash := s
for i := 0; i < loops; i++ {
hashit(&hash)
}
return hash
}
func main() {}
main_test.go:
package main
import (
"testing"
)
func BenchmarkGetHashUsingArray10Million(b *testing.B) {
b.ReportAllocs()
for i := 0; i < b.N; i++ {
firstValue := "abc"
hs := HS{"", make(HashArray, 10_000_001)}
hs.HashList[0] = firstValue
hs.GetHashUsingArray()
if hs.LastHash != "bf34d93b4be2a313b06cdf9d805c5f3d140abd872c37199701fb1e43fe479923" {
b.Error("Unexpected Result: " + hs.LastHash)
}
}
}
func BenchmarkGetHashUsingLoop10Million(b *testing.B) {
b.ReportAllocs()
for i := 0; i < b.N; i++ {
firstValue := "abc"
result := GetHashUsingLoop(firstValue, 10_000_000)
if result != "bf34d93b4be2a313b06cdf9d805c5f3d140abd872c37199701fb1e43fe479923" {
b.Error("Unexpected result: " + result)
}
}
}
I think it somehow relates either to your hardware or to your Go version. On my machine the results are completely different:
go version go1.18.1 darwin/arm64
% go test -bench=. -v
goos: darwin
goarch: arm64
pkg: github.com/foo/bar
BenchmarkGetHashUsingArray10Million
BenchmarkGetHashUsingArray10Million-8 1 1658941791 ns/op 2080012144 B/op 30000012 allocs/op
BenchmarkGetHashUsingLoop10Million
BenchmarkGetHashUsingLoop10Million-8 1 1391175042 ns/op 1920005816 B/op 30000063 allocs/op
PASS
It may be worth checking with Go 1.18 to narrow the scope: if the difference is the same as with Go 1.19, then it's the hardware; if the difference is not that large, then it's something that was introduced in Go 1.19.
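On the second part of the question (speeding up the plain loop without the big array), one thing that might be worth trying is to keep the running value in a single reusable byte slice, so each iteration avoids the string-to-bytes conversion and the new string from hex.EncodeToString. This is only a sketch and I have not benchmarked it on the asker's machine; GetHashUsingBuffer and getHashUsingStrings are names I made up, the latter being a copy of the string-based loop kept as a reference.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hexLen is the length of a hex-encoded SHA-256 digest.
const hexLen = sha256.Size * 2

// GetHashUsingBuffer mirrors GetHashUsingLoop but keeps the running value in
// one reusable byte slice instead of creating a new string per iteration.
func GetHashUsingBuffer(s string, loops int) string {
	buf := make([]byte, 0, hexLen)
	buf = append(buf, s...) // grows the slice if s is longer than hexLen
	for i := 0; i < loops; i++ {
		digest := sha256.Sum256(buf)
		buf = buf[:hexLen]         // capacity is at least hexLen
		hex.Encode(buf, digest[:]) // overwrite the previous value in place
	}
	return string(buf)
}

// getHashUsingStrings is the string-based loop, kept only as a reference.
func getHashUsingStrings(s string, loops int) string {
	hash := s
	for i := 0; i < loops; i++ {
		digest := sha256.Sum256([]byte(hash))
		hash = hex.EncodeToString(digest[:])
	}
	return hash
}

func main() {
	// Both variants must agree on the final hash.
	fmt.Println(GetHashUsingBuffer("abc", 1000) == getHashUsingStrings("abc", 1000)) // prints true
}
```

Whether this closes the gap depends on where the time actually goes, so it is worth running it under the same benchmark with b.ReportAllocs() and comparing the allocs/op column.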
I have this code:
package main
import ("fmt"
"math/rand"
"strconv"
)
func main() {
fmt.Println(strconv.Itoa(rand.Int()))
fmt.Println(strconv.Itoa(rand.Int()))
fmt.Println(strconv.Itoa(rand.Int()))
fmt.Println(strconv.Itoa(rand.Int()))
fmt.Println(strconv.Itoa(rand.Int()))
fmt.Println(strconv.Itoa(rand.Int()))
fmt.Println(strconv.Itoa(rand.Int()))
fmt.Println(strconv.Itoa(rand.Int()))
fmt.Println(strconv.Itoa(rand.Int()))
fmt.Println(strconv.Itoa(rand.Int()))
fmt.Println(strconv.Itoa(rand.Int()))
fmt.Println(strconv.Itoa(rand.Int()))
fmt.Println(strconv.Itoa(rand.Int()))
fmt.Println(strconv.Itoa(rand.Int()))
}
When I run it (go run code.go), I get the same values every time:
5577006791947779410
8674665223082153551
6129484611666145821
4037200794235010051
3916589616287113937
6334824724549167320
605394647632969758
1443635317331776148
894385949183117216
2775422040480279449
4751997750760398084
7504504064263669287
1976235410884491574
3510942875414458836
Second try:
package main
import ("fmt"
"math/rand"
"strconv"
)
func main() {
fmt.Println(strconv.Itoa(rand.Intn(100)))
fmt.Println(strconv.Itoa(rand.Intn(100)))
fmt.Println(strconv.Itoa(rand.Intn(100)))
fmt.Println(strconv.Itoa(rand.Intn(100)))
fmt.Println(strconv.Itoa(rand.Intn(100)))
fmt.Println(strconv.Itoa(rand.Intn(100)))
fmt.Println(strconv.Itoa(rand.Intn(100)))
}
Same behaviour. Every time it prints:
81
87
47
59
81
18
25
Is this a joke? Why does this happen?
The documentation here says nothing about getting the same non-random result on every run.
I can only see the term pseudo-random, with no explanation of what that means.
It looks like even bash is more logical and stable...
This is the C way.
You need to seed it. It says so right in the docs:
Random numbers are generated by a Source. Top-level functions, such as Float64 and Int, use a default shared Source that produces a deterministic sequence of values each time a program is run. Use the Seed function to initialize the default Source if different behavior is required for each run.
Generally:
rand.Seed(time.Now().UnixNano())
You don't appear to have called Seed for math/rand before using the generator.
If Seed is not called, the generator behaves as if seeded by Seed(1). That is not a joke - actually for a PRNG to be deterministic and repeatable is desirable in many cases.
For different numbers, seed with a different value, such as time.Now().UnixNano().
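A small sketch of the same idea with an explicit generator, so the effect of the seed is easy to see. The sequence helper is a name invented for this example; rand.New and rand.NewSource are the standard math/rand calls. Note that since Go 1.20 the top-level functions are seeded randomly at startup and rand.Seed is deprecated, so on newer toolchains an explicit source like this is the way to get a repeatable sequence.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// sequence returns n values in [0, 100) drawn from a generator seeded with seed.
func sequence(seed int64, n int) []int {
	r := rand.New(rand.NewSource(seed))
	out := make([]int, n)
	for i := range out {
		out[i] = r.Intn(100)
	}
	return out
}

func main() {
	fmt.Println(sequence(1, 5))                     // same values on every run
	fmt.Println(sequence(1, 5))                     // identical to the line above
	fmt.Println(sequence(time.Now().UnixNano(), 5)) // varies from run to run
}
```

A fixed seed is exactly what you want in tests and simulations that must be reproducible; a time-based seed is the usual choice when you just want different numbers each run.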
In addition to the other answers, there is a good explanation of this in the Medium article go-how-are-random-numbers-generated:
Go implements two packages to generate random numbers:
a pseudo-random number generator (PRNG) in the package math/rand
a cryptographic pseudorandom number generator (CPRNG), implemented in crypto/rand
If both generate random numbers, your choice will be based on a tradeoff between genuinely random numbers and performance.
As the other answers explain, math/rand produces its numbers from a seeded source. If you need a different sequence on each run, you must set the seed by calling rand.Seed().
Otherwise, you can use the crypto/rand package. The numbers it generates are not deterministic, but it is somewhat slower than math/rand.
I have added sample code for that below. You can run it and see different output each time.
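package main

import (
	"crypto/rand"
	"fmt"
	"math/big"
)

func main() {
	for i := 0; i < 4; i++ {
		// rand.Int returns a uniform random value in [0, 100).
		n, err := rand.Int(rand.Reader, big.NewInt(100))
		if err != nil {
			panic(err) // the system's secure random source failed
		}
		fmt.Println(n.Int64())
	}
}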
Is it good practice (or at least common practice) to use log.SetFlags(log.LstdFlags | log.Lshortfile) in production in Go? I wonder whether doing so causes any performance or security issue, since it is not the default setting of Go's log package. I still can't find any official reference, or even an opinion article, on the matter.
As for performance: yes, it has an impact, but it is IMHO negligible, for various reasons.
Testing
Code
package main

import (
	"io"
	"log"
	"testing"
)

func BenchmarkStdLog(b *testing.B) {
	// We do not want to benchmark the shell
	stdlog := log.New(io.Discard, "", log.LstdFlags)
	for i := 0; i < b.N; i++ {
		stdlog.Println("foo")
	}
}

func BenchmarkShortfile(b *testing.B) {
	// Same logger, plus the file:line prefix under test
	slog := log.New(io.Discard, "", log.LstdFlags|log.Lshortfile)
	for i := 0; i < b.N; i++ {
		slog.Println("foo")
	}
}
Result
goos: darwin
goarch: amd64
pkg: stackoverflow.com/go/logbench
BenchmarkStdLog-4 3803840 277 ns/op 4 B/op 1 allocs/op
BenchmarkShortfile-4 1000000 1008 ns/op 224 B/op 3 allocs/op
Your mileage may vary, but the order of magnitude should be roughly equal.
Why I think the impact is negligible
It is unlikely that logging will be the bottleneck of your application unless you write an enormous volume of logs. 99 times out of 100, logging is not the bottleneck.
Get your application up and running, load test and profile it. You can still optimize then.
Hint: make sure you can scale out.
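For reference, here is a small sketch of what the flag actually adds to each line. With log.Lshortfile the logger has to look up the caller's file name and line number for every message, which is where the extra time and allocations in the benchmark come from. logLine and shortfileLine are helper names made up for this example; the output is captured in a bytes.Buffer so it can be inspected.

```go
package main

import (
	"bytes"
	"fmt"
	"log"
)

// logLine writes msg through a logger configured with flags and returns the
// exact text the logger produced.
func logLine(flags int, msg string) string {
	var buf bytes.Buffer
	logger := log.New(&buf, "", flags)
	logger.Println(msg)
	return buf.String()
}

// shortfileLine logs msg with only the Lshortfile flag set.
func shortfileLine(msg string) string {
	return logLine(log.Lshortfile, msg)
}

func main() {
	fmt.Print(logLine(0, "foo"))  // message only
	fmt.Print(shortfileLine("foo")) // "<file>.go:<line>: " prefix, then the message
}
```

On the security side, the prefix only reveals source file names and line numbers, so whether that matters depends on who can read your logs.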
Assume you have 80 bytes of data and only the last 4 bytes change constantly. How would you efficiently hash the full 80 bytes in Go? In essence, the first 76 bytes stay the same while the last 4 bytes keep changing. Ideally, you want to keep a copy of the hash state for the first 76 bytes and only feed in the changing last 4.
You can try the following examples on the Go Playground. Benchmark results are at the end.
Note: the implementations below are not safe for concurrent use; I intentionally made them like this to be simpler and faster.
Fastest when using only public API (always hashes all input)
The general concept and interface of Go's hash algorithms is the hash.Hash interface. This interface does not let you save the state of the hasher and later return or rewind to the saved state. So using hash.Hash alone, you always have to calculate the hash from the start.
What the public API offers is to reuse an already constructed hasher to calculate the hash of a new input, using the Hash.Reset() method. This is nice so that no (memory) allocations will be needed to calculate multiple hash values. Also you may take advantage of the optional slice that may be passed to Hash.Sum() which is used to append the current hash to. This is nice so that no allocations will be needed to receive the hash results either.
Here's an example that takes advantage of these:
type Cached1 struct {
hasher hash.Hash
result [sha256.Size]byte
}
func NewCached1() *Cached1 {
return &Cached1{hasher: sha256.New()}
}
func (c *Cached1) Sum(data []byte) []byte {
c.hasher.Reset()
c.hasher.Write(data)
return c.hasher.Sum(c.result[:0])
}
Test data
We'll use the following test data:
var fixed = bytes.Repeat([]byte{1}, 76)
var variantA = []byte{1, 1, 1, 1}
var variantB = []byte{2, 2, 2, 2}
var data = append(append([]byte{}, fixed...), variantA...)
var data2 = append(append([]byte{}, fixed...), variantB...)
var c1 = NewCached1()
First let's get authentic results (to verify if our hasher works correctly):
fmt.Printf("%x\n", sha256.Sum256(data))
fmt.Printf("%x\n", sha256.Sum256(data2))
Output:
fb8e69bdfa2ad15be7cc8a346b74e773d059f96cfc92da89e631895422fe966a
10ef52823dad5d1212e8ac83b54c001bfb9a03dc0c7c3c83246fb988aa788c0c
Now let's check our Cached1 hasher:
fmt.Printf("%x\n", c1.Sum(data))
fmt.Printf("%x\n", c1.Sum(data2))
Output is the same:
fb8e69bdfa2ad15be7cc8a346b74e773d059f96cfc92da89e631895422fe966a
10ef52823dad5d1212e8ac83b54c001bfb9a03dc0c7c3c83246fb988aa788c0c
Even faster but may break (in future Go releases): hashes only the last 4 bytes
Now let's see a less flexible solution which truly calculates the hash of the fixed first 76 bytes only once.
The hasher of the crypto/sha256 package is the unexported sha256.digest type (more precisely a pointer to this type):
// digest represents the partial evaluation of a checksum.
type digest struct {
h [8]uint32
x [chunk]byte
nx int
len uint64
is224 bool // mark if this digest is SHA-224
}
A value of the digest struct type basically holds the current state of the hasher.
What we may do is feed the hasher the fixed first 76 bytes and then save this struct value. When we need to calculate the hash of some 80-byte data whose first 76 bytes are the same, we use this saved value as a starting point and then feed in the varying last 4 bytes.
Note that it's enough to simply save this struct value as it contains no pointers and no descriptor types like slices and maps. Otherwise we would also have to make a copy of those, but we're "lucky". So this solution would need adjustment if a future implementation of crypto/sha256 adds a pointer or slice field, for example.
Since sha256.digest is unexported, we can only use reflection (reflect package) to achieve our goals, which inherently will add some delays to computation.
Example implementation that does this:
type Cached2 struct {
origv reflect.Value
hasherv reflect.Value
hasher hash.Hash
result [sha256.Size]byte
}
func NewCached2(fixed []byte) *Cached2 {
h := sha256.New()
h.Write(fixed)
c := &Cached2{origv: reflect.ValueOf(h).Elem()}
hasherv := reflect.New(c.origv.Type())
c.hasher = hasherv.Interface().(hash.Hash)
c.hasherv = hasherv.Elem()
return c
}
func (c *Cached2) Sum(data []byte) []byte {
// Set state of the fixed hash:
c.hasherv.Set(c.origv)
c.hasher.Write(data)
return c.hasher.Sum(c.result[:0])
}
Testing it:
var c2 = NewCached2(fixed)
fmt.Printf("%x\n", c2.Sum(variantA))
fmt.Printf("%x\n", c2.Sum(variantB))
Output is again the same:
fb8e69bdfa2ad15be7cc8a346b74e773d059f96cfc92da89e631895422fe966a
10ef52823dad5d1212e8ac83b54c001bfb9a03dc0c7c3c83246fb988aa788c0c
So it works.
The "ultimate", fastest solution
Cached2 could be faster if reflection were not involved. If we want an even faster solution, we can simply copy the sha256.digest type and its methods into our own package, so we can use it directly without resorting to reflection.
If we do this, we will have access to the digest struct value, and we can simply make a copy of it like:
var d digest
// init d
saved := d
And restoring it is like:
d = saved
I simply "cloned" the crypto/sha256 package to my workspace, and changed / exported the digest type as Digest just for demonstration purposes. Then using this mysha256.Digest type I implemented Cached3 like this:
type Cached3 struct {
orig mysha256.Digest
result [sha256.Size]byte
}
func NewCached3(fixed []byte) *Cached3 {
var d mysha256.Digest
d.Reset()
d.Write(fixed)
return &Cached3{orig: d}
}
func (c *Cached3) Sum(data []byte) []byte {
// Make a copy of the fixed hash:
d := c.orig
d.Write(data)
return d.Sum(c.result[:0])
}
Testing it:
var c3 = NewCached3(fixed)
fmt.Printf("%x\n", c3.Sum(variantA))
fmt.Printf("%x\n", c3.Sum(variantB))
Output again is the same. So this works too.
Benchmarks
We can benchmark performance with this code:
func BenchmarkCached1(b *testing.B) {
for i := 0; i < b.N; i++ {
c1.Sum(data)
c1.Sum(data2)
}
}
func BenchmarkCached2(b *testing.B) {
for i := 0; i < b.N; i++ {
c2.Sum(variantA)
c2.Sum(variantB)
}
}
func BenchmarkCached3(b *testing.B) {
for i := 0; i < b.N; i++ {
c3.Sum(variantA)
c3.Sum(variantB)
}
}
Benchmark results (go test -bench . -benchmem):
BenchmarkCached1-4 1000000 1569 ns/op 0 B/op 0 allocs/op
BenchmarkCached2-4 2000000 926 ns/op 0 B/op 0 allocs/op
BenchmarkCached3-4 2000000 872 ns/op 0 B/op 0 allocs/op
Cached2 is approximately 41% faster than Cached1, which is quite noticeable and nice. Cached3 gives only a "little" performance boost over Cached2, another 6%. Overall, Cached3 is 44% faster than Cached1.
Also note that none of the solutions use any allocations which is also nice.
Conclusion
For that extra 40% or 44%, I would probably not go for the Cached2 or Cached3 solutions. Of course it really depends on how important the performance is to you. If it is important, I think the Cached2 solution presents a fine compromise between minimum added complexity and the noticeable performance gain. It does pose a threat as future Go implementations may break it; if it is a problem, Cached3 solves this by copying the current implementation (and also improves its performance a little).
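One more option worth knowing about: since Go 1.10 the standard library hashers (including the one returned by sha256.New()) also implement encoding.BinaryMarshaler and encoding.BinaryUnmarshaler, which lets you save and restore the internal state through exported interfaces, without reflection and without copying the package. Below is a minimal sketch along the lines of the types above. Cached4 and matchesDirect are names I made up for it, and I have not benchmarked it against Cached1 to Cached3, so treat the performance as unknown.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding"
	"errors"
	"fmt"
	"hash"
)

// Cached4 restores a saved hasher state through the exported
// encoding.BinaryMarshaler / encoding.BinaryUnmarshaler interfaces.
type Cached4 struct {
	state  []byte    // marshaled state after hashing the fixed prefix
	hasher hash.Hash // reused for every Sum call
	result [sha256.Size]byte
}

func NewCached4(fixed []byte) (*Cached4, error) {
	h := sha256.New()
	h.Write(fixed)
	m, ok := h.(encoding.BinaryMarshaler)
	if !ok {
		return nil, errors.New("hasher does not implement encoding.BinaryMarshaler")
	}
	state, err := m.MarshalBinary()
	if err != nil {
		return nil, err
	}
	return &Cached4{state: state, hasher: sha256.New()}, nil
}

// Sum returns the hash of fixed+data. Not safe for concurrent use.
func (c *Cached4) Sum(data []byte) ([]byte, error) {
	u, ok := c.hasher.(encoding.BinaryUnmarshaler)
	if !ok {
		return nil, errors.New("hasher does not implement encoding.BinaryUnmarshaler")
	}
	// Rewind the hasher to the state saved after the fixed prefix.
	if err := u.UnmarshalBinary(c.state); err != nil {
		return nil, err
	}
	c.hasher.Write(data)
	return c.hasher.Sum(c.result[:0]), nil
}

// matchesDirect reports whether Cached4 agrees with hashing fixed+variant in one go.
func matchesDirect(fixed, variant []byte) bool {
	c, err := NewCached4(fixed)
	if err != nil {
		return false
	}
	got, err := c.Sum(variant)
	if err != nil {
		return false
	}
	want := sha256.Sum256(append(append([]byte{}, fixed...), variant...))
	return bytes.Equal(got, want[:])
}

func main() {
	fixed := bytes.Repeat([]byte{1}, 76)
	fmt.Println(matchesDirect(fixed, []byte{1, 1, 1, 1})) // prints true
	fmt.Println(matchesDirect(fixed, []byte{2, 2, 2, 2})) // prints true
}
```

The trade-off compared with Cached2 and Cached3 is that this relies only on documented behaviour, so it should not break when the internals of crypto/sha256 change, at the price of an unmarshal step on every Sum call.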
I have 2 methods to trim the domain suffix from a subdomain and I'd like to find out which one is faster. How do I do that?
2 string trimming methods
You can use the built-in benchmark capabilities of go test.
For example (on play):
package main

import (
	"strings"
	"testing"
)

func BenchmarkStrip1(b *testing.B) {
	for br := 0; br < b.N; br++ {
		host := "subdomain.domain.tld"
		// Cut the host at its first dot.
		s := strings.Index(host, ".")
		_ = host[:s]
	}
}

func BenchmarkStrip2(b *testing.B) {
	for br := 0; br < b.N; br++ {
		host := "subdomain.domain.tld"
		// Remove the known suffix.
		_ = strings.TrimSuffix(host, ".domain.tld")
	}
}
Store this code in somename_test.go and run go test -test.bench='.*'. For me this gives
the following output:
% go test -test.bench='.*'
testing: warning: no tests to run
PASS
BenchmarkStrip1 100000000 12.9 ns/op
BenchmarkStrip2 100000000 16.1 ns/op
ok 21614966 2.935s
The benchmark utility will attempt to do a certain number of runs until a meaningful time is
measured which is reflected in the output by the number 100000000. The code was run
100000000 times and each operation in the loop took 12.9 ns and 16.1 ns respectively.
So you can conclude that the code in BenchmarkStrip1 performed better.
Regardless of the outcome, it is often better to profile your program to see where the
real bottleneck is instead of wasting your time with micro benchmarks like these.
I would also not recommend writing your own benchmarking as there are some factors you might
not consider such as the garbage collector and running your samples long enough.
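Two caveats about the comparison itself are worth a sketch: the two methods are only interchangeable for hosts of exactly the shape used in the benchmark, and a benchmark should keep its result in a package-level variable so the call has an observable effect. The code below is an illustration under those assumptions; strip1, strip2, sink and benchStrip are names I introduced, and strip1 adds a guard for hosts without a dot, where the unguarded slice would panic.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// strip1 cuts the host at its first dot. It returns host unchanged when there
// is no dot, where an unguarded host[:i] would panic.
func strip1(host string) string {
	if i := strings.Index(host, "."); i >= 0 {
		return host[:i]
	}
	return host
}

// strip2 removes one known suffix.
func strip2(host string) string {
	return strings.TrimSuffix(host, ".domain.tld")
}

// sink receives every result so the calls have an observable effect.
var sink string

// benchStrip runs f under the testing harness outside of "go test".
func benchStrip(f func(string) string) testing.BenchmarkResult {
	testing.Init() // registers the testing flags; only needed outside "go test"
	return testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink = f("subdomain.domain.tld")
		}
	})
}

func main() {
	// Same answer for the benchmarked input...
	fmt.Println(strip1("subdomain.domain.tld"), strip2("subdomain.domain.tld"))
	// ...but not for a nested subdomain.
	fmt.Println(strip1("a.b.domain.tld"), strip2("a.b.domain.tld"))
	fmt.Println("strip1:", benchStrip(strip1))
	fmt.Println("strip2:", benchStrip(strip2))
}
```

So before picking the faster one, decide which behaviour you actually need for inputs like a.b.domain.tld or a bare hostname.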