How to check if []byte is all zeros in Go

Is there a way to check whether a byte slice is empty or all zeros without checking each element or using reflect?
theByteVar := make([]byte, 128)
if "theByteVar is empty or zeroes" { // pseudocode condition
    doSomething()
}
One solution I found, which seems odd, is to keep an empty byte slice around for comparison.
theByteVar := make([]byte, 128)
emptyByteVar := make([]byte, 128)

// fill with anything
theByteVar[1] = 2

if reflect.DeepEqual(theByteVar, emptyByteVar) == false {
    doSomething(theByteVar)
}
Surely there must be a better/quicker solution.
Thanks
UPDATE: I ran a comparison over 1000 loops, and the reflect approach is the worst by far...
Equal Loops: 1000 in true in 19.197µs
Contains Loops: 1000 in true in 34.507µs
AllZero Loops: 1000 in true in 117.275µs
Reflect Loops: 1000 in true in 14.616277ms

Comparing against another slice that contains only zeros requires reading (and comparing) two slices.
Using a single loop over one slice is more efficient:
for _, v := range theByteVar {
    if v != 0 {
        doSomething(theByteVar)
        break
    }
}
If you do need to use it in multiple places, wrap it in a utility function:
func allZero(s []byte) bool {
    for _, v := range s {
        if v != 0 {
            return false
        }
    }
    return true
}
And then using it:
if !allZero(theByteVar) {
    doSomething(theByteVar)
}

Another solution borrows an idea from C, using Go's unsafe package.
The idea is simple: instead of checking each byte of the []byte individually, we can read data[i:i+8] as a single uint64 value and check 8 bytes per iteration.
The code below is not best practice; it only shows the idea.
func IsAllBytesZero(data []byte) bool {
    n := len(data)

    // Round n down to the largest multiple of 8.
    nlen8 := n &^ 7
    i := 0

    // Check 8 bytes per iteration. i is a byte offset advancing in steps
    // of 8, so it is added to the base pointer directly. (This assumes the
    // slice base is suitably aligned for a uint64 read, which holds for
    // slices obtained from make.)
    for ; i < nlen8; i += 8 {
        b := *(*uint64)(unsafe.Pointer(uintptr(unsafe.Pointer(&data[0])) + uintptr(i)))
        if b != 0 {
            return false
        }
    }

    // Check the remaining tail bytes one at a time.
    for ; i < n; i++ {
        if data[i] != 0 {
            return false
        }
    }

    return true
}
Benchmark
Testcases:
Only the worst case is tested (all elements are zero).
Methods:
IsAllBytesZero: the unsafe package solution
NaiveCheckAllBytesAreZero: a loop that iterates over the whole byte slice and checks each element
CompareAllBytesWithFixedEmptyArray: the bytes.Compare solution with a pre-allocated, fixed-size empty byte array
CompareAllBytesWithDynamicEmptyArray: the bytes.Compare solution that allocates the empty byte array on each call
Results
BenchmarkIsAllBytesZero10-8 254072224 4.68 ns/op
BenchmarkIsAllBytesZero100-8 132266841 9.09 ns/op
BenchmarkIsAllBytesZero1000-8 19989015 55.6 ns/op
BenchmarkIsAllBytesZero10000-8 2344436 507 ns/op
BenchmarkIsAllBytesZero100000-8 1727826 679 ns/op
BenchmarkNaiveCheckAllBytesAreZero10-8 234153582 5.15 ns/op
BenchmarkNaiveCheckAllBytesAreZero100-8 30038720 38.2 ns/op
BenchmarkNaiveCheckAllBytesAreZero1000-8 4300405 291 ns/op
BenchmarkNaiveCheckAllBytesAreZero10000-8 407547 2666 ns/op
BenchmarkNaiveCheckAllBytesAreZero100000-8 43382 27265 ns/op
BenchmarkCompareAllBytesWithFixedEmptyArray10-8 415171356 2.71 ns/op
BenchmarkCompareAllBytesWithFixedEmptyArray100-8 218871330 5.51 ns/op
BenchmarkCompareAllBytesWithFixedEmptyArray1000-8 56569351 21.0 ns/op
BenchmarkCompareAllBytesWithFixedEmptyArray10000-8 6592575 177 ns/op
BenchmarkCompareAllBytesWithFixedEmptyArray100000-8 567784 2104 ns/op
BenchmarkCompareAllBytesWithDynamicEmptyArray10-8 64215448 19.8 ns/op
BenchmarkCompareAllBytesWithDynamicEmptyArray100-8 32875428 35.4 ns/op
BenchmarkCompareAllBytesWithDynamicEmptyArray1000-8 8580890 140 ns/op
BenchmarkCompareAllBytesWithDynamicEmptyArray10000-8 1277070 938 ns/op
BenchmarkCompareAllBytesWithDynamicEmptyArray100000-8 121256 10355 ns/op
Summary
Assuming we're dealing with byte arrays that are mostly (or entirely) zero, the benchmark shows that the naive check is a bad idea if performance matters. And if you don't want to use the unsafe package in your project, consider the bytes.Compare solution with a pre-allocated empty array as an alternative.
An interesting point worth noting is that the performance of the unsafe solution varies a lot, but it basically outperforms all the other solutions mentioned above. I think this is related to the CPU cache mechanism.

You can use bytes.Equal or bytes.Contains to compare against a zero-initialized byte slice; see https://play.golang.org/p/mvUXaTwKjP. I haven't checked the performance, but it has hopefully been optimized. You might want to try out the other solutions and compare the numbers, if needed.
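For illustration, a minimal sketch of the bytes.Equal approach, assuming (as in the question) that the slice length is fixed at 128 so the all-zero comparand can be allocated once up front; the names zeros and isAllZero are mine, not from the question:

import "bytes"

// zeros is allocated once; bytes.Equal compares lengths and contents
// without copying either slice.
var zeros = make([]byte, 128)

func isAllZero(theByteVar []byte) bool {
    return bytes.Equal(theByteVar, zeros)
}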

I think it is better (faster) to use a bitwise OR instead of an if condition inside the loop:
func isZero(bytes []byte) bool {
    b := byte(0)
    for _, s := range bytes {
        b |= s
    }
    return b == 0
}
One can optimize this even further by using the uint64 idea mentioned in the earlier answer.
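For illustration, here is a sketch of that combination that avoids unsafe by reading 8-byte chunks with encoding/binary; the name isZeroFast is illustrative, not from the answers above:

import "encoding/binary"

// isZeroFast ORs 8 bytes at a time into a uint64 accumulator, then ORs
// any remaining tail bytes into a byte accumulator; the input is all
// zeros iff both accumulators stay zero.
func isZeroFast(data []byte) bool {
    var acc uint64
    for len(data) >= 8 {
        acc |= binary.LittleEndian.Uint64(data[:8])
        data = data[8:]
    }
    var b byte
    for _, v := range data {
        b |= v
    }
    return acc == 0 && b == 0
}

Like the isZero above, this version never short-circuits: it trades the branch inside the loop for a little extra reading on non-zero inputs.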

Related

String reverse with []rune and strings.Builder slower than same with byte array [closed]

I tried/benchmarked three ways to reverse a string.
With runes and an in-place swap, as suggested in multiple places like here:
func ReverseWithRune(str string) string {
    runes := []rune(str)
    for i, j := len(runes)-1, 0; j < i; i, j = i-1, j+1 {
        runes[i], runes[j] = runes[j], runes[i]
    }
    return string(runes)
}
With strings.Builder
func ReverseWithBuilder(str string) string {
    var ret strings.Builder
    strLen := len(str)
    ret.Grow(strLen)
    for i := strLen - 1; i >= 0; i-- {
        ret.WriteByte(str[i])
    }
    return ret.String()
}
With a byte array, filled from the end of the input string:
func ReverseWithByteArray(str string) string {
    strLen := len(str)
    ret := make([]byte, strLen)
    for i := strLen - 1; i >= 0; i-- {
        ret[strLen-i-1] = str[i]
    }
    return string(ret)
}
I thought ReverseWithRune and ReverseWithBuilder should be faster than ReverseWithByteArray: no copy, fewer allocations, etc. But the benchmarks say otherwise; for both small (9) and large (99999) string lengths, ReverseWithByteArray is always faster.
RuneArr_9-4 9545110 111.3 ns/op 16 B/op 1 allocs/op
StringBuil_9-4 24685213 40.79 ns/op 16 B/op 1 allocs/op
ByteArr_9-4 23045233 52.35 ns/op 32 B/op 2 allocs/op
Rune_99999-4 1110 1002334 ns/op 507904 B/op 2 allocs/op
StringBuil_99999-4 6679 179333 ns/op 106496 B/op 1 allocs/op
ByteArr_99999-4 12200 97876 ns/op 212992 B/op 2 allocs/op
My questions are:
Why are the rune and builder approaches not faster, despite fewer iterations (length/2), being in place, etc.?
And the obvious follow-up: can the rune/builder approaches be improved? Maybe I am using them wrong here.
Details
goos: linux goarch: amd64
cpu: Intel(R) Core(TM) i5-7200U CPU @ 2.50GHz
go version: go1.19.2 linux/amd64
I tried looking at the profile, and it shows slicerunetostring and strings.Builder.WriteByte as the main heavy operations.
The byte-array reverse is also large, but that is because it has more operations overall.
It is not so relevant which one is faster, since two out of three are incorrect. Only ReverseWithRune is correct.
The others manipulate the individual bytes of the string. In Go, strings are UTF-8 encoded, and UTF-8 is a multibyte encoding. Characters with a codepoint outside the ASCII range are encoded using more than one byte, and if you reverse them byte by byte, you break those characters: the multibyte encoding read in reverse does not mean the same thing as in the forward direction.
You can check it easily on the Go playground, use the string "Hello, 世界" that is provided in their standard example:
https://go.dev/play/p/oYfPsO-C_OR
Output:
ReverseWithByteArray ��疸� ,olleH
ReverseWithBuilder ��疸� ,olleH
ReverseWithRune 界世 ,olleH
As to why ReverseWithRune is slower: it involves creating a slice of runes (32-bit integers) and copying the string into it while decoding the character boundaries from the UTF-8 encoding. Then you reverse it, and after that a new byte array is allocated and the rune slice is encoded back into UTF-8.
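If this shows up hot in a profile, one middle ground (a sketch I am adding for illustration; reverseUTF8 is not from the answers above, and it assumes valid UTF-8 input) is to decode runes left to right and encode each one at the mirrored offset in an output buffer, skipping the []rune round trip:

import "unicode/utf8"

// reverseUTF8 reverses a string rune by rune without building a []rune:
// each decoded rune is encoded at the mirrored position in the output.
func reverseUTF8(s string) string {
    out := make([]byte, len(s))
    i := len(s) // exclusive end of the not-yet-written region
    for _, r := range s {
        n := utf8.RuneLen(r)
        utf8.EncodeRune(out[i-n:], r)
        i -= n
    }
    return string(out) // e.g. reverseUTF8("Hello, 世界") == "界世 ,olleH"
}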

Which is the most efficient nil value?

Hi, while doing some exercises I came across this question...
Let's say you have a map with a capacity of 100,000.
Which value type is the most efficient for filling the whole map in the least amount of time?
I ran some benchmarks on my own, trying most of the types I could think of, and the resulting top list is:
Benchmark_Struct-8 200 6010422 ns/op (struct{}{})
Benchmark_Byte-8 200 6167230 ns/op (byte = 0)
Benchmark_Int-8 200 6112927 ns/op (int8 = 0)
Benchmark_Bool-8 200 6117155 ns/op (bool = false)
Example function:
func Struct() {
    m := make(map[int]struct{}, 100000)
    for i := 0; i < 100000; i++ {
        m[i] = struct{}{}
    }
}
As you can see, the fastest one (most of the time) is struct{}{}, the empty struct.
But why is this the case in Go?
Is there a faster/lighter nil or non-nil value?
- Thank you for your time :)
Theoretically, struct{}{} should be the most efficient because it requires no memory. In practice, a) results may vary between Go versions, operating systems, and system architectures; and b) I can't think of any case where maximizing the execution-time efficiency of empty values is relevant.
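For context, a minimal usage sketch of the empty-struct set (illustrative code, not from the question); the comma-ok form is needed for membership tests, since the value itself carries no information:

// struct{}{} occupies zero bytes, so only the keys consume storage.
set := make(map[int]struct{}, 100000)
set[42] = struct{}{}

// Membership is tested with the comma-ok form.
if _, ok := set[42]; ok {
    // 42 is a member
}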

Compare string and byte slice in Go without copy

What is the best way to check that a Go string and a byte slice contain the same bytes? The simplest, str == string(byteSlice), is inefficient, as it copies byteSlice first.
I was looking for a version of Equal(a, b []byte) that takes a string as its argument, but could not find anything suitable.
Starting with Go 1.5, the compiler optimizes string(bytes) when it is compared to a string, using a stack-allocated temporary. Thus, since Go 1.5,
str == string(byteSlice)
has been the canonical and efficient way to compare a string to a byte slice.
The Go Programming Language Specification
String types
A string type represents the set of string values. A string value is a
(possibly empty) sequence of bytes. The predeclared string type is
string.
The length of a string s (its size in bytes) can be discovered using
the built-in function len. A string's bytes can be accessed by integer
indices 0 through len(s)-1.
For example,
package main

import "fmt"

func equal(s string, b []byte) bool {
    if len(s) != len(b) {
        return false
    }
    for i, x := range b {
        if x != s[i] {
            return false
        }
    }
    return true
}

func main() {
    s := "equal"
    b := []byte(s)
    fmt.Println(equal(s, b))
    s = "not" + s
    fmt.Println(equal(s, b))
}
Output:
true
false
If you're comfortable enough with the fact that this can break on a later release (doubtful though), you can use unsafe:
// unsafeCompare reinterprets the string header as a slice header;
// bytes.Compare reads only the data pointer and length, not cap.
func unsafeCompare(a string, b []byte) int {
    abp := *(*[]byte)(unsafe.Pointer(&a))
    return bytes.Compare(abp, b)
}

// unsafeEqual reinterprets the slice header as a string header
// (the slice's cap field is simply ignored).
func unsafeEqual(a string, b []byte) bool {
    bbp := *(*string)(unsafe.Pointer(&b))
    return a == bbp
}
playground
Benchmarks:
// using:
// aaa = strings.Repeat("a", 100)
// bbb = []byte(strings.Repeat("a", 99) + "b")
// go 1.5
BenchmarkCopy-8 20000000 75.4 ns/op
BenchmarkPetersEqual-8 20000000 83.1 ns/op
BenchmarkUnsafe-8 100000000 12.2 ns/op
BenchmarkUnsafeEqual-8 200000000 8.94 ns/op
// go 1.4
BenchmarkCopy 10000000 233 ns/op
BenchmarkPetersEqual 20000000 72.3 ns/op
BenchmarkUnsafe 100000000 15.5 ns/op
BenchmarkUnsafeEqual 100000000 10.7 ns/op
There is no reason to use the unsafe package or anything similar just to compare a []byte and a string. The Go compiler is clever enough now to optimize such conversions.
Here's a benchmark:
BenchmarkEqual-8 172135624 6.96 ns/op <--
BenchmarkUnsafe-8 179866616 6.65 ns/op <--
BenchmarkUnsafeEqual-8 175588575 6.85 ns/op <--
BenchmarkCopy-8 23715144 47.3 ns/op
BenchmarkPetersEqual-8 24709376 47.3 ns/op
Just convert a byte slice to a string and compare:
var (
    aaa = strings.Repeat("a", 100)
    bbb = []byte(strings.Repeat("a", 99) + "b")
)

func BenchmarkEqual(b *testing.B) {
    for i := 0; i < b.N; i++ {
        _ = aaa == string(bbb)
    }
}

Different performances in Go slices resize

I'm spending some time experimenting with Go's internals, and I ended up writing my own implementation of a stack using slices.
As correctly pointed out by a reddit user in this post, and as outlined by another user in this SO answer, Go already tries to optimize slice resizing.
It turns out, though, that I see performance gains using my own implementation of slice growing rather than sticking with the default one.
This is the structure I use for holding the stack:
type Stack struct {
    slice     []interface{}
    blockSize int
}

const s_DefaultAllocBlockSize = 20
This is my own implementation of the Push method
func (s *Stack) Push(elem interface{}) {
    if len(s.slice)+1 == cap(s.slice) {
        slice := make([]interface{}, 0, len(s.slice)+s.blockSize)
        copy(slice, s.slice)
        s.slice = slice
    }
    s.slice = append(s.slice, elem)
}
This is the plain implementation:
func (s *Stack) Push(elem interface{}) {
    s.slice = append(s.slice, elem)
}
Running the benchmarks I implemented using Go's testing package, my own implementation performs this way:
Benchmark_PushDefaultStack 20000000 87.7 ns/op 24 B/op 1 allocs/op
While relying on the plain append, the results are the following:
Benchmark_PushDefaultStack 10000000 209 ns/op 90 B/op 1 allocs/op
The machine I ran the tests on is an early 2011 MacBook Pro, 2.3 GHz Intel Core i5 with 8 GB of 1333 MHz DDR3 RAM.
EDIT
The actual question is: is my implementation really faster than the default append behavior? Or am I not taking something into account?
Reading your code, tests, benchmarks, and results, it's easy to see that they are flawed. A full code review is beyond the scope of StackOverflow.
One specific bug:
// Push pushes a new element to the stack
func (s *Stack) Push(elem interface{}) {
    if len(s.slice)+1 == cap(s.slice) {
        slice := make([]interface{}, 0, len(s.slice)+s.blockSize)
        copy(slice, s.slice)
        s.slice = slice
    }
    s.slice = append(s.slice, elem)
}
Should be:
// Push pushes a new element to the stack
func (s *Stack) Push(elem interface{}) {
    if len(s.slice)+1 == cap(s.slice) {
        slice := make([]interface{}, len(s.slice), len(s.slice)+s.blockSize)
        copy(slice, s.slice)
        s.slice = slice
    }
    s.slice = append(s.slice, elem)
}
The Go Programming Language Specification: Appending to and copying slices
The function copy copies slice elements from a source src to a destination dst and returns the number of elements copied. The number of elements copied is the minimum of len(src) and len(dst).
You copied 0 elements; you should have copied len(s.slice).
As expected, your Push algorithm is inordinately slow:
append:
Benchmark_PushDefaultStack-4 2000000 941 ns/op 49 B/op 1 allocs/op
alediaferia:
Benchmark_PushDefaultStack-4 100000 1246315 ns/op 42355 B/op 1 allocs/op
This is how append works: append complexity.
There are other things wrong too. Your benchmark results are often not valid.
I believe your example is faster because you have a fairly small data set and are allocating with an initial capacity of 0. In your version of append, you preempt a large number of allocations by growing the block size more dramatically early on (by 20), circumventing the (in this case) expensive reallocations that take you through all the trivially small capacities: 0, 1, 2, 4, 8, 16, 32, 64, etc.
If your data sets were a lot larger, this would likely be marginalized by the cost of large copies. I've seen a lot of misuse of slices in Go. The clear performance win comes from making your slice with a reasonable initial capacity, as sketched below.
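A minimal sketch of that advice (the capacity 1024 is an illustrative guess at the expected workload, not a recommendation):

// Pre-sizing skips the intermediate grow-and-copy steps entirely.
stack := make([]interface{}, 0, 1024)
for i := 0; i < 1024; i++ {
    stack = append(stack, i) // no reallocation until cap is exceeded
}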

Minimum value of set in idiomatic Go

How do I write a function that returns the minimum value of a set in Go? I am not just looking for a solution (I know I could initialize the min to the first element while iterating and track that with a boolean flag) but rather an idiomatic one. Since Go doesn't have native sets, assume we have a map[Cell]bool.
Maps are the idiomatic way to implement sets in Go. Idiomatic code uses either bool or struct{} as the map's value type. The latter uses less storage, but requires a little more typing at the keyboard to use.
Assuming that the maximum value for a cell is maxCell, then this function will compute the min:
func min(m map[Cell]bool) Cell {
    min := maxCell
    for k := range m {
        if k < min {
            min = k
        }
    }
    return min
}
If Cell is a numeric type, then maxCell can be set to one of the math constants.
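For illustration (an assumed snippet; pick the constant that matches Cell's underlying type):

// If Cell is a signed integer type:
const maxCell = Cell(math.MaxInt64) // import "math"

// If Cell is unsigned, all bits set is the maximum value:
// const maxCell = ^Cell(0)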
Any solution using a map will require a loop over the keys.
You can keep a heap in addition to the map to find a minimum. This will require more storage and code, but can be more efficient depending on the size of the set and how often the minimum function is called.
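A sketch of that map-plus-heap combination (illustrative; it assumes Cell is an ordered numeric type, and deletion is omitted for brevity):

import "container/heap"

type Cell int

// cellHeap implements heap.Interface as a min-heap of Cells.
type cellHeap []Cell

func (h cellHeap) Len() int            { return len(h) }
func (h cellHeap) Less(i, j int) bool  { return h[i] < h[j] }
func (h cellHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *cellHeap) Push(x interface{}) { *h = append(*h, x.(Cell)) }
func (h *cellHeap) Pop() interface{} {
    old := *h
    x := old[len(old)-1]
    *h = old[:len(old)-1]
    return x
}

// CellSet pairs a membership map with a min-heap so Min is O(1).
type CellSet struct {
    members map[Cell]bool
    order   cellHeap
}

func (s *CellSet) Add(c Cell) {
    if !s.members[c] {
        s.members[c] = true
        heap.Push(&s.order, c) // O(log n)
    }
}

func (s *CellSet) Min() Cell {
    return s.order[0] // panics if the set is empty
}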
A different approach: depending on how big your set is, using a self-sorting slice can be more efficient:
type Cell uint64

type CellSet struct {
    cells []Cell
}

func (cs *CellSet) Len() int {
    return len(cs.cells)
}

func (cs *CellSet) Swap(i, j int) {
    cs.cells[i], cs.cells[j] = cs.cells[j], cs.cells[i]
}

func (cs *CellSet) Less(i, j int) bool {
    return cs.cells[i] < cs.cells[j]
}

func (cs *CellSet) Add(c Cell) {
    for _, v := range cs.cells {
        if v == c {
            return
        }
    }
    cs.cells = append(cs.cells, c)
    sort.Sort(cs)
}

func (cs *CellSet) Min() Cell {
    if cs.Len() > 0 {
        return cs.cells[0]
    }
    return 0
}

func (cs *CellSet) Max() Cell {
    if l := cs.Len(); l > 0 {
        return cs.cells[l-1]
    }
    return ^Cell(0)
}
playground // this is a test file, copy it to set_test.go and run go test -bench=. -benchmem -v
BenchmarkSlice 20 75385089 ns/op 104 B/op 0 allocs/op
BenchmarkMap 20 77541424 ns/op 158 B/op 0 allocs/op
BenchmarkSliceAndMin 20 77155563 ns/op 104 B/op 0 allocs/op
BenchmarkMapAndMin 1 1827782378 ns/op 2976 B/op 8 allocs/op
