The bytes.Buffer object has a Truncate(n int) method to discard all but the first n bytes.
I'd need the exact inverse of that - keeping the last n bytes.
I could do the following
b := buf.Bytes()
buf.Reset()
buf.Write(b[offset:])
but I'm not sure if this will re-use the slice efficiently.
Are there better options?
There are two alternatives:
The solution you give, which allows the first 'offset' bytes to be reused.
Create a bytes.NewBuffer(b[offset:]) and use that. This will not allow the first 'offset' bytes to be collected until you're done with the new buffer, but it avoids the cost of copying.
Let bytes.Buffer handle the buffer management. The internal grow method slides the data down. Use the Next method. For example,
package main
import (
"bytes"
"fmt"
)
func main() {
var buf bytes.Buffer
for i := 0; i < 8; i++ {
buf.WriteByte(byte(i))
}
fmt.Println(buf.Len(), buf.Bytes())
n := buf.Len() / 2
// Keep last n bytes.
if n > buf.Len() {
n = buf.Len()
}
buf.Next(buf.Len() - n)
fmt.Println(buf.Len(), buf.Bytes())
}
Output:
8 [0 1 2 3 4 5 6 7]
4 [4 5 6 7]
I reckon the problem with your idea is that "truncating the buffer from its start" is impossible simply because the memory allocator allocates memory in full chunks and there's no machinery in it to split an already allocated chunk into a set of "sub chunks" — essentially what you're asking for. So to support "trimming from the beginning" the implementation of bytes.Buffer would have to allocate a smaller buffer, move the "tail" there and then mark the original buffer for reuse.
This naturally leads us to another idea: use two (or more) buffers. They might either be allocated separately and treated as adjacent by your algorithms or you might use custom allocation: allocate one big slice and then reslice it twice or more times to produce several physically adjacent buffers, or slide one or more "window" slices over it. This means implementing a custom data structure of course…
Related
In Golang, we can use the builtin make() function to create a slice with a given initial length and capacity.
Consider the following lines, the slice's length is set to 1, and its capacity 3:
package main
import "fmt"
func main() {
var slice = make([]int, 1, 3)
slice[0] = 1
slice = append(slice, 6, 0, 2, 4, 3, 1)
fmt.Println(slice)
}
I was surprised to see that this program prints:
[1 6 0 2 4 3 1]
This got me wondering: what is the point of initially defining a slice's capacity if append() can simply blow past it? Are there performance gains for setting a sufficiently large capacity?
A slice is really just a fancy way to manage an underlying array. It automatically tracks size, and re-allocates new space as needed.
As you append to a slice, the runtime allocates a larger backing array each time the length would exceed the current capacity (roughly doubling for small slices; the exact growth factor is an implementation detail). It has to copy all of the elements to do that. If you know how big it will be before you start, you can avoid a few copy operations and memory allocations by grabbing it all up front.
When you make a slice providing capacity, you set the initial capacity, not any kind of limit.
See this blog post on slices for some interesting internal details of slices.
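To watch the reallocation happen, one option is to record every distinct capacity seen while appending. This is only a sketch: capacitySteps is a made-up helper, and the exact numbers it reports depend on the Go version and the element type, so no output is shown.

```go
package main

import "fmt"

// capacitySteps (hypothetical helper) appends n ints to an empty slice and
// records each distinct capacity observed along the way.
func capacitySteps(n int) []int {
	var s []int
	var steps []int
	last := -1
	for i := 0; i < n; i++ {
		s = append(s, i)
		if cap(s) != last {
			last = cap(s)
			steps = append(steps, last)
		}
	}
	return steps
}

func main() {
	// Each entry corresponds to one reallocation of the backing array.
	fmt.Println(capacitySteps(2000))
}
```

The list is far shorter than the number of appends, which is the whole point of growing geometrically instead of one element at a time.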
A slice is a wonderful abstraction of a simple array. You get all sorts of nice features, but deep down at its core, lies an array. (I explain the following in reverse order for a reason). Therefore, if/when you specify a capacity of 3, deep down, an array of length 3 is allocated in memory, which you can append up to without having it need to reallocate memory. This attribute is optional in the make command, but note that a slice will always have a capacity whether or not you choose to specify one. If you specify a length (which always exists as well), the slice will be indexable up to that length. The rest of the capacity is hidden away behind the scenes so it does not have to allocate an entirely new array when append is used.
Here is an example to better explain the mechanics.
s := make([]int, 1, 3)
The underlying array will be allocated with 3 of the zero value of int (which is 0):
[0,0,0]
However, the length is set to 1, so the slice itself will only print [0], and if you try to index the second or third value, it will panic, as the slice's mechanics do not allow it. If you s = append(s, 1) to it, you will find that it has actually been created to contain zero values up to the length, and you will end up with [0,1]. At this point, you can append once more before the entire underlying array is filled, and another append will force it to allocate a new one and copy all the values over with a doubled capacity. This is actually a rather expensive operation.
Therefore the short answer to your question is that preallocating the capacity can be used to vastly improve the efficiency of your code. Especially so if the slice is either going to end up very large, or contains complex structs (or both), as the zero value of a struct is effectively the zero values of every single one of its fields. This is not because it would avoid allocating those values, as it has to anyway, but because append would have to reallocate new arrays full of these zero values each time it would need to resize the underlying array.
Short playground example: https://play.golang.org/p/LGAYVlw-jr
As others have already said, using the cap parameter can avoid unnecessary allocations. To give a sense of the performance difference, imagine you have a []float64 of random values and want a new slice that filters out values that are not above, say, 0.5.
Naive approach - no len or cap param
func filter(input []float64) []float64 {
ret := make([]float64, 0)
for _, el := range input {
if el > .5 {
ret = append(ret, el)
}
}
return ret
}
Better approach - using cap param
func filterCap(input []float64) []float64 {
ret := make([]float64, 0, len(input))
for _, el := range input {
if el > .5 {
ret = append(ret, el)
}
}
return ret
}
Benchmarks (n=10)
filter 131 ns/op 56 B/op 3 allocs/op
filterCap 56 ns/op 80 B/op 1 allocs/op
Using cap made the program 2x+ faster and reduced the number of allocations from 3 to 1. Now what happens at scale?
Benchmarks (n=1,000,000)
filter 9630341 ns/op 23004421 B/op 37 allocs/op
filterCap 6906778 ns/op 8003584 B/op 1 allocs/op
The speed difference is still significant (~1.4x) thanks to 36 fewer calls to runtime.makeslice. However, the bigger difference is the memory allocation (~4x less).
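If you only want to confirm the allocation counts without running a full benchmark, a rough sketch using testing.AllocsPerRun from the standard library could look like this. The deterministic sampleInput helper and the package-level sink variable are assumptions added for the sketch (the answer above used random values), so the exact figures will differ from the table.

```go
package main

import (
	"fmt"
	"testing"
)

// sink keeps results reachable so the compiler cannot discard the work.
var sink []float64

func filter(input []float64) []float64 {
	ret := make([]float64, 0)
	for _, el := range input {
		if el > .5 {
			ret = append(ret, el)
		}
	}
	return ret
}

func filterCap(input []float64) []float64 {
	ret := make([]float64, 0, len(input))
	for _, el := range input {
		if el > .5 {
			ret = append(ret, el)
		}
	}
	return ret
}

// sampleInput (hypothetical helper) builds an input in which every second
// value passes the filter.
func sampleInput(n int) []float64 {
	in := make([]float64, n)
	for i := range in {
		if i%2 == 0 {
			in[i] = 0.75
		} else {
			in[i] = 0.25
		}
	}
	return in
}

func main() {
	in := sampleInput(1000)
	naive := testing.AllocsPerRun(100, func() { sink = filter(in) })
	withCap := testing.AllocsPerRun(100, func() { sink = filterCap(in) })
	fmt.Println("filter allocs/op:   ", naive)
	fmt.Println("filterCap allocs/op:", withCap)
}
```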
Even better - calibrating the cap
You may have noticed in the first benchmark that cap makes the overall memory allocation worse (80B vs 56B). This is because you allocate 10 slots but only need, on average, 5 of them. This is why you don't want to set cap unnecessarily high. Given what you know about your program, you may be able to calibrate the capacity. In this case, we can estimate that our filtered slice will need 50% as many slots as the original slice.
func filterCalibratedCap(input []float64) []float64 {
ret := make([]float64, 0, len(input)/2)
for _, el := range input {
if el > .5 {
ret = append(ret, el)
}
}
return ret
}
Unsurprisingly, this calibrated cap allocates 50% as much memory as its predecessor, so that's ~8x improvement on the naive implementation at 1m elements.
Another option - using direct access instead of append
If you are looking to shave even more time off a program like this, initialize with the len parameter (and ignore the cap parameter), access the new slice directly instead of using append, then throw away all the slots you don't need.
func filterLen(input []float64) []float64 {
ret := make([]float64, len(input))
var counter int
for _, el := range input {
if el > .5 {
ret[counter] = el
counter++
}
}
return ret[:counter]
}
This is ~10% faster than filterCap at scale. However, in addition to being more complicated, this pattern does not provide the same safety as cap if you try and calibrate the memory requirement.
With cap calibration, if you underestimate the total capacity required, then the program will automatically allocate more when it needs it.
With this approach, if you underestimate the total len required, the program will fail. In this example, if you initialize as ret := make([]float64, len(input)/2), and it turns out that len(output) > len(input)/2, then at some point the program will try to access a non-existent slot and panic.
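The failure mode can be made visible with a variant that deliberately under-allocates. This is a sketch under stated assumptions: filterHalfLen is a made-up name, and the recover is there only so the demonstration keeps running instead of crashing.

```go
package main

import "fmt"

// filterHalfLen (hypothetical) allocates only len(input)/2 slots and writes
// by index. The deferred recover turns the out-of-range panic into an error.
func filterHalfLen(input []float64) (out []float64, err error) {
	defer func() {
		if r := recover(); r != nil {
			out, err = nil, fmt.Errorf("recovered: %v", r)
		}
	}()
	ret := make([]float64, len(input)/2)
	var counter int
	for _, el := range input {
		if el > .5 {
			ret[counter] = el // panics once counter reaches len(ret)
			counter++
		}
	}
	return ret[:counter], nil
}

func main() {
	// One match, two slots: the estimate was sufficient.
	out, err := filterHalfLen([]float64{.9, .1, .2, .3})
	fmt.Println(out, err) // [0.9] <nil>

	// Four matches, two slots: the third write is out of range.
	out, err = filterHalfLen([]float64{.9, .8, .7, .6})
	fmt.Println(out, err)
}
```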
Each time you add an item to a slice that has len(mySlice) == cap(mySlice), the underlying data structure is replaced with a larger structure.
fmt.Printf("Original Capacity: %v", cap(mySlice)) // Output: 8
mySlice = append(mySlice, myNewItem)
fmt.Printf("New Capacity: %v", cap(mySlice)) // Output: 16
Here, mySlice is replaced (through the assignment operator) with a new slice containing all the elements of the original mySlice, plus myNewItem, plus some room (capacity) to grow without triggering this resize.
As you can imagine, this resizing operation is computationally non-trivial.
Quite often, all the resize operations can be avoided if you know how many items you will need to store in mySlice. If you have this foreknowledge, you can set the capacity of the original slice upfront and avoid all the resize operations.
(In practice, it's quite often possible to know how many items will be added to a collection; especially when transforming data from one format to another.)
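A format conversion is a typical case where the final size is known in advance. The following is a minimal sketch; toStrings is a name invented for the example.

```go
package main

import (
	"fmt"
	"strconv"
)

// toStrings (hypothetical helper) converts ints to decimal strings. The
// output has exactly len(in) elements, so the capacity is reserved once and
// append never has to resize.
func toStrings(in []int) []string {
	out := make([]string, 0, len(in))
	for _, v := range in {
		out = append(out, strconv.Itoa(v))
	}
	return out
}

func main() {
	out := toStrings([]int{1, 2, 3})
	fmt.Println(out, len(out), cap(out)) // [1 2 3] 3 3
}
```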
I'm wondering about best practices when initializing empty arrays.
i.e. Is there any difference here between arr1, arr2, and arr3?
myArr1 := []int{}
myArr2 := make([]int,0)
var myArr3 []int
I know that they make empty []int but I wonder, is one syntax preferable to the others? Personally I find the first to be most readable but that's beside the point here. One key point of contention may be the array capacity, presumably the default capacity is the same between the three as it is unspecified. Is declaring arrays of unspecified capacity "bad"? I can assume it comes with some performance cost but how "bad" is it really?
/tldr:
Is there any difference between the 3 ways to make an empty array?
What is the default capacity of an array when unspecified?
What is the performance cost of using arrays with unspecified capacity?
First, it's a slice not an array. Arrays and slices in Go are very different, arrays have a fixed size that is part of the type. I had trouble with this at first too :)
Not really. Any of the three is correct, and any difference should be too small to worry about. In my own code I generally use whatever is easiest in a particular case.
0
Nothing, until you need to add an item, then whatever it costs to allocate the storage needed.
What is the performance cost of using arrays with unspecified capacity?
There is certainly a cost when you start populating the slice. If you know how big the slice should grow, you can allocate the capacity of the underlying array from the very beginning, as opposed to reallocating every time the underlying array fills up.
Here is a simple example with timing:
package main
import "fmt"
func main() {
limit := 500 * 1000 * 1000
mySlice := make([]int, 0, limit) //vs mySlice := make([]int, 0)
for i := 0; i < limit; i++ {
mySlice = append(mySlice, i)
}
fmt.Println(len(mySlice))
}
On my machine:
time go run my_file.go
With preallocation:
real 0m2.129s
user 0m2.073s
sys 0m1.357s
Without preallocation
real 0m7.673s
user 0m9.095s
sys 0m3.462s
Is there any difference between the 3 ways to make an empty array?
If "empty array" means len(array) == 0, the answer is no; however, only myArr3 == nil is true.
What is the default capacity of an array when unspecified?
The default capacity is the same as the length you specify.
What is the performance cost of using arrays with unspecified capacity?
none
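The points about length, capacity and nil can be verified directly. In this sketch, describe is a helper made up for the illustration.

```go
package main

import "fmt"

// describe (hypothetical helper) summarizes a slice's length, capacity and
// whether it is nil.
func describe(s []int) string {
	return fmt.Sprintf("len=%d cap=%d nil=%t", len(s), cap(s), s == nil)
}

func main() {
	myArr1 := []int{}
	myArr2 := make([]int, 0)
	var myArr3 []int

	fmt.Println(describe(myArr1)) // len=0 cap=0 nil=false
	fmt.Println(describe(myArr2)) // len=0 cap=0 nil=false
	fmt.Println(describe(myArr3)) // len=0 cap=0 nil=true

	// append works on the nil slice just as it does on the other two.
	myArr3 = append(myArr3, 1)
	fmt.Println(myArr3) // [1]
}
```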
What is the difference between:
x := make([]int, 5, 10)
x := make([]int, 5)
x := [5]int{}
I know that make allocates an array and returns a slice that refers to that array. I don't understand where it can be used?
I can't find a good example that will clarify the situation.
x := make([]int, 5) Makes slice of int with length 5 and capacity 5 (same as length).
x := make([]int, 5, 10) Makes slice of int with length 5 and capacity 10.
x := [5]int{} Makes array of int with length 5.
Slices
If you append more items than the slice's capacity allows, the Go runtime allocates a new underlying array and copies the existing one into it. So if you know the estimated length of your slice, it is better to declare the capacity explicitly. It will consume more memory for the underlying array at the beginning, but it saves the CPU time otherwise spent on many allocations and array copies.
You can explore how len and cap change during append with a simple test on the Go playground.
Every time the cap value changes, a new array has been allocated.
Arrays
Array size is fixed, so if you need to grow an array you have to create a new one with the new length and copy your old array into it yourself.
There are some great articles about slices and arrays in go:
http://blog.golang.org/go-slices-usage-and-internals
http://blog.golang.org/slices
The make([]int, 5, 10) form allocates memory for 10 ints at the very beginning but returns a slice of 5 ints. It does not use less memory; rather, it saves you another memory allocation if you later need to grow the slice to at most 10 elements.
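Besides length and capacity, a practical difference is that arrays are values while slices refer to shared storage. The sketch below illustrates this; setFirstArray and setFirstSlice are hypothetical helpers written for the example.

```go
package main

import "fmt"

// setFirstArray receives a copy of the array, so the caller's array is
// left unchanged.
func setFirstArray(arr [5]int, v int) { arr[0] = v }

// setFirstSlice receives a slice value that points at the caller's backing
// array, so the write is visible to the caller.
func setFirstSlice(s []int, v int) { s[0] = v }

func main() {
	a := make([]int, 5, 10)
	b := make([]int, 5)
	c := [5]int{}

	fmt.Println(len(a), cap(a)) // 5 10
	fmt.Println(len(b), cap(b)) // 5 5
	fmt.Println(len(c), cap(c)) // 5 5

	setFirstArray(c, 42)
	setFirstSlice(a, 42)
	fmt.Println(c[0], a[0]) // 0 42
}
```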
I have an array and a slice pointing to it, like shown as follows:
package main
import "fmt"
func main() {
array_str := []string{"0a","1b","2c","3d","4e"}
slice_str:=array_str[1:4]
fmt.Println("Initially :")
fmt.Println("Printing 1 :Array :",array_str)
fmt.Println("Printing 1 :Slice:",slice_str)
//Step 1.Changing Slice and it get reflected in array
fmt.Println("\nAfter Alteration:")
slice_str[0]="alterd_1b"
fmt.Println("Printing 2 :Array :",array_str)
fmt.Println("Printing 2 :Slice:",slice_str)
fmt.Println("len of slice_str:",len(slice_str)," cap of slice_str:",cap(slice_str),"len of array_str:",len(array_str))
//Step 2.appending to slice and it get reflected
slice_str = append(slice_str,"apnded_elemnt")
fmt.Println("\nAfter Apending:")
fmt.Println("Printing 3 :Array :",array_str)//"4e" is replaced with "apnded_elemnt" in array !!
fmt.Println("Printing 3 :Slice:",slice_str)
fmt.Println("len of slice_str:",len(slice_str)," cap of slice_str:",cap(slice_str),"len of array_str:",len(array_str))
//Step 3.Appending again so that the slice outgrows the underlying array
slice_str = append(slice_str,"outgrown_elemnt")
fmt.Println("\nAfter OUT GROWING:")
fmt.Println("Printing 4 :Array :",array_str)//obviously the outgrown element is not added to the array, which is fine
fmt.Println("Printing 4 :Slice:",slice_str)
fmt.Println("len of slice_str:",len(slice_str)," cap of slice_str:",cap(slice_str),"len of array_str:",len(array_str))
//How Capacity Become 8 here ???
//Step 4 .Now Changing Slice element which is in Range of array to verify it reflect on array:
fmt.Println("\nAfter Post out grown Alteration:")
slice_str[0]="again_alterd_1b"
fmt.Println("Printing 2 :Array :",array_str)//Change in slice is not reflectd in array .Why ?
fmt.Println("Printing 2 :Slice:",slice_str)
}
Playground: http://play.golang.org/p/3z52HXHQ7s
Questions:
In Step 3: why does the cap of the slice jump from 4 to 8?
In Step 4: after the slice has outgrown the array, changes to an element of the slice that is within the range of the array are not reflected in the array, and vice versa. Why does this stop happening once the slice has outgrown the array? What actually happens when the slice grows beyond it?
See here: http://blog.golang.org/slices
Short answers: 1) it grows by doubling (while short). If you append once you might append a second time too and this avoids allocations. 2) That's how slice growing works. An array cannot grow, so a new, larger array is allocated, the old one copied and you are handed a slice pointing to the larger copy.
(The documentation on the golang.org website is really helpful, readable, short and precise. I'd like to recommend to look at golang.org first before asking here.)
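The moment at which the slice detaches from its array can be observed by comparing element addresses. This is a sketch with two assumptions: it uses a real [5]string array (the question's array_str is itself a slice), and sharesElement is a helper invented for the illustration.

```go
package main

import "fmt"

// sharesElement (hypothetical helper) reports whether s[0] and arr[i] are
// the same memory location.
func sharesElement(s []string, arr *[5]string, i int) bool {
	return len(s) > 0 && &s[0] == &arr[i]
}

func main() {
	arr := [5]string{"0a", "1b", "2c", "3d", "4e"}
	s := arr[1:4] // len 3, cap 4: a view of arr[1] through arr[4]

	s = append(s, "x") // fits in the capacity, so it overwrites arr[4]
	fmt.Println(arr[4], cap(s), sharesElement(s, &arr, 1)) // x 4 true

	s = append(s, "y") // capacity exceeded: s now points at a new array
	s[0] = "changed"
	fmt.Println(arr[1], s[0], sharesElement(s, &arr, 1)) // 1b changed false
}
```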
The capacity is multiplied by 2 because it is cheaper overall. Memory allocation is expensive, so it is better to allocate a lot of memory once than to allocate exactly what is needed every time.
Let's just compare, first with a simple example using the concatenation: every time, Go allocates just what is needed.
var n = time.Now()
var s = ""
for i := 0; i < 1000000; i++ {
s += "a"
}
fmt.Println(time.Now().Sub(n))
// 47.7s
Now, let's do the same but this time using the bytes.Buffer type:
var n = time.Now()
var b = bytes.NewBufferString("")
for i := 0; i < 1000000; i++ {
b.WriteString("a")
}
fmt.Println(time.Now().Sub(n))
// 18.5ms
The difference is the way Buffer allocates memory: when there is not enough capacity, it allocates twice the current capacity:
buf = makeSlice(2*cap(b.buf) + n)
This works the same with slices (I was just not able to find the source code to prove it...). So yes, you may be losing some space, but this is for a much better efficiency!
Your second question is a bit more tricky for me, so I hope @Volker's answer will be clear enough for you!
I've been trying out Go for some time and this question keeps bugging me. Say I build up a somewhat large dataset in a slice (say, 10 million int64s).
package main
import (
"math"
"fmt"
)
func main() {
var a []int64
var i int64
upto := int64(math.Pow10(7))
for i = 0; i < upto; i++ {
a = append(a, i)
}
fmt.Println(cap(a))
}
But then I decide I don't want most of them so I want to end up with a slice of just 10 of those. I've tried both slicing and delete techniques on Go's wiki but none of them seem to reduce the slice's capacity.
So that's my question: does Go have no real way of shrinking the capacity of a slice, similar to calling realloc() in C with a smaller size argument than in your previous call on the same pointer? Is that an issue and how should one deal with it?
To perform, in effect, a realloc of a slice:
a = append([]T(nil), a[:newSize]...) // Thanks to @Dijkstra for pointing out the missing ellipsis.
Whether this copies newSize elements to a new memory location or does an actual in-place resize, as in realloc(3), is at the complete discretion of the compiler. You might want to investigate the current state and perhaps raise an issue if there is room for improvement.
However, this is likely a micro-optimization. The first source of performance enhancements lies almost always in selecting a better algorithm and/or a better data structure. Using a hugely sized vector to finally keep only a few items is probably not the best option with respect to memory consumption.
EDIT: The above is only partially correct. The compiler cannot, in the general case, derive whether there are other pointers to the slice's backing array. Thus the realloc is not applicable. The above snippet is actually guaranteed to perform a copy of 'newSize' elements. Sorry for any confusion possibly created.
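Wrapped in a function, the copying idiom above might look like the following sketch. The generic helper shrink is an assumption made for the example (it needs Go 1.18 or later for type parameters), not something the language provides.

```go
package main

import "fmt"

// shrink (hypothetical helper) copies the first n elements of a into a
// freshly allocated slice. Once nothing else refers to the old backing
// array, the garbage collector is free to reclaim it.
func shrink[T any](a []T, n int) []T {
	if n < 0 {
		n = 0
	}
	if n > len(a) {
		n = len(a)
	}
	return append([]T(nil), a[:n]...)
}

func main() {
	var a []int64
	for i := int64(0); i < 1_000_000; i++ {
		a = append(a, i)
	}
	b := shrink(a, 10)
	fmt.Println(len(b), cap(b) < cap(a)) // 10 true
}
```

The new capacity is whatever append chooses for 10 elements, so it is small but not guaranteed to be exactly 10.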
Go does not have a way of shrinking slices. This isn't a problem in most cases, but if you profile your memory use and find you're using too much, you can do something about it:
Firstly, you can just create a slice of the size you need and copy your data into it, for example with the copy built-in. The garbage collector will then free the large slice.
Secondly, you could re-use the big slice each time you wish to generate it, so you never allocate it more than once.
On a final note, you can use 1e7 instead of math.Pow10(7).
Let's see this example:
func main() {
s := []string{"A", "B", "C", "D", "E", "F", "G", "H"}
fmt.Println(s, len(s), cap(s)) // slice, length, capacity
t := s[2:4]
fmt.Println(t, len(t), cap(t))
u := make([]string, len(t))
copy(u, t)
fmt.Println(u, len(u), cap(u))
}
It produces the following output:
[A B C D E F G H] 8 8
[C D] 2 6
[C D] 2 2
s is a slice that holds 8 strings. t is a slice that keeps the part [C D]. The length of t is 2, but since it uses the same hidden array as s, its capacity is 6 (from "C" to "H"). The question is: how to have a slice of [C D] that is independent from the hidden array of s? Simply create a new slice of strings with length 2 (slice u) and copy the content of t to u. u's underlying hidden array is different from the hidden array of s.
The initial problem was this: you have a big slice and you create a new smaller slice on it. Since the smaller slice uses the same hidden array, the garbage collector won't delete the hidden array.
See the bottom of this post for more info: http://blog.golang.org/go-slices-usage-and-internals .
Additionally, you can re-use most of the allocated memory while your app is running; take a look at the bufs package.
PS: if you re-allocate new memory for a smaller slice, the old memory may not be freed at the same time; it will be freed when the garbage collector decides to.
You can do that by re-assigning the slice's value to a portion of itself
a := []int{1,2,3}
fmt.Println(len(a), a) // 3 [1 2 3]
a = a[:len(a)-1]
fmt.Println(len(a), a) //2 [1 2]
There is a new feature called 3-index slice in Go 1.2, which means to get part of a slice in this way:
slice[a:b:c]
Here the len of the returned slice would be b-a, and the cap of the new slice would be c-a.
Tip: no copy is done in the whole process; it only returns a new slice which points to &slice[a] and has a len of b-a and a cap of c-a.
And that's the only thing you have to do:
slice = slice[0:len(slice):len(slice)]
Then the cap of the slice is changed to len(slice) - 0, which is the same as its len, and no copy is done.
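A short sketch of the effect of the three-index form on later appends follows; clipCap is a helper name made up for the example.

```go
package main

import "fmt"

// clipCap (hypothetical helper) returns s with its capacity reduced to its
// length, using a three-index slice expression. No elements are copied.
func clipCap(s []int) []int {
	return s[:len(s):len(s)]
}

func main() {
	base := []int{1, 2, 3, 4, 5}
	head := base[:2]         // len 2, cap 5
	clipped := clipCap(head) // len 2, cap 2, same backing array

	fmt.Println(len(head), cap(head), len(clipped), cap(clipped)) // 2 5 2 2

	// Appending to the clipped slice has to allocate, so base is untouched.
	clipped = append(clipped, 99)
	fmt.Println(base) // [1 2 3 4 5]

	// Appending to the unclipped slice writes into base.
	head = append(head, 99)
	fmt.Println(base) // [1 2 99 4 5]
}
```

Note that this only limits what append may overwrite. The slice still refers to the original backing array, so the expression does not by itself release any memory.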