Why don't Go slices just switch the underlying array on reallocation?

A slice contains three components: length, capacity and a pointer to the underlying array.
When we try to append to a slice that is full (len(s) == cap(s)), a larger array will be allocated.
I read in a book that we have to assign the return value of append back to the slice, because a different slice may be returned due to reallocation of the underlying array.
runes = append(runes, r)
But I don't know why this is necessary. Can't we just reallocate a new array and update the pointer of the original slice instance?

All function arguments are passed by value in Go. A function cannot change the caller's value.
The slice (length, capacity, pointer) is passed by value to the append function. Because append cannot change the caller's slice, the append function returns the new slice.
The append function could be written to take a pointer to a slice, but that would make append awkward to use in the many situations where slice values are not addressable.
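A minimal sketch of both designs, for illustration only. The helpers grow and appendPtr are hypothetical names introduced here; appendPtr is not part of Go. It shows what a pointer-taking append might look like and where it stops being usable.

```go
package main

import "fmt"

// grow appends to its own copy of the slice header. The caller's header is
// unchanged unless the result is returned and assigned.
func grow(s []int) []int {
	return append(s, 4)
}

// appendPtr is a hypothetical pointer-taking variant of append, shown only
// to illustrate the alternative design.
func appendPtr(s *[]int, v int) {
	*s = append(*s, v)
}

func main() {
	s := []int{1, 2, 3} // len 3, cap 3: the next append must reallocate
	grow(s)             // result discarded, so s is unaffected
	fmt.Println(len(s)) // 3
	s = grow(s)
	fmt.Println(len(s)) // 4

	appendPtr(&s, 5) // works because the variable s is addressable
	fmt.Println(s)   // [1 2 3 4 5]

	m := map[string][]int{}
	// appendPtr(&m["k"], 1) would not compile: map elements are not addressable.
	m["k"] = append(m["k"], 1) // the value-returning form handles this case
	fmt.Println(m["k"])        // [1]
}
```

The map case is one of the non-addressable situations the answer refers to: with a pointer-based append you would need a temporary variable and a second assignment.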

Related

What happens when I range over an uninitialized pointer to array in golang

I have this code
var j *[33]byte
for i := range j {
fmt.Println(j[i])
}
Now when I run this code I get a nil pointer dereference error when I try to access values in j. I'm not sure why I was even able to enter the loop in the first place, considering my pointer is uninitialized.
I know an uninitialized array has all its values set to their zero value. That is
var a [5]int
Will have a default value of [0, 0, 0, 0, 0].
But I don't understand what Go does when you don't initialize a pointer to an array. Why is range able to range over it even though it's nil?
From the Go spec Range Clause:
... For an array, pointer to array, or slice value a, the index
iteration values are produced in increasing order...
So, as a convenience, Go lets you range over a pointer to an array as if it were the array itself, dereferencing the pointer to reach the elements. That the pointer is nil here is a simple programming error. If this can occur, put a runtime nil check in place to guard against it.
Static analysis may be able to detect this type of bug ahead of time, but what if the variable j is accessible from another goroutine? How could the compiler rule out another goroutine setting it to a non-nil value right before the range loop is reached?
Go defines a zero value for every type, and a variable declared with the var keyword and no initializer gets that value (with := you supply the initial value yourself). For a pointer the zero value is nil (the same holds for maps, interfaces, channels, slices, and functions); for the elements of an int array it is 0.
So, to answer your question: Go is able to iterate because the length, 33, is part of the type *[33]byte, so the loop runs over indices 0 through 32 whether or not the pointer is nil. The panic only happens when j[i] dereferences the nil pointer. You can check the difference between slices and arrays in the Go documentation for more insight into why that is.
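A small sketch of this behavior, using a 4-element array instead of 33 for brevity. The function names countIterations and sumOrZero are made up for this example.

```go
package main

import "fmt"

// countIterations ranges over p without touching its elements. Because
// len(p) is a constant taken from the type, this runs even when p is nil.
func countIterations(p *[4]byte) int {
	n := 0
	for range p {
		n++
	}
	return n
}

// sumOrZero reads the elements, so it checks for nil first to avoid the
// nil pointer dereference that p[i] would otherwise cause.
func sumOrZero(p *[4]byte) int {
	if p == nil {
		return 0
	}
	total := 0
	for i := range p {
		total += int(p[i])
	}
	return total
}

func main() {
	var j *[4]byte // nil pointer to array
	fmt.Println(len(j))             // 4
	fmt.Println(countIterations(j)) // 4
	fmt.Println(sumOrZero(j))       // 0

	a := [4]byte{1, 2, 3, 4}
	fmt.Println(sumOrZero(&a)) // 10
}
```

The nil check in sumOrZero is the runtime guard suggested above; without it, the first p[i] would panic.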

How is the itab struct actually having a list of function pointers?

Researching the interface value in go - I found a great (maybe outdated) article by Russ Cox.
According to it:
The itable begins with some metadata about the types involved and then becomes a list of function pointers.
The implementation for this itable should be the one from src/runtime/runtime2.go:
type itab struct {
inter *interfacetype
_type *_type
hash uint32 // copy of _type.hash. Used for type switches.
_ [4]byte
fun [1]uintptr // variable sized. fun[0]==0 means _type does not implement inter.
}
First confusing thing is - how is an array - variable sized?
Second, assuming that we have a function pointer at index 0 for a method that satisfies the interface, where could we store a second/third/... function pointer?
The compiled code and runtime access fun as if the field were declared fun [n]uintptr, where n is the number of methods in the interface. The second method is stored at fun[1], the third at fun[2], and so on. The Go language does not have a variable-size array feature like this, but unsafe shenanigans can be used to simulate it.
Here's how itab is allocated:
m = (*itab)(persistentalloc(unsafe.Sizeof(itab{})+uintptr(len(inter.mhdr)-1)*goarch.PtrSize, 0, &memstats.other_sys))
The function persistentalloc allocates memory. The first argument to the function is the size to allocate. The expression len(inter.mhdr) is the number of methods in the interface, so the size is one itab plus room for the remaining len(inter.mhdr)-1 pointers.
Here's code that creates a slice on the variable size array:
methods := (*[1 << 16]unsafe.Pointer)(unsafe.Pointer(&m.fun[0]))[:ni:ni]
The expression methods[i] refers to the same element as m.fun[i] in a hypothetical world where m.fun is a variable size array with length > i. Later code uses normal slice syntax with methods to access the variable size array m.fun.

How to Define a Constant Value of a User-defined Type in Go?

I am implementing a bit-vector in Go:
// A bit vector uses a slice of unsigned integer values or “words,”
// each bit of which represents an element of the set.
// The set contains i if the ith bit is set.
// The following program demonstrates a simple bit vector type with these methods.
type IntSet struct {
words []uint64 //uint64 is important because we need control over number and value of bits
}
I have defined several methods (e.g. membership test, adding or removing elements, set operations like union, intersection etc.) on it which all have a pointer receiver. Here is one such method:
// Has returns true if the given integer is in the set, false otherwise
func (this *IntSet) Has(m int) bool {
// details omitted for brevity
}
Now, I need to return an empty set that is a true constant, so that I can use the same constant every time I need to refer to an IntSet that contains no elements. One way is to return something like &IntSet{}, but I see two disadvantages:
Every time an empty set is to be returned, a new value needs to be allocated.
The returned value is not really constant since it can be modified by the callers.
How do you define a null set that does not have these limitations?
If you read https://golang.org/ref/spec#Constants you see that constants are limited to basic types. A struct or a slice or array will not work as a constant.
I think that the best you can do is to make a function that returns a copy of an internal empty set. If callers modify it, that isn't something you can fix.
Actually, modifying it would be difficult for them, since the words field inside IntSet is lowercase and therefore private to the package. If you added a field next to words, such as mut bool, you could add an if mut check to every method that changes the IntSet. If the set isn't mutable, return an error or panic.
With that, you could keep users from modifying constant, non-mutable IntSet values.
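A minimal sketch of that idea, under assumptions: the flag is named frozen here (the inverse of the mut flag suggested above), and Empty, Add, and errFrozen are names invented for this example. The bodies of Has and Add are one plausible implementation, not the asker's omitted code.

```go
package main

import (
	"errors"
	"fmt"
)

// IntSet is a bit vector; frozen is a hypothetical read-only flag.
type IntSet struct {
	words  []uint64
	frozen bool
}

var errFrozen = errors.New("intset: set is read-only")

// empty is a single shared, read-only empty set.
var empty = &IntSet{frozen: true}

// Empty returns the shared empty set without allocating a new value.
func Empty() *IntSet { return empty }

// Has reports whether m is in the set.
func (s *IntSet) Has(m int) bool {
	if m < 0 {
		return false
	}
	word, bit := m/64, uint(m%64)
	return word < len(s.words) && s.words[word]&(1<<bit) != 0
}

// Add inserts m, or returns an error if the set is frozen or m is negative.
func (s *IntSet) Add(m int) error {
	if s.frozen {
		return errFrozen
	}
	if m < 0 {
		return errors.New("intset: negative element")
	}
	word, bit := m/64, uint(m%64)
	for word >= len(s.words) {
		s.words = append(s.words, 0)
	}
	s.words[word] |= 1 << bit
	return nil
}

func main() {
	fmt.Println(Empty() == Empty()) // true: same shared value every time
	fmt.Println(Empty().Add(1))     // intset: set is read-only

	s := &IntSet{}
	fmt.Println(s.Add(70), s.Has(70), s.Has(69)) // <nil> true false
}
```

The protection only holds across package boundaries: code inside the same package can still flip frozen or write to words directly.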

Append a slice from a map value does not affect the map

mp := map[int][]int{}
slice := make([]int, 0, 1)
fmt.Printf("slice address:%p\n", slice)
mp[0] = slice
slice = append(slice, 1)
fmt.Println("after append")
fmt.Printf("slice address:%p\n", slice)
fmt.Println("slice:", slice)
fmt.Println("mp[0]:", mp[0])
fmt.Printf("mp[0] address:%p\n", mp[0])
output:
slice address:0xc042008f78
after append
slice address:0xc042008f78
slice: [1]
mp[0]: []
mp[0] address:0xc042008f78
The address of the slice does not change, as its cap is large enough during append. So why does the append not show up in the map value?
As explained in Go Slices: usage and internals, two slices may point to the same memory location, but may have different len and cap attributes.
The Go blog post Go Slices: usage and internals puts it this way:
Slicing does not copy the slice's data. It creates a new slice value
that points to the original array. This makes slice operations as
efficient as manipulating array indices. Therefore, modifying the
elements (not the slice itself) of a re-slice modifies the elements of
the original slice:
slice = append(slice, 1)
So in the above case append creates a new slice value that points to the same underlying array. That is why the same address is printed.
To read the underlying array that the slice points to, use reflect with unsafe:
hdr := (*reflect.SliceHeader)(unsafe.Pointer(&slice))
data := *(*[1]int)(unsafe.Pointer(hdr.Data))
This is caused by the fact that multiple slices can be backed by the same data but use different "sections" of it. So yes, an element is added to the data backing mp[0], but the length of the slice stored in mp is not changed. You can extend it manually:
fmt.Println(mp[0][:1])
which does print [1].
You can grow any slice to its capacity without changing the underlying data by using slice[:cap(slice)]. slice[:n] will panic if cap(slice) < n.
slice[n] on the other hand will panic when len(slice) <= n. I assume that the former is possible to allow the growing of slices without changing the underlying data (as far as that is possible). The latter, I would say, is "normal" behavior.
This also explains why mp[0][:2] panics, as cap(mp[0]) is 1.
For more details you might want to read this official blog post, as suggested by Flimzy.
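The behavior described in these answers can be reproduced with a short sketch. The helper name storeThenAppend is made up for this example; it mirrors the code from the question.

```go
package main

import "fmt"

// storeThenAppend stores a slice in a map and then appends to the local
// variable, returning both so their headers can be compared.
func storeThenAppend() (map[int][]int, []int) {
	mp := map[int][]int{}
	slice := make([]int, 0, 1)
	mp[0] = slice            // the map stores a copy of the header: len 0, cap 1
	slice = append(slice, 1) // writes into the shared array; only slice's len becomes 1
	return mp, slice
}

func main() {
	mp, slice := storeThenAppend()

	fmt.Println(len(slice), len(mp[0])) // 1 0
	fmt.Println(mp[0][:cap(mp[0])])     // [1]: the element is in the shared array

	mp[0] = slice      // store the updated header to make the element visible
	fmt.Println(mp[0]) // [1]
}
```

Assigning the result of append back into the map, as in the last step, is the usual way to keep the stored slice in sync.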

How to remove the last element from a slice?

I've seen people say just create a new slice by appending the old one
*slc = append((*slc)[:item], (*slc)[item+1:]...)
but what if you want to remove the last element in the slice?
If you try to replace i (the last element) with i+1, it returns an out of bounds error since there is no i+1.
You can use len() to find the length and re-slice using the index before the last element:
if len(slice) > 0 {
slice = slice[:len(slice)-1]
}
TL;DR:
myslice = myslice[:len(myslice) - 1]
This will panic if myslice is empty, because len(myslice) - 1 is then -1.
Longer answer:
Slices are data structures that point to an underlying array and operations like slicing a slice use the same underlying array.
That means that if you slice a slice, the new slice will still be pointing to the same data as the original slice.
By doing the above, the last element will still be in the array, but you won't be able to reference it anymore.
If you reslice the slice to its original length, you'll be able to reference the last element again.
If you have a really big slice and you also want to prune the underlying array to save memory, you probably want to use copy to create a new slice with a smaller underlying array and let the old big array get garbage collected.

Resources