Is there a bug in handling slices with references in Go? - go

I'm trying to build a new list of structs that contains references to items that exist in another slice. It's easier to understand if you see it, so I've prepared a snippet that you can run.
I have a list (dummylist) of two points (Cartesian coordinates) that I want to parse to build a new list (mylist) with items having some features (in the example, X > 80). I've defined two points: {X:90.0, Y:50.0} and {X:20.0 , Y:30.0}. I expect that mylist will contain {X:90.0, Y:50.0}, instead at the end there is {X:20.0 , Y:30.0}. With some print here and there I can verify that the algorithm is working fine (it enters in the "if" condition in the right case), but, at the end, "mylist" contains the wrong element.
package main
import(
"fmt"
)
func main() {
type point struct {
X float64
Y float64
}
type pointsList []point
type pointContainer struct {
Point *point
}
type pointContainerList []pointContainer
// Prepare a slice with two elements
dummylist := new(pointsList)
*dummylist = append(*dummylist, point{X:90.0, Y:50.0})
*dummylist = append(*dummylist, point{X:20.0 , Y:30.0})
// My empty list
mylist := new(pointContainerList)
fmt.Println(fmt.Sprintf("---- At the beginning, mylist contains %d points", len(*mylist)))
// Filter the initial list to take only elements
for _, pt := range *dummylist {
fmt.Println("\n---- Evaluating point ", pt)
if pt.X > 80 {
fmt.Println("Appending", pt)
*mylist = append(*mylist, pointContainer{Point: &pt})
fmt.Println("Inserted point:", (*mylist)[0].Point, "len = ", len(*mylist))
}
}
// mylist should contain {X:90.0, Y:50.0}, instead...
fmt.Println(fmt.Sprintf("\n---- At the end, mylist contains %d points", len(*mylist)))
fmt.Println("Content of mylist:", (*mylist)[0].Point)
}
Here you can run the code:
https://play.golang.org/p/AvrC3JJBLdT
Some helpful consideration:
I've seen through multiple tests that, at the end, mylist contains the last parsed item in the loop. I think there is a problem with references. It's like if the inserted item in the list (in the first iteration) is dependent on the "pt" of other iterations. Instead, if I use indexes (for i, pt := range *dummylist and (*dummylist)[i]), everything works fine.
Before talking about bugs in Golang... am I missing something?

Yes, you're missing something. On this line:
*mylist = append(*mylist, pointContainer{Point: &pt})
you're putting the address of the loop variable &pt into your structure. As the loop continues, the value of pt changes. (Or to put it another way, &pt will be the same pointer for each iteration of the loop).
From the go language specification:
...
The iteration values are assigned to the respective iteration
variables as in an assignment statement.
The iteration variables may be declared by the "range" clause using a
form of short variable declaration (:=). In this case their types are
set to the types of the respective iteration values and their scope is
the block of the "for" statement; they are re-used in each iteration.
If the iteration variables are declared outside the "for" statement,
after execution their values will be those of the last iteration.
One solution would be to create a new value, but I'm not sure what you're gaining from so many pointers: []point would probably be more effective (and less error-prone) than a pointer to a slice of structs of pointers to points.

Related

What happens when I range over an uninitialized pointer to array in golang

I have this code
var j *[33]byte
for i := range j {
fmt.Println(j[i])
}
Now when I run this code I get nil pointer dereference error when I try access values in j. I'm not sure why I was even able to enter the loop in the first place considering my pointer is uninitialized.
I know an uninitialized array has all its values set to their zero value. That is
var a [5]int
Will have a default value of [0, 0, 0, 0, 0].
But I don't understand what golang does when you don't initialize a pointer to an array. Why is range able to range over it even though its nil?
From the Go spec Range Clause:
... For an array, pointer to array, or slice value a, the index
iteration values are produced in increasing order...
so as a convenience the Go language is dereferencing the pointer with the intent to iterating over its elements. The fact that the pointer is nil is a simple programming error. If this can occur, one should have a runtime check in place to guard against it.
Static analysis may be able to detect this type of bug ahead of time - but what if the variable j is accessible from another goroutine - how would the compiler know for sure that another goroutine may update it to a non-nil value right before the range loop is reached?
Go has a zero value defined for each type when you initialize a variable with var keyword (this may change when using :=, ideally used when need copies of values or specific values). In the case of the pointer the zero value is nil (also maps, interfaces, channels, slices, and functions) in case of array of type int the zero value is 0.
So, to answer your question, Go is able to iterate because you have 33 valid spaces idependently of what value is inside of that position. You can check the diference between slices and arrays on the Golang documentation to have more insights on why is that.

Pointer operation isn't changing its reference within a slice

I just started learning go. I have a question about pointers.
In the code below, the following line in the code doesn't do what I expect:
last_line.Next_line = &line // slice doesn't change
I want the slice to be changed as well, not only the local variable last_line.
What am I doing wrong?
type Line struct {
Text string
Prev_line *Line
Next_line *Line
}
var (
lines []Line
last_line *Line
)
for i, record := range records {
var prev_line *Line = nil
text := record[0]
if i > 0 {
prev_line = &lines[i-1]
}
line := Line{
Text: text,
Prev_line: prev_line,
Next_line: nil}
if last_line != nil {
last_line.Next_line = &line // slice doesn't change
}
lines = append(lines, line)
last_line = &line
}
Your Line type is a fairly standard-looking doubly linked list. Your lines variable holds a slice of these objects. Combining these two is a bit unusual—not wrong, to be sure, just unusual. And, as Matt Oestreich notes in a comment, we don't know quite what is in records (just that range can be used on it and that after doing so, we can use record[0] to get to a single string value), so there might be better ways to deal with things.
If records itself is a slice or has a sensible len, we can allocate a slice of Line instances all at once, of the appropriate size:
lines = make([]Line, len(records))
Here is a sample on the Go Playground that does it this way.
If we can't really get a suitable len—e.g., if records is a channel whose length is not really relevant—then we might indeed want to allocate individual lines, but in this case, it may be more sensible to avoid keeping them as a slice in the first place. The doubly linked list alone will suffice.
Finally, if you really do want both a slice and this doubly linked list, note that using append may copy the slice's elements to a new, larger slice. If and when it does so, the pointers in any elements you set up earlier will point into the old, smaller slice. This is not invalid in terms of the language itself—those objects still exist and your pointers are keeping them "alive"—but it may not be what you intended at all. In this case, it makes more sense to set all the pointers at the end, after building up the lines slice, just as in the sample code I provided.
(The sample I wrote is deliberately slightly weird in a way that is likely to get your homework or test grade knocked down a bit, if this was an attempt to cheat on homework or a test. :-) )

How to Define a Constant Value of a User-defined Type in Go?

I am implementing a bit-vector in Go:
// A bit vector uses a slice of unsigned integer values or “words,”
// each bit of which represents an element of the set.
// The set contains i if the ith bit is set.
// The following program demonstrates a simple bit vector type with these methods.
type IntSet struct {
words []uint64 //uint64 is important because we need control over number and value of bits
}
I have defined several methods (e.g. membership test, adding or removing elements, set operations like union, intersection etc.) on it which all have a pointer receiver. Here is one such method:
// Has returns true if the given integer is in the set, false otherwise
func (this *IntSet) Has(m int) bool {
// details omitted for brevity
}
Now, I need to return an empty set that is a true constant, so that I can use the same constant every time I need to refer to an IntSet that contains no elements. One way is to return something like &IntSet{}, but I see two disadvantages:
Every time an empty set is to be returned, a new value needs to be allocated.
The returned value is not really constant since it can be modified by the callers.
How do you define a null set that does not have these limitations?
If you read https://golang.org/ref/spec#Constants you see that constants are limited to basic types. A struct or a slice or array will not work as a constant.
I think that the best you can do is to make a function that returns a copy of an internal empty set. If callers modify it, that isn't something you can fix.
Actually modifying it would be difficult for them since the words inside the IntSet are lowercase and therefore private. If you added a value next to words like mut bool you could add a if mut check to every method that changes the IntSet. If it isn't mutable, return an error or panic.
With that, you could keep users from modifying constant, non-mutable IntSet values.

Go error: non-constant array bound

I'm trying to calculate the necessary length for an array in a merge sort implementation I'm writing in go. It looks like this:
func merge(array []int, start, middle, end int) {
leftLength := middle - start + 1
rightLength := end - middle
var left [leftLength]int
var right [rightLength]int
//...
}
I then get this complaint when running go test:
./mergesort.go:6: non-constant array bound leftLength
./mergesort.go:7: non-constant array bound rightLength
I assume go does not enjoy users instantiating an Array's length with a calculated value. It only accepts constants. Should I just give up and use a slice instead? I expect a slice is a dynamic array meaning it's either a linked list or copies into a larger array when it gets full.
You can't instantiate an array like that with a value calculated at runtime. Instead use make to initialize a slice with the desired length. It would look like this;
left := make([]int, leftLength)

Making maps in go before anything

I am following the go tour and something bothered me.
Maps must be created with make (not new) before use
Fair enough:
map = make(map[int]Cats)
However the very next slide shows something different:
var m = map[string]Vertex{
"Bell Labs": Vertex{
40.68433, -74.39967,
},
"Google": Vertex{
37.42202, -122.08408,
},
}
This slide shows how you can ignore make when creating maps
Why did the tour say maps have to be created with make before they can be used? Am I missing something here?
Actually the only reason to use make to create a map is to preallocate a specific number of values, just like with slices (except you can't set a cap on a map)
m := map[int]Cats{}
s := []Cats{}
//is the same as
m := make(map[int]Cats)
s := make([]Cats, 0, 0)
However if you know you will have a minimum of X amount of items in a map you can do something like:
m := make(map[int]Cats, 100)// this will speed things up initially
Also check http://dave.cheney.net/2014/08/17/go-has-both-make-and-new-functions-what-gives
So they're actually right that you always need to use make before using a map. The reason it looks like they aren't in the example you gave is that the make call happens implicitly. So, for example, the following two are equivalent:
m := make(map[int]string)
m[0] = "zero"
m[1] = "one"
// Equivalent to:
m := map[int]string{
0: "zero",
1: "one",
}
Make vs New
Now, the reason to use make vs new is slightly more subtle. The reason is that new only allocates space for a variable of the given type, whereas make actually initializes it.
To give you a sense of this distinction, imagine we had a binary tree type like this:
type Tree struct {
root *node
}
type node struct {
val int
left, right *node
}
Now you can imagine that if we had a Tree which was allocated and initialized and had some values in it, and we made a copy of that Tree value, the two values would point to the same underlying data since they'd both have the same value for root.
So what would happen if we just created a new Tree without initializing it? Something like t := new(Tree) or var t Tree? Well, t.root would be nil, so if we made a copy of t, both variables would not point to the same underlying data, and so if we added some elements to the Tree, we'd end up with two totally separate Trees.
The same is true of maps and slices (and some others) in Go. When you make a copy of a slice variable or a map variable, both the old and the new variables refer to the same underlying data, just like an array in Java or C. Thus, if you just use new, and then make a copy and initialize the underlying data later, you'll have two totally separate data structures, which is usually not what you want.

Resources