Go error: non-constant array bound

I'm trying to calculate the necessary length for an array in a merge sort implementation I'm writing in go. It looks like this:
func merge(array []int, start, middle, end int) {
    leftLength := middle - start + 1
    rightLength := end - middle
    var left [leftLength]int
    var right [rightLength]int
    //...
}
I then get this complaint when running go test:
./mergesort.go:6: non-constant array bound leftLength
./mergesort.go:7: non-constant array bound rightLength
I assume Go does not enjoy users instantiating an array's length with a calculated value; it only accepts constants. Should I just give up and use a slice instead? I expect a slice is a dynamic array, meaning it's either a linked list or it copies into a larger array when it gets full.

You can't instantiate an array like that with a value calculated at runtime. Instead, use make to initialize a slice with the desired length. It would look like this:
left := make([]int, leftLength)
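For context, here is a minimal sketch of how the whole merge step might look once both temporary buffers are slices created with make; the index convention (middle belongs to the left half) is taken from the question, and the surrounding mergeSort driver is assumed rather than shown. A slice, by the way, is neither a linked list nor a self-growing array on its own: it's a small header (pointer, length, capacity) over a backing array, and append takes care of growing that array when needed.

// Sketch only: merges the sorted runs array[start..middle] and array[middle+1..end].
func merge(array []int, start, middle, end int) {
    leftLength := middle - start + 1
    rightLength := end - middle

    left := make([]int, leftLength)
    right := make([]int, rightLength)
    copy(left, array[start:middle+1])
    copy(right, array[middle+1:end+1])

    i, j, k := 0, 0, start
    for i < leftLength && j < rightLength {
        if left[i] <= right[j] {
            array[k] = left[i]
            i++
        } else {
            array[k] = right[j]
            j++
        }
        k++
    }
    // Copy whatever remains of either run.
    for i < leftLength {
        array[k] = left[i]
        i++
        k++
    }
    for j < rightLength {
        array[k] = right[j]
        j++
        k++
    }
}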

Related

What happens when I range over an uninitialized pointer to array in golang

I have this code
var j *[33]byte
for i := range j {
    fmt.Println(j[i])
}
Now when I run this code I get a nil pointer dereference error when I try to access values in j. I'm not sure why I was even able to enter the loop in the first place, considering my pointer is uninitialized.
I know an uninitialized array has all its values set to their zero value. That is
var a [5]int
Will have a default value of [0, 0, 0, 0, 0].
But I don't understand what Go does when you don't initialize a pointer to an array. Why is range able to range over it even though it's nil?
From the Go spec Range Clause:
... For an array, pointer to array, or slice value a, the index
iteration values are produced in increasing order...
so, as a convenience, the Go language dereferences the pointer with the intent of iterating over its elements. The fact that the pointer is nil is a simple programming error; if this can occur, one should have a runtime check in place to guard against it.
Static analysis may be able to detect this type of bug ahead of time, but what if the variable j is accessible from another goroutine? How would the compiler know for sure whether another goroutine might set it to a non-nil value right before the range loop is reached?
Go has a zero value defined for each type, and a variable declared with the var keyword starts out with that value (this changes when you use :=, which is ideally used when you need copies of values or specific values). In the case of a pointer the zero value is nil (as it is for maps, interfaces, channels, slices, and functions); in the case of an array of int, the zero value of each element is 0.
So, to answer your question, Go is able to iterate because the type gives you 33 valid positions, independently of what value is inside each position. You can check the difference between slices and arrays in the Go documentation for more insight into why that is.
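As a minimal sketch consistent with the spec passage quoted above (assuming a current Go toolchain): the length comes from the type *[33]byte, so ranging with only the index variable never dereferences the pointer; it's the element access that panics.

package main

import "fmt"

func main() {
    var j *[33]byte // nil pointer to array

    // len is taken from the type *[33]byte, so j is never dereferenced here.
    fmt.Println(len(j)) // 33

    // With only the index variable, range never indexes into the array,
    // so this loop runs 33 times even though j is nil.
    for i := range j {
        fmt.Println(i)
    }

    // It's the element access that dereferences the nil pointer and panics:
    // fmt.Println(j[0]) // panic: runtime error: invalid memory address or nil pointer dereference
}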

Is there a bug in handling slices with references in Go?

I'm trying to build a new list of structs that contains references to items that exist in another slice. It's easier to understand if you see it, so I've prepared a snippet that you can run.
I have a list (dummylist) of two points (Cartesian coordinates) that I want to parse to build a new list (mylist) with items having some feature (in the example, X > 80). I've defined two points: {X:90.0, Y:50.0} and {X:20.0, Y:30.0}. I expect mylist to contain {X:90.0, Y:50.0}; instead, at the end, it contains {X:20.0, Y:30.0}. With some prints here and there I can verify that the algorithm is working fine (it enters the "if" branch in the right case), but, at the end, mylist contains the wrong element.
package main

import (
    "fmt"
)

func main() {
    type point struct {
        X float64
        Y float64
    }
    type pointsList []point

    type pointContainer struct {
        Point *point
    }
    type pointContainerList []pointContainer

    // Prepare a slice with two elements
    dummylist := new(pointsList)
    *dummylist = append(*dummylist, point{X: 90.0, Y: 50.0})
    *dummylist = append(*dummylist, point{X: 20.0, Y: 30.0})

    // My empty list
    mylist := new(pointContainerList)
    fmt.Println(fmt.Sprintf("---- At the beginning, mylist contains %d points", len(*mylist)))

    // Filter the initial list to take only elements
    for _, pt := range *dummylist {
        fmt.Println("\n---- Evaluating point ", pt)
        if pt.X > 80 {
            fmt.Println("Appending", pt)
            *mylist = append(*mylist, pointContainer{Point: &pt})
            fmt.Println("Inserted point:", (*mylist)[0].Point, "len = ", len(*mylist))
        }
    }

    // mylist should contain {X:90.0, Y:50.0}, instead...
    fmt.Println(fmt.Sprintf("\n---- At the end, mylist contains %d points", len(*mylist)))
    fmt.Println("Content of mylist:", (*mylist)[0].Point)
}
Here you can run the code:
https://play.golang.org/p/AvrC3JJBLdT
Some helpful consideration:
I've seen through multiple tests that, at the end, mylist contains the last item parsed in the loop. I think there is a problem with references: it's as if the item inserted into the list in the first iteration depended on the "pt" of later iterations. Instead, if I use indexes (for i, pt := range *dummylist and then (*dummylist)[i]), everything works fine.
Before talking about bugs in Golang... am I missing something?
Yes, you're missing something. On this line:
*mylist = append(*mylist, pointContainer{Point: &pt})
you're putting the address of the loop variable &pt into your structure. As the loop continues, the value of pt changes. (Or to put it another way, &pt will be the same pointer for each iteration of the loop).
From the go language specification:
...
The iteration values are assigned to the respective iteration
variables as in an assignment statement.
The iteration variables may be declared by the "range" clause using a
form of short variable declaration (:=). In this case their types are
set to the types of the respective iteration values and their scope is
the block of the "for" statement; they are re-used in each iteration.
If the iteration variables are declared outside the "for" statement,
after execution their values will be those of the last iteration.
One solution would be to create a new value, but I'm not sure what you're gaining from so many pointers: []point would probably be more effective (and less error-prone) than a pointer to a slice of structs of pointers to points.
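For illustration, a minimal sketch of the usual fixes, reusing the point, pointContainer, dummylist, and mylist declarations from the snippet above. (Note that as of Go 1.22 the loop variable is scoped per iteration, so on current toolchains the original code already behaves as the asker expected; on older versions you need something like one of the following.)

// Fix 1: copy the loop variable before taking its address (pre-Go 1.22 semantics).
for _, pt := range *dummylist {
    if pt.X > 80 {
        pt := pt // copy the iteration value; &pt now points at this copy
        *mylist = append(*mylist, pointContainer{Point: &pt})
    }
}

// Fix 2: avoid pointers altogether, as suggested above.
var filtered []point
for _, pt := range *dummylist {
    if pt.X > 80 {
        filtered = append(filtered, pt) // the value is copied into the slice
    }
}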

Is there a way to delete first element from map?

Can I delete the first element in a map? It is possible with slices, slice = append(slice[:0], slice[1:]...), but can I do something like this with maps?
Maps being hashtables don't have a specified order, so there's no way to delete keys in a defined order, unless you track keys in a separate slice, in the order you're adding them, something like:
var ErrMapEmpty = errors.New("map is empty") // needs the "errors" and "sync" imports

type orderedMap struct {
    data map[string]int
    keys []string
    mu   *sync.RWMutex
}

func (o *orderedMap) Shift() (int, error) {
    o.mu.Lock()
    defer o.mu.Unlock()
    if len(o.keys) == 0 {
        return 0, ErrMapEmpty
    }
    i := o.data[o.keys[0]]
    delete(o.data, o.keys[0])
    o.keys = o.keys[1:]
    return i, nil
}
Just to be unequivocal about why you can't really delete the "first" element from a map, let me reference the spec:
A map is an unordered group of elements of one type, called the element type, indexed by a set of unique keys of another type, called the key type. The value of an uninitialized map is nil.
(Emphasis mine, on the fact that map items are unordered.)
Using a slice to preserve some notion of the order of keys is, fundamentally, flawed, though. Given operations like this:
foo := map[string]int{
    "foo": 1,
    "bar": 2,
}

// a bit later:
foo["foo"] = 3
Is the index/key foo now updated, or reassigned? Should it be treated as a new entry, appended to the slice of keys, or is it an in-place update? Things get muddled really quickly. The simple fact of the matter is that the map type doesn't contain an "order" of things; trying to make it have an order quickly devolves into a labour-intensive task where you'll end up writing your own type.
As I said earlier: it's a hashtable. Elements within get reshuffled behind the scenes if the hashing algorithm used for the keys produces collisions, for example. This question has the feel of an X-Y problem: why do you need the values in the map to be ordered? Maybe a map simply isn't the right approach for your particular problem.
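For completeness, a minimal usage sketch of the orderedMap type above; Set is a hypothetical helper that is not part of the original snippet, and it assumes the "errors", "fmt", and "sync" imports plus the declarations shown earlier.

// Set is a hypothetical helper that records insertion order alongside the map.
func (o *orderedMap) Set(key string, value int) {
    o.mu.Lock()
    defer o.mu.Unlock()
    if _, exists := o.data[key]; !exists {
        o.keys = append(o.keys, key) // only track the key the first time it is added
    }
    o.data[key] = value
}

func main() {
    om := &orderedMap{
        data: map[string]int{},
        mu:   &sync.RWMutex{},
    }
    om.Set("foo", 1)
    om.Set("bar", 2)

    v, err := om.Shift() // removes "foo", the first key that was inserted
    fmt.Println(v, err)  // 1 <nil>
}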

Randomly selecting elements from slices produced by a map in restricted key range in golang. Is there an O(1) shortcut?

In my program to simulate many-particle evolution, I have a map that takes a key pop (the population size) and returns a slice containing the sites that have this population (myMap map[int][]int). These slices are generically quite large.
At each evolution step I choose a random population size RandomPop. I would then like to randomly choose a site that has a population of at least RandomPop. The chosen site (sitechosen) is used to update my population structures, and I utilize a second map to efficiently update the keys of myMap. My current (slow) implementation looks like this:
func Evolve( ..., myMap map[int][]int, ...) {
    RandomPop := rand.Intn(rangeofpopulation) + 1
    for i := RandomPop; i < rangeofpopulation; i++ {
        preallocatedslice = append(preallocatedslice, myMap[i]...)
    }
    randomindex := rand.Intn(len(preallocatedslice))
    sitechosen := preallocatedslice[randomindex]
    UpdateFunction(sitechosen)
    // reset preallocated slice
    preallocatedslice = preallocatedslice[0:0]
}
This code (obviously) hits a huge bottleneck when copying values from the map to preallocatedslice, with runtime.memmove eating 87% of my CPU usage. I'm wondering if there is an O(1) way to randomly choose an entry contained in the union of the slices indicated by myMap for key values between RandomPop and rangeofpopulation? I am open to packages that allow you to manipulate custom hashtables if anyone is aware of them. Suggestions don't need to be safe for concurrency.
Other things tried: I previously had my maps record all sites with values of at least pop but that took up >10GB of memory and was stupid. I tried stashing pointers to the relevant slices to make a look-up slice, but go forbids this. I could sum up the lengths of each slice and generate a random number based on this and then iterate through the slices in myMap by length, but this is going to be much slower than just keeping an updated cdf of my population and doing a binary search on it. The binary search is fast, but updating the cdf, even if done manually, is O(n). I was really hoping to abuse hashtables to speed up random selection and update if possible
A vague thought I have is concocting some sort of nested structure of maps pointing to their contents and also to the map with a key one less than theirs or something.
I was looking at your code and I have a question.
Why do you have to copy values from the map to the slice? I mean, I think I am following the logic behind it... but I wonder if there is a way to skip that step.
So we have:
func Evolve( ..., myMap map[int][]int, ...) {
    RandomPop := rand.Intn(rangeofpopulation) + 1
    for i := RandomPop; i < rangeofpopulation; i++ {
        // slice of preselected `sites`; one of these will be `sitechosen`.
        // we expect to have `n` sites in `preallocatedslice`,
        // where `n` is the number of iterations,
        // i.e. n = rangeofpopulation - RandomPop
        preallocatedslice = append(preallocatedslice, myMap[i]...)
    }
    // Once we have a list of sites, we select `one`;
    // under a uniform distribution every site has a chance of 1/n to be selected.
    randomindex := rand.Intn(len(preallocatedslice))
    sitechosen := preallocatedslice[randomindex]
    UpdateFunction(sitechosen)
    ...
}
But what if we change that to:
func Evolve( ..., myMap map[int][]int, ...) {
    if len(myMap) == 0 {
        // Nothing to do, print a log!
        return
    }
    // This variable will hold our chosen site!
    var siteChosen []int
    // Our random population size is a value from 1 to rangeOfPopulation
    randPopSize := rand.Intn(rangeOfPopulation) + 1
    for i := randPopSize; i < rangeOfPopulation; i++ {
        // We are going to pretend that the current candidate is siteChosen
        siteChosen = myMap[i]
        // Now, instead of copying `myMap[i]` to preallocatedslice,
        // we will test whether the current candidate is actually `siteChosen` here.
        // We know that the chance for a specific candidate to be chosen is 1/n,
        // where n = rangeOfPopulation - randPopSize
        n := float64(rangeOfPopulation - randPopSize)
        // we roll the dice...
        isTheChosenOne := rand.Float64() < 1/n
        if isTheChosenOne {
            // If the candidate is the chosen site,
            // then we don't need to iterate over all the other elements.
            break
        }
    }
    // Here we know that `siteChosen` is either a.- a selected candidate, or
    // b.- the last element assigned in the loop
    // (in the case that `isTheChosenOne` was always false, which is a probable scenario).
    UpdateFunction(siteChosen)
    ...
}
Also, if you want, you can calculate n (or 1/n) outside the loop.
So the idea is to test inside the loop whether the candidate is siteChosen, and to avoid copying the candidates into a preselection pool.
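As a further sketch of the "sum up the lengths of each slice" idea the asker mentions, a uniform draw over the union can also be done without copying any sites: pick a random index into the combined length, then walk the keys to find which slice it falls in. This is O(number of keys in the range) per draw rather than O(total sites), so it is not the O(1) asked for; the function and parameter names are illustrative, not from the original code, and it assumes the math/rand import.

// pickSite draws a site uniformly from the union of myMap[k] for k in [minPop, maxPop),
// without concatenating the slices. It returns false if the union is empty.
func pickSite(myMap map[int][]int, minPop, maxPop int) (int, bool) {
    total := 0
    for k := minPop; k < maxPop; k++ {
        total += len(myMap[k])
    }
    if total == 0 {
        return 0, false
    }
    idx := rand.Intn(total) // uniform over all sites in the range
    for k := minPop; k < maxPop; k++ {
        if idx < len(myMap[k]) {
            return myMap[k][idx], true
        }
        idx -= len(myMap[k])
    }
    return 0, false // unreachable
}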

Making maps in go before anything

I am following the Go tour and something bothered me.
Maps must be created with make (not new) before use
Fair enough:
m := make(map[int]Cats)
However the very next slide shows something different:
var m = map[string]Vertex{
    "Bell Labs": Vertex{
        40.68433, -74.39967,
    },
    "Google": Vertex{
        37.42202, -122.08408,
    },
}
This slide shows how you can skip make when creating maps.
Why did the tour say maps have to be created with make before they can be used? Am I missing something here?
Actually, the only reason to use make to create a map is to preallocate space for a specific number of entries, just like with slices (except you can't set a cap on a map):
m := map[int]Cats{}
s := []Cats{}

// is the same as

m := make(map[int]Cats)
s := make([]Cats, 0, 0)
However, if you know you will have at least X items in the map, you can do something like:
m := make(map[int]Cats, 100) // this will speed things up initially
Also check http://dave.cheney.net/2014/08/17/go-has-both-make-and-new-functions-what-gives
So they're actually right that you always need to use make before using a map. The reason it doesn't look that way in the example you gave is that the make call happens implicitly. So, for example, the following two are equivalent:
m := make(map[int]string)
m[0] = "zero"
m[1] = "one"

// Equivalent to:
m := map[int]string{
    0: "zero",
    1: "one",
}
Make vs New
Now, the reason to use make vs new is slightly more subtle. The reason is that new only allocates space for a variable of the given type, whereas make actually initializes it.
To give you a sense of this distinction, imagine we had a binary tree type like this:
type Tree struct {
    root *node
}

type node struct {
    val         int
    left, right *node
}
Now you can imagine that if we had a Tree which was allocated and initialized and had some values in it, and we made a copy of that Tree value, the two values would point to the same underlying data since they'd both have the same value for root.
So what would happen if we just created a new Tree without initializing it? Something like t := new(Tree) or var t Tree? Well, t.root would be nil, so if we made a copy of t, both variables would not point to the same underlying data, and so if we added some elements to the Tree, we'd end up with two totally separate Trees.
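A small sketch of that thought experiment, reusing the Tree and node definitions above; Insert is a hypothetical helper just to put a value at the root, and package main plus the fmt import are assumed.

// Insert is a hypothetical helper; it just places val at the root if empty.
func (t *Tree) Insert(val int) {
    if t.root == nil {
        t.root = &node{val: val}
        return
    }
    // (real insertion logic omitted)
}

func main() {
    t1 := Tree{}
    t1.Insert(7) // t1.root now points at a node

    t2 := t1                 // copy the Tree value: t2.root is the same pointer as t1.root
    t2.root.val = 9          // visible through t1 as well
    fmt.Println(t1.root.val) // 9

    var u1 Tree // zero value: root == nil
    u2 := u1    // copying before initialization shares nothing
    u2.Insert(1)
    fmt.Println(u1.root == nil, u2.root != nil) // true true: two totally separate trees
}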
The same is true of maps and slices (and some others) in Go. When you make a copy of a slice variable or a map variable, both the old and the new variables refer to the same underlying data, just like an array in Java or C. Thus, if you just use new, and then make a copy and initialize the underlying data later, you'll have two totally separate data structures, which is usually not what you want.
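And to make the "implicit make" point concrete for maps, a minimal sketch of the zero-value behaviour: reading from a nil map works, writing to it panics, while make or a literal gives a usable map.

package main

import "fmt"

func main() {
    var m1 map[int]string       // zero value: nil, no storage allocated
    fmt.Println(m1[0], len(m1)) // reading a nil map is fine: "" and 0
    // m1[0] = "zero"           // would panic: assignment to entry in nil map

    m2 := make(map[int]string) // explicitly initialized
    m2[0] = "zero"

    m3 := map[int]string{0: "zero"} // literal form, initialized implicitly
    fmt.Println(m2[0], m3[0])
}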
