Golang slice append and reallocation

I've been learning Go recently and have a question about the behavior of slices when reallocation occurs. Assume I have a slice of pointers to a struct, such as:
var a []*A
My understanding is that passing a slice to another function passes a slice header by value. If that function runs on a separate goroutine and only reads from the slice, while the function that launched the goroutine continues to append to the slice, is that a problem? For example:
package main

import "fmt"

type A struct {
    foo int
}

func main() {
    a := make([]*A, 0, 100)
    ch := make(chan int)
    for i := 0; i < 100; i++ {
        a = append(a, &A{i})
    }
    go read_slice(a, ch)
    for i := 0; i < 100; i++ {
        a = append(a, &A{i + 100})
    }
    <-ch
}

func read_slice(a []*A, ch chan int) {
    for i := range a {
        fmt.Printf("%d ", a[i].foo)
    }
    ch <- 1
}
So, from my understanding, since read_slice() runs on its own goroutine with a copy of the slice header, it has a pointer to the backing array and the length as of the moment it was called, through which I can access the foo fields.
However, when the other goroutine appends to the slice, it will trigger a reallocation once the capacity is exceeded. Does the Go runtime keep the old backing array used by read_slice() alive, since there is still a reference to it in that function?
I tried running this with "go run -race slice.go", but that didn't report anything. I feel like I might be doing something wrong here; any pointers would be appreciated.
Thanks!

The GC does not collect the backing array until there are no references to the backing array. There are no races in the program.
Consider the scenario with no goroutines:
a := make([]*A, 0, 100)
for i := 0; i < 100; i++ {
    a = append(a, &A{i})
}
b := a
for i := 0; i < 100; i++ {
    b = append(b, &A{i + 100})
}
The slice a continues to reference the backing array holding the first 100 pointers when the append to b allocates a new backing array. The slice a is not left with a dangling reference to a backing array.
Now add the goroutine to the scenario:
a := make([]*A, 0, 100)
for i := 0; i < 100; i++ {
    a = append(a, &A{i})
}
b := a
go read_slice(a, ch)
for i := 0; i < 100; i++ {
    b = append(b, &A{i + 100})
}
The goroutine can happily use slice a. There's no dangling reference.
Now consider the program in the question. It's functionally identical to the last snippet here.
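To see the reallocation boundary directly, here is a minimal sketch (not part of the original answer; it reuses the question's A type) comparing element addresses and capacities after the growing append. Both headers remain valid, and the GC keeps both backing arrays alive:

package main

import "fmt"

type A struct {
    foo int
}

func main() {
    a := make([]*A, 0, 100)
    for i := 0; i < 100; i++ {
        a = append(a, &A{i})
    }
    b := a
    for i := 0; i < 100; i++ {
        b = append(b, &A{i + 100})
    }
    // The first append past cap(a) copied the elements into a new,
    // larger backing array for b; a still references the old one.
    fmt.Println(&a[0] == &b[0]) // false: different backing arrays
    fmt.Println(len(a), cap(a)) // 100 100
    fmt.Println(len(b), cap(b)) // 200 and a larger, runtime-dependent capacity
}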

Related

Deleting concurrently from a slice

I've got a slice X and I'm trying to range over it and delete some elements when a condition is true.
Doing this without any concurrency works fine, but when I try to do this concurrently I get the error:
"slice bounds out of range"
package main

import (
    "fmt"
    "sync"
)

func main() {
    X := make([][]int32, 10)
    for i := 0; i < 10; i++ {
        X[i] = []int32{int32(i), int32(i)}
    }
    ch := make(chan int, 20)
    var wg sync.WaitGroup
    fmt.Println(X)
    for i := 0; i < 20; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            for index := range X {
                // check some condition
                if X[index][0]%2 == 0 && index < len(X)-1 {
                    ch <- index
                    break
                }
            }
        }()
    }
    for {
        ind, ok := <-ch
        if ok {
            X = append(X[:ind], X[ind+1:]...)
        } else {
            fmt.Println("closed chan")
            break
        }
    }
    wg.Wait()
    fmt.Println(X)
}
https://play.golang.org/p/YeLamAU5_Rt
Is there any way to use the indexes and send it over a channel then delete the corresponding elements from different goroutines?
First: why are you trying to delete like this? Is there a larger problem you are trying to solve?
If this is how you want to do it, then:
You are using a buffered channel, so the deleting loop can remove items from the slice while the worker goroutines are still indexing into it, causing unexpected results. Also, you never close the channel, so the receiving for-loop never terminates.
Use an unbuffered channel, and run the deletion loop in a separate goroutine. Then close the channel after wg.Wait returns to terminate the deleting goroutine, as in the sketch below.
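A minimal sketch of that restructuring (the worker loop is elided; X, wg, and the indexes are carried over from the question, and note that deleting while workers are still scanning the slice still shifts what each index refers to):

ch := make(chan int) // unbuffered
done := make(chan struct{})

// deletion loop in its own goroutine
go func() {
    defer close(done)
    for ind := range ch {
        X = append(X[:ind], X[ind+1:]...)
    }
}()

// ... start the worker goroutines that send indexes on ch, as before ...

wg.Wait()
close(ch) // ends the deleter's range loop
<-done    // wait for the deleter to finish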
Alternatively, copy the elements that you want to keep to the beginning of the slice, and drop the goroutines entirely:
keep := 0
for index := range X {
    if X[index][0]%2 == 0 && index < len(X)-1 {
        continue
    }
    X[keep] = X[index]
    keep++
}
X = X[:keep]
Run it on the playground.

LoadOrStore in a sync.Map without creating a new structure each time

Is it possible to LoadOrStore into a Go sync.Map without creating a new structure every time? If not, what alternatives are available?
The use case here: I'm using the sync.Map as a cache where cache misses are rare (but possible), and on a miss I want to add to the map. With LoadOrStore I have to initialize a structure on every call rather than only when it is actually needed. I'm worried this will hurt the GC, allocating hundreds of thousands of structures that will never be used.
In Java this can be done using computeIfAbsent.
You can try:
var m sync.Map
s, ok := m.Load("key")
if !ok {
    s, _ = m.LoadOrStore("key", "value")
}
fmt.Println(s)
play demo
This is my solution: use sync.Map and sync.Once.
type syncData struct {
    data interface{}
    once *sync.Once
}

func LoadOrStore(m *sync.Map, key string, f func() (interface{}, error)) (interface{}, error) {
    temp, _ := m.LoadOrStore(key, &syncData{
        data: nil,
        once: &sync.Once{},
    })
    d := temp.(*syncData)
    var err error
    if d.data == nil {
        d.once.Do(func() {
            d.data, err = f()
            if err != nil {
                // if f failed, allow a retry with a new sync.Once
                d.once = &sync.Once{}
            }
        })
    }
    return d.data, err
}
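For illustration, a hypothetical use of this wrapper (the key and the fetchUser helper are invented for the example); f runs at most once per key even when many goroutines miss at the same time:

var m sync.Map

v, err := LoadOrStore(&m, "user:42", func() (interface{}, error) {
    // expensive initialization, executed only on the first miss
    return fetchUser(42) // hypothetical helper
})
if err == nil {
    fmt.Println(v)
}

Note the retry behavior: on error the wrapper swaps in a fresh sync.Once, so a later call can attempt the initialization again.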
From the sync package documentation for type Map:
Map is like a Go map[interface{}]interface{} but is safe for
concurrent use by multiple goroutines without additional locking or
coordination. Loads, stores, and deletes run in amortized constant
time.
The Map type is specialized. Most code should use a plain Go map
instead, with separate locking or coordination, for better type safety
and to make it easier to maintain other invariants along with the map
content.
The Map type is optimized for two common use cases: (1) when the entry
for a given key is only ever written once but read many times, as in
caches that only grow, or (2) when multiple goroutines read, write,
and overwrite entries for disjoint sets of keys. In these two cases,
use of a Map may significantly reduce lock contention compared to a Go
map paired with a separate Mutex or RWMutex.
The usual way to solve these problems is to construct a usage model and then benchmark it.
For example, since "cache misses are rare", assume that Load will succeed most of the time and only call LoadOrStore (with value allocation and initialization) when necessary.
$ go test map_test.go -bench=. -benchmem
BenchmarkHit-4     2    898810447 ns/op       44536 B/op       1198 allocs/op
BenchmarkMiss-4    1   2958103053 ns/op   483957168 B/op   43713042 allocs/op
$
map_test.go:
package main

import (
    "strconv"
    "sync"
    "testing"
)

func BenchmarkHit(b *testing.B) {
    for N := 0; N < b.N; N++ {
        var m sync.Map
        for i := 0; i < 64*1024; i++ {
            for k := 0; k < 256; k++ {
                // Assume cache hit
                v, ok := m.Load(k)
                if !ok {
                    // allocate and initialize value
                    v = strconv.Itoa(k)
                    a, loaded := m.LoadOrStore(k, v)
                    if loaded {
                        v = a
                    }
                }
                _ = v
            }
        }
    }
}

func BenchmarkMiss(b *testing.B) {
    for N := 0; N < b.N; N++ {
        var m sync.Map
        for i := 0; i < 64*1024; i++ {
            for k := 0; k < 256; k++ {
                // Assume cache miss
                // allocate and initialize value
                var v interface{} = strconv.Itoa(k)
                a, loaded := m.LoadOrStore(k, v)
                if loaded {
                    v = a
                }
                _ = v
            }
        }
    }
}

atomic AddUint32 overflow

I'm using the code below to get unique IDs within a process:
var counter uint32

for i := 0; i < 10; i++ {
    go func() {
        for {
            atomic.AddUint32(&counter, 1)
            time.Sleep(time.Millisecond)
        }
    }()
}
What will happen if the counter value overflows uint32's limit?
The value wraps around: unsigned integer arithmetic in Go is defined to be computed modulo 2^n (here 2^32), so nothing panics and the counter simply starts over from 0. This is very easy to demonstrate:
u := uint32(math.MaxUint32)
fmt.Println(u)
u++
fmt.Println(u)
// or
u = math.MaxUint32
atomic.AddUint32(&u, 1)
fmt.Println(u)
https://play.golang.org/p/lCOM3nMYNc

Saving results from a parallelized goroutine

I am trying to parallelize an operation in golang and save the results in a structure I can iterate over to sum up afterwards.
I have managed to set up the parameters so that no deadlock occurs, and I have confirmed that the operations work and are saved correctly within the function. But when I iterate over the slice of my struct and try to sum up the results of the operation, they all remain 0. I have tried passing by reference, with pointers, and with channels (which caused a deadlock).
I have only found this example for help: https://golang.org/doc/effective_go.html#parallel. But it seems outdated now, as Vector has been deprecated? I also have not found any references to the way the function in the example was declared (with the func (u Vector) before the name). I tried replacing this with a slice but got compile-time errors.
Any help would be very much appreciated. Here are the key parts of my code:
type Job struct {
    a      int
    b      int
    result *big.Int
}

func choose(jobs []Job, c chan int) {
    temp := new(big.Int)
    for _, job := range jobs {
        job.result = // perform operation on job.a and job.b using temp
        // fmt.Println(job.result)
    }
    c <- 1
}
func main() {
    num := 100 // can be very large (why we need big.Int)
    n := num
    k := 0
    const numCPU = 6 // runtime.NumCPU()
    count := new(big.Int)
    // create a 2d slice of jobs, one for each core
    jobs := make([][]Job, numCPU)
    for float64(k) <= math.Ceil(float64(num/2)) {
        // add one job to each core, alternating so that
        // the job set is similar in difficulty
        for i := 0; i < numCPU; i++ {
            if !(float64(k) <= math.Ceil(float64(num/2))) {
                break
            }
            jobs[i] = append(jobs[i], Job{n, k, new(big.Int)})
            n -= 1
            k += 1
        }
    }
    c := make(chan int, numCPU)
    for i := 0; i < numCPU; i++ {
        go choose(jobs[i], c)
    }
    // drain the channel
    for i := 0; i < numCPU; i++ {
        <-c
    }
    // computations are done
    for i := range jobs {
        for _, job := range jobs[i] {
            // fmt.Println(job.result)
            count.Add(count, job.result)
        }
    }
    fmt.Println(count)
}
Here is the code running on the go playground https://play.golang.org/p/X5IYaG36U-
As long as the []Job slice is only modified by one goroutine at a time, there's no reason you can't modify the job in place.
for i, job := range jobs {
    // assign through the index so the write lands in the slice element,
    // and give each job its own *big.Int rather than sharing one value
    jobs[i].result = new(big.Int).Binomial(int64(job.a), int64(job.b))
}
https://play.golang.org/p/CcEGsa1fLh
You should also use a sync.WaitGroup, rather than rely on counting tokens in a channel yourself; a sketch follows.
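A minimal sketch of that change (not from the original answer; Binomial stands in for the elided operation):

func choose(jobs []Job, wg *sync.WaitGroup) {
    defer wg.Done()
    for i, job := range jobs {
        jobs[i].result = new(big.Int).Binomial(int64(job.a), int64(job.b))
    }
}

// in main, replacing the counting channel:
var wg sync.WaitGroup
for i := 0; i < numCPU; i++ {
    wg.Add(1)
    go choose(jobs[i], &wg)
}
wg.Wait() // every result is written before the summing loop runs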

Slice append from channels

I want to create a slice and add values returned from a channel.
Below is the code I tried, but I could not get it to work.
I think I have to send the address of the slice, but I am not able to figure out how :(
package main

import "fmt"
import "time"

func sendvalues(cs chan int) {
    for i := 0; i < 10; i++ {
        cs <- i
    }
}

func appendInt(cs chan int, aINt []int) []*int {
    for {
        select {
        case i := <-cs:
            aINt = append(aINt, i) // append returns a new slice, right?
            fmt.Println("slice", aINt)
        }
    }
}

func main() {
    cs := make(chan int)
    intSlice := make([]int, 0, 10)
    fmt.Println("Before", intSlice)
    go sendvalues(cs)
    go appendInt(cs, intSlice) // I have to pass the address here
    time.Sleep(999 * 999999)
    fmt.Println("After", intSlice)
}
Your code won't work for two (in fact three) reasons:
append returns a new slice as soon as the capacity is exceeded.
Thus, the assignment in appendInt does nothing visible to main.
appendInt runs concurrently, therefore:
As long as appendInt does not signal main that it is finished,
main does not know when intSlice has all the values you want.
You have to wait for all goroutines to finish before the end of main.
Problem 1: Modifying slices in functions
You may know that every value you pass to a function in Go is copied. Reference values,
such as slices, are copied too, but they contain an internal pointer to the backing array,
which still points at the original memory. That means you can modify the elements of a slice in a function. What you
can't do is reassign the caller's variable to a new slice, as the internal pointer would then point somewhere different. You need a pointer to the slice for that. Example (Play):
func modify(s *[]int) {
    for i := 0; i < 10; i++ {
        *s = append(*s, i)
    }
}

func main() {
    s := []int{1, 2, 3}
    modify(&s)
    fmt.Println(s)
}
Problem 2: Synchronizing goroutines
To wait for started goroutines, you can use a sync.WaitGroup. Example (Play):
func modify(wg *sync.WaitGroup, s *[]int) {
    defer wg.Done()
    for i := 0; i < 10; i++ {
        *s = append(*s, i)
    }
}

func main() {
    wg := &sync.WaitGroup{}
    s := []int{1, 2, 3}
    wg.Add(1)
    go modify(wg, &s)
    wg.Wait()
    fmt.Println(s)
}
The example above waits (using wg.Wait()) for modify to finish
(modify calls wg.Done() when finished). If you remove the wg.Wait() call, you will
see why not synchronizing is a problem. Comparison of outputs:
With wg.Wait(): [1 2 3 0 1 2 3 4 5 6 7 8 9]
Without wg.Wait(): [1 2 3]
The main goroutine returns earlier than the modify goroutine which is why you will never
see the modified results. Therefore synchronizing is absolutely necessary.
A good way to communicate the new slice would be to use a channel. You would not need to
use pointers and you would have synchronization. Example (Play):
func modify(res chan []int) {
    s := []int{}
    for i := 0; i < 10; i++ {
        s = append(s, i)
    }
    res <- s
}

func main() {
    c := make(chan []int)
    go modify(c)
    s := <-c
    fmt.Println(s)
}
