Is it thread safe to concurrently read/access an array in Go?

Like if I have a struct with an array and I want to do something like this
type Paxos struct {
peers []string
}
for _, peer := range px.peers {
// do stuff
}
My routines/threads will never modify the peers array, just read from it. Peers is an array of server addresses, and servers may fail, but that wouldn't affect the peers array (later RPC calls would just fail).

If no writes are involved, concurrent reads are always safe, regardless of the data structure. However, as soon as even a single concurrency-unsafe write to a variable is involved, you need to serialise concurrent access (both writes and reads) to the variable.
Moreover, you can safely write to the elements of a slice or an array, on the condition that no more than one goroutine writes to any given element.
For instance, if you run the following programme with the race detector on, it's likely to report a race condition, because multiple goroutines concurrently modify variable results without precautions:
package main
import (
"fmt"
"sync"
)
func main() {
const n = 8
var results []int
var wg sync.WaitGroup
wg.Add(n)
for i := 0; i < n; i++ {
i := i
go func() {
defer wg.Done()
results = append(results, square(i))
}()
}
wg.Wait()
fmt.Println(results)
}
func square(i int) int {
return i * i
}
However, the following programme contains no such synchronisation bug, because each element of the slice is modified by a single goroutine:
package main
import (
"fmt"
"sync"
)
func main() {
const n = 8
results := make([]int, n)
var wg sync.WaitGroup
wg.Add(n)
for i := 0; i < n; i++ {
i := i
go func() {
defer wg.Done()
results[i] = square(i)
}()
}
wg.Wait()
fmt.Println(results)
}
func square(i int) int {
return i * i
}
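Applied to the scenario in the question, where goroutines only ever read px.peers, there is no write at all, so no synchronisation is needed. Here is a minimal sketch; doStuff is a hypothetical placeholder for the per-peer work, not something from the question:
package main

import (
	"fmt"
	"sync"
)

type Paxos struct {
	peers []string
}

// doStuff stands in for the real per-peer work (e.g. an RPC call).
func doStuff(id int, peer string) {
	fmt.Printf("goroutine %d read peer %s\n", id, peer)
}

func main() {
	px := Paxos{peers: []string{"srv1:1234", "srv2:1234", "srv3:1234"}}
	var wg sync.WaitGroup
	for id := 0; id < 4; id++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			// Concurrent reads of px.peers are safe: nothing ever writes to it.
			for _, peer := range px.peers {
				doStuff(id, peer)
			}
		}(id)
	}
	wg.Wait()
}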

Yes, reads are thread-safe in Go and virtually all other languages. You're just looking up an address in memory and seeing what is there. If nothing is attempting to modify that memory, then you can have as many concurrent reads as you'd like.

Related

Issue with goroutine and Waitgroup

I am trying to iterate in a loop and call a goroutine on an anonymous function, adding to a WaitGroup on each iteration, and passing a string to that same anonymous function, which appends the value to slice a. Since I am looping 10000 times, the length of the slice is expected to be 10000, but I see random numbers. Not sure what the issue is. Can anyone help me fix this problem?
Here is my code snippet
package main
import (
"fmt"
"sync"
)
func main() {
var wg = new(sync.WaitGroup)
var a []string
for i := 0; i <= 10000; i++ {
wg.Add(1)
go func(s string) {
a = append(a, s)
wg.Done()
}("MaxPayne")
}
wg.Wait()
fmt.Println(len(a))
}
Notice how, when appending to a slice, you actually produce a new slice value and then assign it back to the slice variable. So you have uncontrolled concurrent writes to the variable a. Concurrent writes to the same variable are not safe in Go (or in most languages). To make this safe, you can serialize the writes with a mutex.
Try:
var lock sync.Mutex
var a []string
and
lock.Lock()
a = append(a, s)
lock.Unlock()
For more information about how a mutex works, see the tour and the sync package.
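Putting it together, a complete version of the snippet with the mutex in place might look like the following sketch. The loop bound is written as i < 10000 so that exactly 10000 appends happen:
package main

import (
	"fmt"
	"sync"
)

func main() {
	var wg sync.WaitGroup
	var lock sync.Mutex
	var a []string
	for i := 0; i < 10000; i++ {
		wg.Add(1)
		go func(s string) {
			defer wg.Done()
			lock.Lock() // serialize the append: only one goroutine touches a at a time
			a = append(a, s)
			lock.Unlock()
		}("MaxPayne")
	}
	wg.Wait()
	fmt.Println(len(a)) // always prints 10000
}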
Here is a pattern to achieve a similar result, but without needing a mutex and still being safe.
package main
import (
"fmt"
"sync"
)
func main() {
const sliceSize = 10000
var wg = new(sync.WaitGroup)
var a = make([]string, sliceSize)
for i := 0; i < sliceSize; i++ {
wg.Add(1)
go func(s string, index int) {
a[index] = s
wg.Done()
}("MaxPayne", i)
}
wg.Wait()
fmt.Println(len(a))
}
This isn't exactly the same as your other program, but here's what it does.
Create a slice that already has the desired size of 10,000 (each element is an empty string at this point).
For each number 0...9999, create a new goroutine that is given a specific index and a specific string to write into that index.
After all goroutines have exited and the WaitGroup is done waiting, we know that each index of the slice has been filled.
The memory access is now safe even without a mutex, because each goroutine only writes to its respective index (and each goroutine gets a unique index), so none of these concurrent memory writes conflict with each other. And after initially creating the slice with the desired size, the variable a itself never needs to be assigned again, so the original data race is eliminated.

Unsuccessful attempts at implementing concurrency

I'm having difficulty getting Go concurrency to work correctly. I'm working with data loaded from an XML data source. Once I load the data into memory, I loop through the XML elements and perform an operation. The code prior to the concurrency addition has been tested and is functional, and I don't believe it has any influence on the concurrency addition. I have two failed attempts at concurrency implementations, both with different outputs. I used locking because I don't want to run into a race condition.
For this implementation, it never enters the goroutine.
var mu sync.Mutex
// length is 197K
for i:=0;i<len(listings.Listings);i++{
go func(){
mu.Lock()
// code execution (tested prior to adding concurrency and locking)
mu.Unlock()
}()
}
For this implementation using WaitGroups, a runtime out-of-memory error occurs.
var mu sync.Mutex
var wg sync.WaitGroup
// length is 197K
for i:=0;i<len(listings.Listings);i++{
wg.Add(1)
go func(){
mu.Lock()
// code execution (tested prior to adding concurrency and locking and wait group)
wg.Done()
mu.Unlock()
}()
}
wg.Wait()
I'm not really sure what's going on and could use some assistance.
You don't need a Mutex here if you want the work to actually run concurrently.
197K goroutines are a lot; try a lower number of goroutines. You can accomplish this by creating N worker goroutines, each of them reading from the same channel.
https://play.golang.org/p/s4e0YyHdyPq
package main
import (
"fmt"
"sync"
)
type Listing struct{}
func main() {
var (
wg sync.WaitGroup
concurrency = 100
)
c := make(chan Listing)
wg.Add(concurrency)
for i := 0; i < concurrency; i++ {
go func(ci <-chan Listing) {
for l := range ci {
// code, l is a single Listing
fmt.Printf("%v", l)
}
wg.Done()
}(c)
}
// replace with your var
listings := []Listing{Listing{}}
for _, l := range listings {
c <- l
}
close(c)
wg.Wait()
}
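If you prefer to keep one goroutine per listing rather than a fixed worker pool, another option is to bound how many goroutines run at once with a semaphore built from a buffered channel. A sketch under assumed types; the Listing struct and the per-listing work are placeholders:
package main

import (
	"fmt"
	"sync"
)

type Listing struct{ ID int }

func main() {
	listings := make([]Listing, 1000) // stand-in for your ~197K listings
	const maxInFlight = 100
	sem := make(chan struct{}, maxInFlight) // buffered channel used as a counting semaphore
	var wg sync.WaitGroup
	for i := range listings {
		wg.Add(1)
		sem <- struct{}{} // blocks once maxInFlight goroutines are running
		go func(l Listing) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot when done
			// per-listing work goes here
			_ = fmt.Sprintf("%v", l)
		}(listings[i])
	}
	wg.Wait()
}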

Which goroutine is executed when using sleep in Go?

I am new to Go. I have a question about goroutines when using the time.Sleep function. Here is the code.
package main
import (
"fmt"
"time"
)
func go1(msg_chan chan string) {
for {
msg_chan <- "go1"
}
}
func go2(msg_chan chan string) {
for {
msg_chan <- "go2"
}
}
func count(msg_chan chan string) {
for {
msg := <-msg_chan
fmt.Println(msg)
time.Sleep(time.Second * 1)
}
}
func main() {
var c chan string
c = make(chan string)
go go1(c)
go go2(c)
go count(c)
var input string
fmt.Scanln(&input)
}
and output is
go1
go2
go1
go2
go1
go2
I think that while the count function is executing the sleep, go1 and go2 will run in a random sequence, so the output might look like
go1
go1
go2
go2
go2
go1
When I delete the sleep code in the count function, the result is random, as I expected.
I am stuck on this issue.
Thanks.
The first thing to notice is that there are three goroutines and all of them are independent of each other. The only thing connecting the two sender goroutines with the count goroutine is the channel on which both of them send values.
time.Sleep does not make the goroutines synchronous. By calling time.Sleep you make the count goroutine wait that long, which gives the other goroutines time to block on sending a value on the channel, ready for count to receive on its next iteration.
One more thing you can do to check this is to increase the number of CPUs, which will give you a random result.
func GOMAXPROCS(n int) int
GOMAXPROCS sets the maximum number of CPUs that can be executing
simultaneously and returns the previous setting. If n < 1, it does not
change the current setting. The number of logical CPUs on the local
machine can be queried with NumCPU. This call will go away when the
scheduler improves.
The number of CPUs available simultaneously to executing goroutines is
controlled by the GOMAXPROCS shell environment variable, whose default
value is the number of CPU cores available. Programs with the
potential for parallel execution should therefore achieve it by
default on a multiple-CPU machine. To change the number of parallel
CPUs to use, set the environment variable or use the similarly-named
function of the runtime package to configure the run-time support to
utilize a different number of threads. Setting it to 1 eliminates the
possibility of true parallelism, forcing independent goroutines to
take turns executing.
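For example, to experiment with this from code rather than the shell variable, you can call the runtime functions directly; a small sketch:
package main

import (
	"fmt"
	"runtime"
)

func main() {
	fmt.Println("logical CPUs:", runtime.NumCPU())
	prev := runtime.GOMAXPROCS(1) // force goroutines to take turns on a single CPU
	fmt.Println("previous GOMAXPROCS setting:", prev)
	fmt.Println("current GOMAXPROCS setting:", runtime.GOMAXPROCS(0)) // 0 queries without changing
}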
As for the part where the output of the goroutines is random: in principle it is always random. But the channel most probably services blocked senders in a FIFO (first in, first out) queue, since it depends on which value is available on the channel to be received; whichever value is offered first is the one the count goroutine receives and prints.
Take for example: even when I am using time.Sleep, the output is random:
package main
import (
"fmt"
"time"
)
func go1(msg_chan chan string) {
for i := 0; i < 10; i++ {
msg_chan <- fmt.Sprintf("%s%d", "go1:", i)
}
}
func go2(msg_chan chan string) {
for i := 0; i < 10; i++ {
msg_chan <- fmt.Sprintf("%s%d", "go2:", i)
}
}
func count(msg_chan chan string) {
for {
msg := <-msg_chan
fmt.Println(msg)
time.Sleep(time.Second * 1)
}
}
func main() {
var c chan string
c = make(chan string)
go go1(c)
go go2(c)
go count(c)
time.Sleep(time.Second * 20)
fmt.Println("finished")
}
This sometimes leads to a race condition, which is why we use synchronization, either with channels or with WaitGroups.
package main
import (
"fmt"
"sync"
"time"
)
var wg sync.WaitGroup
func go1(msg_chan chan string) {
defer wg.Done()
for {
msg_chan <- "go1"
}
}
func go2(msg_chan chan string) {
defer wg.Done()
for {
msg_chan <- "go2"
}
}
func count(msg_chan chan string) {
defer wg.Done()
for {
msg := <-msg_chan
fmt.Println(msg)
time.Sleep(time.Second * 1)
}
}
func main() {
var c chan string
c = make(chan string)
wg.Add(1)
go go1(c)
wg.Add(1)
go go2(c)
wg.Add(1)
go count(c)
wg.Wait()
fmt.Println("finished")
}
Now, coming to the part where you are using never-ending for loops to send values on a channel: those loops never return, so the deferred wg.Done() calls never run, wg.Wait() blocks forever, and the process hangs whether or not you remove the time.Sleep.
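If you want the program to terminate on its own, one option (a sketch, not part of the original question) is to bound the senders and close the channel once they are both done, so the receive loop can range over it and return:
package main

import (
	"fmt"
	"sync"
	"time"
)

func send(msgChan chan<- string, msg string, n int, wg *sync.WaitGroup) {
	defer wg.Done()
	for i := 0; i < n; i++ {
		msgChan <- msg
	}
}

func main() {
	c := make(chan string)
	var senders sync.WaitGroup
	senders.Add(2)
	go send(c, "go1", 3, &senders)
	go send(c, "go2", 3, &senders)

	// Close the channel once both senders have finished, so the receive loop ends.
	go func() {
		senders.Wait()
		close(c)
	}()

	for msg := range c {
		fmt.Println(msg)
		time.Sleep(100 * time.Millisecond)
	}
	fmt.Println("finished")
}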

Length of slice varies while already using a WaitGroup

I have a hard time understanding concurrency/parallelism. In my code I made a loop of 5 cycles. Inside the loop I added wg.Add(1), so in total I have 5 Adds. Here's the code:
package main
import (
"fmt"
"sync"
)
func main() {
var list []int
wg := sync.WaitGroup{}
for i := 0; i < 5; i++ {
wg.Add(1)
go func(c *[]int, i int) {
*c = append(*c, i)
wg.Done()
}(&list, i)
}
wg.Wait()
fmt.Println(len(list))
}
The main func waits until all the goroutines finish, but when I try to print the length of the slice I get random results, e.g. 1, 3, etc. Is there something missing for it to give the expected result, i.e. 5?
Is there something missing for it to give the expected result, i.e. 5?
Yes, proper synchronization. If multiple goroutines access the same variable where at least one of them is a write, you need explicit synchronization.
Your example can be "secured" with a single mutex:
var list []int
wg := sync.WaitGroup{}
mu := &sync.Mutex{} // A mutex
for i := 0; i < 5; i++ {
wg.Add(1)
go func(c *[]int, i int) {
mu.Lock() // Must lock before accessing shared resource
*c = append(*c, i)
mu.Unlock() // Unlock when we're done with it
wg.Done()
}(&list, i)
}
wg.Wait()
fmt.Println(len(list))
This will always print 5.
Note: the same slice is read at the end to print its length, yet we are not using the mutex there. This is because the WaitGroup ensures we can only get to that point after all goroutines that modify the slice have completed their job, so no data race can occur there. But in general both reads and writes have to be synchronized.
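Another way to get the same result without a mutex is to have each goroutine send its value on a channel and let the main goroutine do all the appending; then only one goroutine ever writes to the slice. A sketch:
package main

import "fmt"

func main() {
	ch := make(chan int)
	for i := 0; i < 5; i++ {
		go func(i int) {
			ch <- i // each goroutine just sends its value
		}(i)
	}
	var list []int
	for i := 0; i < 5; i++ {
		list = append(list, <-ch) // only main appends, so there is no data race
	}
	fmt.Println(len(list)) // always 5
}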
See possible duplicates:
go routine not collecting all objects from channel
Server instances with multiple users
Why does this code cause data race?
How safe are Golang maps for concurrent Read/Write operations?
golang struct concurrent read and write without Lock is also running ok?
See related questions:
Can I concurrently write different slice elements
If I am using channels properly should I need to use mutexes?
Is it safe to read a function pointer concurrently without a lock?
Concurrent access to maps with 'range' in Go

How to wait for all goroutines to finish without using time.Sleep?

This code selects all XML files in the same folder as the invoked executable and asynchronously applies processing to each result in the callback method (in the example below, just the name of the file is printed).
How do I avoid using the sleep method to keep the main method from exiting? I have problems wrapping my head around channels (I assume that's what it takes to synchronize the results), so any help is appreciated!
package main
import (
"fmt"
"io/ioutil"
"path"
"path/filepath"
"os"
"runtime"
"time"
)
func eachFile(extension string, callback func(file string)) {
exeDir := filepath.Dir(os.Args[0])
files, _ := ioutil.ReadDir(exeDir)
for _, f := range files {
fileName := f.Name()
if extension == path.Ext(fileName) {
go callback(fileName)
}
}
}
func main() {
maxProcs := runtime.NumCPU()
runtime.GOMAXPROCS(maxProcs)
eachFile(".xml", func(fileName string) {
// Custom logic goes in here
fmt.Println(fileName)
})
// This is what i want to get rid of
time.Sleep(100 * time.Millisecond)
}
You can use sync.WaitGroup. Quoting the linked example:
package main
import (
"net/http"
"sync"
)
func main() {
var wg sync.WaitGroup
var urls = []string{
"http://www.golang.org/",
"http://www.google.com/",
"http://www.somestupidname.com/",
}
for _, url := range urls {
// Increment the WaitGroup counter.
wg.Add(1)
// Launch a goroutine to fetch the URL.
go func(url string) {
// Decrement the counter when the goroutine completes.
defer wg.Done()
// Fetch the URL.
http.Get(url)
}(url)
}
// Wait for all HTTP fetches to complete.
wg.Wait()
}
WaitGroups are definitely the canonical way to do this. Just for the sake of completeness, though, here's the solution that was commonly used before WaitGroups were introduced. The basic idea is to use a channel to say "I'm done," and have the main goroutine wait until each spawned routine has reported its completion.
func main() {
c := make(chan struct{}) // We don't need any data to be passed, so use an empty struct
for i := 0; i < 100; i++ {
go func() {
doSomething()
c <- struct{}{} // signal that the routine has completed
}()
}
// Since we spawned 100 routines, receive 100 messages.
for i := 0; i < 100; i++ {
<- c
}
}
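If the goroutines can also fail and you want to propagate the first error, a related option outside the standard library is golang.org/x/sync/errgroup, which combines WaitGroup-style waiting with error handling. A sketch, assuming the module is available:
package main

import (
	"fmt"
	"net/http"

	"golang.org/x/sync/errgroup"
)

func main() {
	urls := []string{"http://www.golang.org/", "http://www.google.com/"}
	var g errgroup.Group
	for _, url := range urls {
		url := url
		g.Go(func() error {
			resp, err := http.Get(url) // fetch; the group remembers the first non-nil error
			if err == nil {
				resp.Body.Close()
			}
			return err
		})
	}
	if err := g.Wait(); err != nil { // waits for all goroutines, returns the first error
		fmt.Println("fetch failed:", err)
	}
}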
sync.WaitGroup can help you here.
package main
import (
"fmt"
"sync"
"time"
)
func wait(seconds int, wg *sync.WaitGroup) {
defer wg.Done()
time.Sleep(time.Duration(seconds) * time.Second)
fmt.Println("Slept ", seconds, " seconds ..")
}
func main() {
var wg sync.WaitGroup
for i := 0; i <= 5; i++ {
wg.Add(1)
go wait(i, &wg)
}
wg.Wait()
}
Although sync.WaitGroup is the canonical way forward, it does require you to make at least some of your wg.Add calls before you wg.Wait for all of them to complete. This may not be feasible for things like a web crawler, where you don't know the number of recursive calls beforehand and it takes a while to retrieve the data that drives the wg.Add calls. After all, you need to load and parse the first page before you know the size of the first batch of child pages.
I wrote a solution using channels, avoiding WaitGroup, in my solution to the Tour of Go web crawler exercise. Each time one or more goroutines are started, you send that number to the children channel. Each time a goroutine is about to complete, you send a 1 to the done channel. When the sum of children equals the sum of done, we are done.
My only remaining concern is the hard-coded size of the results channel, but that is a (current) Go limitation.
// recursionController is a data structure with three channels to control our Crawl recursion.
// Tried to use sync.waitGroup in a previous version, but I was unhappy with the mandatory sleep.
// The idea is to have three channels, counting the outstanding calls (children), completed calls
// (done) and results (results). Once outstanding calls == completed calls we are done (if you are
// sufficiently careful to signal any new children before closing your current one, as you may be the last one).
//
type recursionController struct {
results chan string
children chan int
done chan int
}
// instead of instantiating one instance, as we did above, use a more idiomatic Go solution
func NewRecursionController() recursionController {
// we buffer results to 1000, so we cannot crawl more pages than that.
return recursionController{make(chan string, 1000), make(chan int), make(chan int)}
}
// recursionController.Add: convenience function to add children to controller (similar to waitGroup)
func (rc recursionController) Add(children int) {
rc.children <- children
}
// recursionController.Done: convenience function to remove a child from controller (similar to waitGroup)
func (rc recursionController) Done() {
rc.done <- 1
}
// recursionController.Wait will wait until all children are done
func (rc recursionController) Wait() {
fmt.Println("Controller waiting...")
var children, done int
for {
select {
case childrenDelta := <-rc.children:
children += childrenDelta
// fmt.Printf("children found %v total %v\n", childrenDelta, children)
case <-rc.done:
done += 1
// fmt.Println("done found", done)
default:
if done > 0 && children == done {
fmt.Printf("Controller exiting, done = %v, children = %v\n", done, children)
close(rc.results)
return
}
}
}
}
Full source code for the solution
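For illustration, a hypothetical driver built on the controller definitions above might look like the sketch below; crawlPage and its toy fetchChildURLs helper are placeholders for the exercise's real fetching logic, not part of the answer itself:
// fetchChildURLs is a toy stand-in for real link extraction; the length cutoff
// just makes the example terminate.
func fetchChildURLs(url string) []string {
	if len(url) > 25 {
		return nil
	}
	return []string{url + "/a", url + "/b"}
}

func crawlPage(url string, rc recursionController) {
	defer rc.Done()   // signal this call's completion last
	rc.results <- url // record a result for this page
	children := fetchChildURLs(url)
	if len(children) > 0 {
		rc.Add(len(children)) // announce children before this call finishes
		for _, child := range children {
			go crawlPage(child, rc)
		}
	}
}

func main() {
	rc := NewRecursionController()
	go func() {
		rc.Add(1) // announce the root call; this send is received once main calls rc.Wait()
		crawlPage("https://golang.org", rc)
	}()
	rc.Wait() // returns, and closes rc.results, once children == done
	for r := range rc.results {
		fmt.Println(r)
	}
}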
Here is a solution that employs WaitGroup.
First, define 2 utility methods:
package util
import (
"sync"
)
var allNodesWaitGroup sync.WaitGroup
func GoNode(f func()) {
allNodesWaitGroup.Add(1)
go func() {
defer allNodesWaitGroup.Done()
f()
}()
}
func WaitForAllNodes() {
allNodesWaitGroup.Wait()
}
Then, replace the invocation of callback:
go callback(fileName)
With a call to your utility function:
util.GoNode(func() { callback(fileName) })
Last step: add this call at the end of your main, instead of the sleep. This makes sure the main goroutine waits for all routines to finish before the program can stop.
func main() {
// ...
util.WaitForAllNodes()
}

Resources