Below is a very simple, common case in Go, but the result is not what I expected.
package main
import (
"fmt"
"time"
)
func main() {
consumer(generator())
for {
time.Sleep(time.Duration(time.Second))
}
}
// simple generator through channel
func generator() <-chan []byte {
ret := make(chan []byte)
go func() {
// make buf outside of loop, and result is not expected
var ch = byte('A')
count := 0
buf := make([]byte, 1)
for {
if count > 10 {
return
}
// make buf inside loop, and result is expected
// buf := make([]byte, 1)
buf[0] = ch
ret <- buf
ch++
count++
// time.Sleep(time.Duration(time.Second))
}
}()
return ret
}
// simple consumer through channel
func consumer(recv <-chan []byte) {
go func() {
for buf := range recv {
fmt.Println("received:" + string(buf[0]))
}
}()
}
output:
received:A
received:B
received:D
received:D
received:F
received:F
received:H
received:H
received:J
received:J
received:K
In generator, if I declare the buf variable inside the for loop, the result is what I expected:
received:A
received:B
received:C
received:D
received:E
received:F
received:G
received:H
received:I
received:J
received:K
My thinking was that even if buf is declared outside the for loop and never reallocated, once we write it to the channel the receiver will read it before the next write can happen, so its content should not be overwritten. But Go does not seem to behave this way. What went wrong here?
Problem: your code contains a data race
Save your program in a file named main.go, then run it with the race detector: go run -race main.go. You should see something like the following:
$ go run -race main.go
received:A
==================
WARNING: DATA RACE
Write at 0x00c000180000 by goroutine 7:
main.generator.func1()
/redacted/main.go:29 +0x8c
Previous read at 0x00c000180000 by goroutine 8:
main.consumer.func1()
/redacted/main.go:43 +0x55
The race detector tells you your program contains a data race because two goroutines are writing to and reading from the same memory without synchronisation:
the anonymous function launched as a goroutine in your generator function updates its local variable named buf at line 29;
the anonymous function launched as a goroutine in your consumer function reads from its local variable named buf at line 43.
The data race stems from the conjunction of two things:
Although local variable buf in consumer is just a copy of the homonymous local variable in generator, those slice variables are coupled because they refer to the same underlying array.
See the relevant section of the language specification (https://golang.org/ref/spec#Slice_types):
A slice, once initialized, is always associated with an underlying array that holds its elements. A slice therefore shares storage with its array and with other slices of the same array [...]
Operations on slices are not concurrency-safe and require proper synchronisation if performed concurrently (i.e. from multiple goroutines at the same time).
What your code displays is a typical case of aliasing. You would do well to familiarise yourself with how slices work.
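To make the aliasing concrete, here is a minimal, single-goroutine sketch. The function names sharesStorage and independentCopy are made up for this illustration; they are not part of your program. The first shows that assigning a slice copies only the slice header, so a later write is visible through both variables. The second shows that an explicit copy of the elements breaks the coupling.

```go
package main

import "fmt"

// sharesStorage (hypothetical helper) reports whether a write made through
// one slice variable is visible through a plain copy of that variable.
func sharesStorage() bool {
	buf := make([]byte, 1)
	buf[0] = 'A'
	alias := buf // copies the slice header only, not the underlying array
	buf[0] = 'B' // write through the original variable
	return alias[0] == 'B'
}

// independentCopy (hypothetical helper) does the same, but copies the
// elements into a freshly allocated slice first.
func independentCopy() bool {
	buf := []byte{'A'}
	dup := make([]byte, len(buf))
	copy(dup, buf) // dup now has its own underlying array
	buf[0] = 'B'
	return dup[0] == 'A'
}

func main() {
	fmt.Println(sharesStorage())   // true: both variables see the write
	fmt.Println(independentCopy()) // true: the copy is unaffected
}
```

Sending a slice on a channel behaves like the assignment in sharesStorage: the receiver gets a copy of the header, not of the bytes.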
Solution
You could eliminate the data race by using a one-byte array ([1]byte) instead of a slice, but arrays are quite inflexible in Go. Whether you really need to use a slice of bytes at all here is unclear. Since you're effectively only sending one byte at a time to the channel, why not simply use a chan byte rather than a chan []byte?
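For completeness, here is a rough sketch of the array variant mentioned above. It assumes a generator that takes the number of values as a parameter n, and a helper named received; both are my own choices for the example. Because arrays are values in Go, each send copies the whole [1]byte, so the consumer never looks at memory the producer is still writing to.

```go
package main

import "fmt"

// generator sends n one-byte arrays, starting at 'A'.
// The parameter n is an assumption made for this sketch.
func generator(n int) <-chan [1]byte {
	ret := make(chan [1]byte)
	go func() {
		defer close(ret) // lets the consumer's range loop terminate
		var buf [1]byte
		for i := 0; i < n; i++ {
			buf[0] = 'A' + byte(i)
			ret <- buf // an array is a value: the send copies it
		}
	}()
	return ret
}

// received (hypothetical helper) drains the generator and returns
// everything it received as a string.
func received(n int) string {
	var out []byte
	for buf := range generator(n) {
		out = append(out, buf[0])
	}
	return string(out)
}

func main() {
	fmt.Println(received(11)) // ABCDEFGHIJK
}
```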
Other improvements unrelated to the data race include:
modifying the API of your two functions to make them synchronous (and therefore, easier to reason about);
simplifying the generator logic and closing the channel so that main can actually terminate;
simplifying the consumer logic and not spawning a goroutine for it.
package main
import "fmt"
func main() {
ch := make(chan byte)
go generator(ch)
consumer(ch)
}
func generator(ch chan<- byte) {
var c byte = 'A'
for i := 0; i < 10; i++ {
ch <- c
c++
}
close(ch)
}
func consumer(ch <-chan byte) {
for c := range ch {
fmt.Printf("received: %c\n", c)
}
}
The case is very simple. Both goroutines hold a reference to the same buffer, so the channel does not synchronize access to its contents. While the consumer is still reading the buffer it received, the generator is fast enough to modify it, which is why characters get skipped. To fix this, you have to either introduce another channel (one that sends the buffer back once the consumer is done with it) or pass a copy of the buffer.
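A possible sketch of the "pass a copy" option follows. The parameter n and the collect helper are assumptions added to keep the example short and terminating; the rest follows the shape of the original generator. The scratch buffer stays private to the producing goroutine, and only a fresh copy ever crosses the channel.

```go
package main

import "fmt"

// generator sends n one-byte slices, starting at 'A'. Each value sent is a
// freshly allocated copy, so producer and consumer never share an array.
func generator(n int) <-chan []byte {
	ret := make(chan []byte)
	go func() {
		defer close(ret)
		buf := make([]byte, 1) // scratch buffer, used only by this goroutine
		for i := 0; i < n; i++ {
			buf[0] = 'A' + byte(i)
			out := make([]byte, len(buf))
			copy(out, buf) // hand over a copy, keep buf for ourselves
			ret <- out
		}
	}()
	return ret
}

// collect (hypothetical helper) drains the channel and concatenates
// everything it received.
func collect(recv <-chan []byte) string {
	var s []byte
	for buf := range recv {
		s = append(s, buf...)
	}
	return string(s)
}

func main() {
	fmt.Println(collect(generator(11))) // ABCDEFGHIJK
}
```

This costs one small allocation per message; whether that matters depends on your workload.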
Related
Assume I have a bunch of files to deal with (say 1000 or more). First they should be processed by function A(); function A() will generate a file, and then this file will be processed by B().
If we do it one by one, that's too slow, so I'm thinking of processing 5 files at a time using goroutines (we cannot process too many at a time because the CPU cannot bear it).
I'm a newbie in Go and I'm not sure if my thinking is correct. I think function A() is a producer and function B() is a consumer: function B() will deal with the file produced by function A(). I wrote some code below, but I really don't know how to write it properly. Can anyone help me? Thank you in advance!
package main
import "fmt"
var Box = make(chan string, 1024)
func A(file string) {
fmt.Println(file, "is processing in func A()...")
fileGenByA := "/path/to/fileGenByA1"
Box <- fileGenByA
}
func B(file string) {
fmt.Println(file, "is processing in func B()...")
}
func main() {
// assuming that this is the file list read from a directory
fileList := []string{
"/path/to/file1",
"/path/to/file2",
"/path/to/file3",
}
// it seems I can't do this, because fileList may have 1000 or more file
for _, v := range fileList {
go A(v)
}
// can I do this?
for file := range Box {
go B(file)
}
}
Update:
Sorry, maybe I haven't made myself clear. The file generated by function A() is stored on the hard disk (it is generated by a command-line tool, which I simply execute using exec.Command()), not in a variable (in memory), so it doesn't have to be passed to function B() immediately.
I think there are 2 approaches:
approach1
approach2
Actually I prefer approach2. As you can see, the first B() doesn't have to process file1GenByA; B() can process any file in the box, because file1GenByA may be generated after file2GenByA (maybe the file is larger, so it takes more time).
You could spawn 5 goroutines that read from a work channel. That way you have at all times 5 goroutines running and don't need to batch them so that you have to wait until 5 are finished to start the next 5.
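package main
import (
"fmt"
"sync"
)
// A and B are placeholder stubs (not part of the original snippet) so that
// the example compiles; replace them with the real processing steps.
func A(s string) string { return s + " processed by A" }
func B(s string) string { return s + ", then by B" }
func main() {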
stack := []string{"a", "b", "c", "d", "e", "f", "g", "h"}
work := make(chan string)
results := make(chan string)
// create worker 5 goroutines
wg := sync.WaitGroup{}
for i := 0; i < 5; i++ {
wg.Add(1)
go func() {
defer wg.Done()
for s := range work {
results <- B(A(s))
}
}()
}
// send the work to the workers
// this happens in a goroutine in order
// to not block the main function, once
// all 5 workers are busy
go func() {
for _, s := range stack {
// could read the file from disk
// here and pass a pointer to the file
work <- s
}
// close the work channel after
// all the work has been sent
close(work)
// wait for the workers to finish
// then close the results channel
wg.Wait()
close(results)
}()
// collect the results
// the iteration stops if the results
// channel is closed and the last value
// has been received
for result := range results {
// could write the file to disk
fmt.Println(result)
}
}
https://play.golang.com/p/K-KVX4LEEoK
You're halfway there. There are a few things you need to fix:
Your program deadlocks because nothing closes Box, so the main function can never finish ranging over it.
You aren't waiting for your goroutines to finish, and there are more than 5 goroutines. (The solutions to these are too intertwined to describe them separately.)
1. Deadlock
fatal error: all goroutines are asleep - deadlock!
goroutine 1 [chan receive]:
main.main()
When you range over a channel, you read each value from the channel until it is both closed and empty. Since you never close the channel, the range over that channel can never complete, and the program can never finish.
This is a fairly easy problem to solve in your case: we just need to close the channel when we know there will be no more writes to the channel.
for _, v := range fileList {
go A(v)
}
close(Box)
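Keep in mind that closing a channel doesn't stop it from being read, only from being written to. Now consumers can distinguish between an empty channel that may receive more data in the future, and an empty channel that will never receive more data.
Once you add the close(Box), the program doesn't deadlock anymore, but it still doesn't work: nothing guarantees that the A goroutines have sent their values before close(Box) runs (and a send on a closed channel panics), and main does not wait for the B goroutines to finish.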
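Here is a small sketch of what a reader sees after a channel has been closed. The channel name box and the helper drainAfterClose are invented for the example. Values that were already buffered are still delivered, the range loop then ends, and a further receive reports ok == false.

```go
package main

import "fmt"

// drainAfterClose (hypothetical helper) buffers two values, closes the
// channel, and only then starts reading from it.
func drainAfterClose() (values []string, okAfterDrain bool) {
	box := make(chan string, 2)
	box <- "file1"
	box <- "file2"
	close(box) // no more sends allowed from here on

	// Receiving still works: range delivers the buffered values and
	// terminates once the channel is both closed and empty.
	for v := range box {
		values = append(values, v)
	}

	// A receive on a closed, empty channel returns immediately with the
	// zero value and ok == false.
	_, okAfterDrain = <-box
	return values, okAfterDrain
}

func main() {
	values, ok := drainAfterClose()
	fmt.Println(values, ok) // [file1 file2] false
}
```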
2. Too Many Goroutines and not waiting for them to complete
To run a certain maximum number of concurrent executions, instead of creating a goroutine for each input, create the goroutines in a "worker pool":
Create a channel to pass the workers their work
Create a channel for the goroutines to return their results, if any
Start the number of goroutines you want
Start at least one additional goroutine to either dispatch work or collect the result, so you don't have to try doing both from the main goroutine
use a sync.WaitGroup to wait for all data to be processed
close the channels to signal to the workers and the results collector that their channels are done being filled.
Before we get into the implementation, let's talk about how A and B interact.
first they should be processed by function A(), function A() will generate a file, then this file will be processed by B().
A() and B() must, then, execute serially. They could still pass their data through a channel, but since their execution must be serial, that gains you nothing. It is simpler to run them sequentially in the workers. For that, we need to change A() to either call B() itself, or return the path for B() so that the worker can call it. I choose the latter.
func A(file string) string {
fmt.Println(file, "is processing in func A()...")
fileGenByA := "/path/to/fileGenByA1"
return fileGenByA
}
Before we write our worker function, we must also consider the result of B. Currently, B returns nothing. In the real world, unless B() cannot fail, you would want to at least return the error or, failing that, panic. I'll skip over collecting results for now.
Now we can write our worker function.
func worker(wg *sync.WaitGroup, incoming <-chan string) {
defer wg.Done()
for file := range incoming {
B(A(file))
}
}
Now all we have to do is start 5 such workers, write the incoming files to the channel, close it, and wg.Wait() for the workers to complete.
incoming_work := make(chan string)
var wg sync.WaitGroup
for i := 0; i < 5; i++ {
wg.Add(1)
go worker(&wg, incoming_work)
}
for _, v := range fileList {
incoming_work <- v
}
close(incoming_work)
wg.Wait()
Full example at https://go.dev/play/p/A1H4ArD2LD8
Returning Results.
It's all well and good to be able to kick off goroutines and wait for them to complete. But what if you need results back from your goroutines? In all but the simplest of cases, you would at least want to know if files failed to process so you could investigate the errors.
We have only 5 workers, but we have many files, so we have many results. Each worker will have to return several results. So, another channel. It's usually worth defining a struct for your return:
type result struct {
file string
err error
}
This tells us not just whether there was an error but also which file the error came from.
How will we test an error case in our current code? In your example, B always gets the same value from A. If we add A's incoming file name to the path it passes to B, we can mock an error based on a substring. My mocked error will be that file3 fails.
func A(file string) string {
fmt.Println(file, "is processing in func A()...")
fileGenByA := "/path/to/fileGenByA1/" + file
return fileGenByA
}
func B(file string) (r result) {
r.file = file
fmt.Println(file, "is processing in func B()...")
if strings.Contains(file, "file3") {
r.err = fmt.Errorf("Test error")
}
return
}
Our workers will be sending results, but we need to collect them somewhere. main() is busy dispatching work to the workers, blocking on its write to incoming_work when the workers are all busy. So the simplest place to collect the results is another goroutine. Our results collector goroutine has to read from a results channel, print out errors for debugging, and then return the total number of failures so our program can return a final exit status indicating overall success or failure.
failures_chan := make(chan int)
go func() {
var failures int
for result := range results {
if result.err != nil {
failures++
fmt.Printf("File %s failed: %s", result.file, result.err.Error())
}
}
failures_chan <- failures
}()
Now we have another channel to close, and it's important we close it after all workers are done. So we close(results) after we wg.Wait() for the workers.
close(incoming_work)
wg.Wait()
close(results)
if failures := <-failures_chan; failures > 0 {
os.Exit(1)
}
Putting all that together, we end up with this code:
package main
import (
"fmt"
"os"
"strings"
"sync"
)
func A(file string) string {
fmt.Println(file, "is processing in func A()...")
fileGenByA := "/path/to/fileGenByA1/" + file
return fileGenByA
}
func B(file string) (r result) {
r.file = file
fmt.Println(file, "is processing in func B()...")
if strings.Contains(file, "file3") {
r.err = fmt.Errorf("Test error")
}
return
}
func worker(wg *sync.WaitGroup, incoming <-chan string, results chan<- result) {
defer wg.Done()
for file := range incoming {
results <- B(A(file))
}
}
type result struct {
file string
err error
}
func main() {
// assuming that this is the file list read from a directory
fileList := []string{
"/path/to/file1",
"/path/to/file2",
"/path/to/file3",
}
incoming_work := make(chan string)
results := make(chan result)
var wg sync.WaitGroup
for i := 0; i < 5; i++ {
wg.Add(1)
go worker(&wg, incoming_work, results)
}
failures_chan := make(chan int)
go func() {
var failures int
for result := range results {
if result.err != nil {
failures++
fmt.Printf("File %s failed: %s", result.file, result.err.Error())
}
}
failures_chan <- failures
}()
for _, v := range fileList {
incoming_work <- v
}
close(incoming_work)
wg.Wait()
close(results)
if failures := <-failures_chan; failures > 0 {
os.Exit(1)
}
}
And when we run it, we get:
/path/to/file1 is processing in func A()...
/path/to/fileGenByA1//path/to/file1 is processing in func B()...
/path/to/file2 is processing in func A()...
/path/to/fileGenByA1//path/to/file2 is processing in func B()...
/path/to/file3 is processing in func A()...
/path/to/fileGenByA1//path/to/file3 is processing in func B()...
File /path/to/fileGenByA1//path/to/file3 failed: Test error
Program exited.
A final thought: buffered channels.
There is nothing wrong with buffered channels. Especially if you know the overall size of incoming work and results, buffered channels can obviate the results collector goroutine because you can allocate a buffered channel big enough to hold all results. However, I think it's more straightforward to understand this pattern if the channels are unbuffered. The key takeaway is that you don't need to know the number of incoming or outgoing results, which could indeed be different numbers or based on something that can't be predetermined.
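As a rough illustration of the buffered variant, and only under the assumption that the number of inputs is known up front, the results channel can be given a capacity of len(files). Workers can then never block on sending a result, and main can read the results after wg.Wait() without a separate collector goroutine. The names process and countFailures are made up here; process stands in for B(A(file)).

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

type result struct {
	file string
	err  error
}

// process is a hypothetical stand-in for B(A(file)); it fails for "file3".
func process(file string) result {
	r := result{file: file}
	if strings.Contains(file, "file3") {
		r.err = fmt.Errorf("test error")
	}
	return r
}

// countFailures runs at most `workers` goroutines over files and returns
// how many files failed.
func countFailures(files []string, workers int) int {
	work := make(chan string)
	// Room for every result, so a worker's send never blocks.
	results := make(chan result, len(files))

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for f := range work {
				results <- process(f)
			}
		}()
	}

	for _, f := range files {
		work <- f
	}
	close(work)
	wg.Wait()      // all results are now sitting in the buffer
	close(results) // lets the range below terminate

	failures := 0
	for r := range results {
		if r.err != nil {
			failures++
		}
	}
	return failures
}

func main() {
	files := []string{"/path/to/file1", "/path/to/file2", "/path/to/file3"}
	fmt.Println("failures:", countFailures(files, 5)) // failures: 1
}
```

The trade-off is memory: the buffer holds every result at once, which is fine for a few thousand small structs but not for an unbounded stream.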
I am trying to run a loop that starts a goroutine with an anonymous function on each iteration, adding to a WaitGroup each time. I pass a string to that anonymous function, which appends the value to slice a. Since I am looping 10000 times, the length of the slice is expected to be 10000, but I see random numbers. I am not sure what the issue is. Can anyone help me fix this problem?
Here is my code snippet
package main
import (
"fmt"
"sync"
)
func main() {
var wg = new(sync.WaitGroup)
var a []string
for i := 0; i < 10000; i++ {
wg.Add(1)
go func(s string) {
a = append(a, s)
wg.Done()
}("MaxPayne")
}
wg.Wait()
fmt.Println(len(a))
}
Notice that when you append to a slice, you actually make a new slice value and then assign it back to the slice variable. So you have uncontrolled concurrent writes to the variable a. Concurrent writes to the same variable are not safe in Go (nor in most languages). To make them safe, you can serialize the writes with a mutex.
Try:
var lock sync.Mutex
var a []string
and
lock.Lock()
a = append(a, s)
lock.Unlock()
For more information about how a mutex works, see the tour and the sync package.
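Put together, the two fragments above might look like the following complete program. Wrapping the loop in a function named appendConcurrently is my own choice, made so that the count is easy to check; the locking itself is exactly what the fragments describe.

```go
package main

import (
	"fmt"
	"sync"
)

// appendConcurrently (hypothetical helper) starts n goroutines that each
// append one string to a shared slice, and returns the final length.
func appendConcurrently(n int) int {
	var (
		wg   sync.WaitGroup
		lock sync.Mutex
		a    []string
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(s string) {
			defer wg.Done()
			lock.Lock() // only one goroutine may touch a at a time
			a = append(a, s)
			lock.Unlock()
		}("MaxPayne")
	}
	wg.Wait() // after this, no goroutine is still writing to a
	return len(a)
}

func main() {
	fmt.Println(appendConcurrently(10000)) // 10000
}
```

Running it with go run -race should no longer report a race on a.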
Here is a pattern to achieve a similar result, but without needing a mutex and still being safe.
package main
import (
"fmt"
"sync"
)
func main() {
const sliceSize = 10000
var wg = new(sync.WaitGroup)
var a = make([]string, sliceSize)
for i := 0; i < sliceSize; i++ {
wg.Add(1)
go func(s string, index int) {
a[index] = s
wg.Done()
}("MaxPayne", i)
}
wg.Wait()
}
This isn't exactly the same as your other program, but here's what it does.
Create a slice that already has the desired size of 10,000 (each element is an empty string at this point)
For each number 0...9999, create a new goroutine that is given a specific index to write a specific string into
After all goroutines have exited and the waitgroup is done waiting, then we know that each index of the slice has successfully been filled.
The memory access is now safe even without a mutex, because each goroutine is only writing to its respective index (and each goroutine gets a unique index). Therefore, none of these concurrent memory writes conflict with each other. After initially creating the slice with the desired size, the variable a itself doesn't need to be assigned to again, so the original memory race is eliminated.
This might be a rookie mistake. I have a slice of strings and a map of channels. For each string in the slice, a channel is created and a map entry is created for it, with the string as key.
I watch the channels and pass a value to one of them, but the value is never received.
package main
import (
"fmt"
"time"
)
type TestStruct struct {
Test string
}
var channelsMap map[string](chan *TestStruct)
func main() {
stringsSlice := []string{"value1"}
channelsMap := make(map[string](chan *TestStruct))
for _, value := range stringsSlice {
channelsMap[value] = make(chan *TestStruct, 1)
go watchChannel(value)
}
<-time.After(3 * time.Second)
testStruct := new(TestStruct)
testStruct.Test = "Hello!"
channelsMap["value1"] <- testStruct
<-time.After(3 * time.Second)
fmt.Println("Program ended")
}
func watchChannel(channelMapKey string) {
fmt.Println("Watching channel: " + channelMapKey)
for channelValue := range channelsMap[channelMapKey] {
fmt.Printf("Channel '%s' used. Passed value: '%s'\n", channelMapKey, channelValue.Test)
}
}
Playground link: https://play.golang.org/p/IbucTqMjdGO
Output:
Watching channel: value1
Program ended
How do I execute something when the message is fed into the channel?
There are many problems with your approach.
The first one is that you're redeclaring ("shadowing") the global
variable channelsMap in your main function.
(Had you completed even the most basic
intro to Go, you would not have had this problem.)
This means that your watchChannel (actually, all the goroutines which execute that function) read the global channelsMap while your main function writes to its local channelsMap.
What happens next, is as follows:
The range statement
in the watchChannel has a simple
map lookup expression as its source—channelsMap[channelMapKey].
In Go, this form of map lookup
never fails, but if the map has no such key (or if the map is not initialized, that is, it's nil), the so-called
"zero value"
of the appropriate type is returned.
Since the global channelsMap is always empty, any call to watchChannel performs a map lookup which always returns
the zero value of type chan *TestStruct.
The zero value for any channel is nil.
A range statement executed over a nil channel
blocks forever, because a receive from a nil channel never completes.
In other words, the body of the for loop in watchChannel never
executes.
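The two facts this chain rests on can be checked in isolation. The sketch below is only an illustration: lookupIsNil and receiveBlocks are names I made up, the map holds chan int rather than chan *TestStruct to keep it short, and the 50 ms timeout is an arbitrary value used to observe that a receive on a nil channel does not complete, which is why the loop body in watchChannel never runs.

```go
package main

import (
	"fmt"
	"time"
)

// channelsMap is declared but never initialised, so it is a nil map,
// just like the global in the question.
var channelsMap map[string]chan int

// lookupIsNil reports whether a lookup in the nil map yields a nil channel.
func lookupIsNil() bool {
	ch := channelsMap["value1"] // no panic: the zero value is returned
	return ch == nil
}

// receiveBlocks reports whether a receive on a nil channel is still
// pending after a short, arbitrary timeout.
func receiveBlocks() bool {
	var ch chan int // nil channel
	select {
	case <-ch: // a nil channel is never ready
		return false
	case <-time.After(50 * time.Millisecond):
		return true
	}
}

func main() {
	fmt.Println(lookupIsNil())   // true
	fmt.Println(receiveBlocks()) // true
}
```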
The more complex problem, still, is not the shadowing of the global variable but rather the complete absence of synchronization between the goroutines. You're using "sleeping" as a sort of band-aid in an attempt to perform implicit synchronization between goroutines,
but while this does appear to be okay judged by so-called
"common sense", it's not going to work in practice for two
reasons:
Sleeping is always a naïve approach to synchronization, as it depends solely on the assumption that all the goroutines will run relatively freely and uncontended. This is far from being true in many (if not most) production settings and hence is a constant source of subtle bugs. Don't ever do that again, please.
Nothing in the Go memory model
says that waiting against wall-clock timing is considered by the runtime as establishing the order on how execution of different goroutines relate to each other.
There exist various ways to synchronize execution between goroutines. Basically they amount to sends and receives over channels and using the types provided by the sync package.
In your particular case the simplest approach is probably using the sync.WaitGroup type.
Here is what we would
have after fixing the problems explained above:
- Initialize the map variable right at the point of its
definition and do not shadow it in main.
- Use sync.WaitGroup to make main properly wait for all
the goroutines it spawned to signal they're done:
package main
import (
"fmt"
"sync"
)
type TestStruct struct {
Test string
}
var channelsMap = make(map[string](chan *TestStruct))
func main() {
stringsSlice := []string{"value1"}
var wg sync.WaitGroup
wg.Add(len(stringsSlice))
for _, value := range stringsSlice {
channelsMap[value] = make(chan *TestStruct, 1)
go watchChannel(value, &wg)
}
testStruct := new(TestStruct)
testStruct.Test = "Hello!"
channelsMap["value1"] <- testStruct
wg.Wait()
fmt.Println("Program ended")
}
func watchChannel(channelMapKey string, wg *sync.WaitGroup) {
defer wg.Done()
fmt.Println("Watching channel: " + channelMapKey)
for channelValue := range channelsMap[channelMapKey] {
fmt.Printf("Channel '%s' used. Passed value: '%s'\n", channelMapKey, channelValue.Test)
}
}
The next two problems with your code become apparent once we
have fixed the former two, that is, after you make the "watcher" goroutines
use the same map variable as the goroutine running main, and
make the latter properly wait for the watchers:
There is a data race
over the map variable between the
code which updates the map after the for loop spawning the
watcher goroutines ended and the code which accesses this
variable in all the watcher goroutines.
There is a deadlock
between the watcher goroutines and the main goroutine which waits for them to complete.
The reason for the deadlock is that the watcher goroutines
never receive any signal they have to quit processing and
hence are stuck forever trying to read from their respective
channels.
The ways to fix these two new problems are simple but they
might actually "break" your original idea of structuring
your code.
First, I'd remove the data race by simply making the watchers
not access the map variable. As you can see, each call to
watchChannel receives a single value to use as the key to
read a value off the shared map, and hence each watcher always
reads a single value exactly once during its run time.
The code would become much clearer if we remove this extra
map access altogether and instead pass the appropriate channel
value directly to each watcher.
A nice byproduct of this is that we do not need a global
map variable anymore.
Here's what we'll get:
package main
import (
"fmt"
"sync"
)
type TestStruct struct {
Test string
}
func main() {
stringsSlice := []string{"value1"}
channelsMap := make(map[string](chan *TestStruct))
var wg sync.WaitGroup
wg.Add(len(stringsSlice))
for _, value := range stringsSlice {
channelsMap[value] = make(chan *TestStruct, 1)
go watchChannel(value, channelsMap[value], &wg)
}
testStruct := new(TestStruct)
testStruct.Test = "Hello!"
channelsMap["value1"] <- testStruct
wg.Wait()
fmt.Println("Program ended")
}
func watchChannel(channelMapKey string, ch <-chan *TestStruct, wg *sync.WaitGroup) {
defer wg.Done()
fmt.Println("Watching channel: " + channelMapKey)
for channelValue := range ch {
fmt.Printf("Channel '%s' used. Passed value: '%s'\n", channelMapKey, channelValue.Test)
}
}
Okay, we still have the deadlock.
There are multiple approaches to solving this but they depend
on the actual circumstances, and with this toy example, any
attempt to iterate over even a subset of them would just
muddy the waters.
Instead, let's employ the simplest one for this case: closing
a channel makes any pending receive operation on it immediately
unblock and produce the zero value for the channel's type.
For a channel being iterated over using the range statement,
it simply means the statement terminates once any values still
buffered in the channel have been received.
In other words, let's just close all the channels to unblock
the range statements being run by the watcher goroutines
and then wait for these goroutines to report their completion via the wait group.
To not make the answer overly long, I also added programmatic initialization of the string slice to make the example more interesting by having multiple watchers—not just a single one—actually do useful work:
package main
import (
"fmt"
"sync"
)
type TestStruct struct {
Test string
}
func main() {
var stringsSlice []string
channelsMap := make(map[string](chan *TestStruct))
for i := 1; i <= 10; i++ {
stringsSlice = append(stringsSlice, fmt.Sprintf("value%d", i))
}
var wg sync.WaitGroup
wg.Add(len(stringsSlice))
for _, value := range stringsSlice {
channelsMap[value] = make(chan *TestStruct, 1)
go watchChannel(value, channelsMap[value], &wg)
}
for _, value := range stringsSlice {
testStruct := new(TestStruct)
testStruct.Test = fmt.Sprint("Hello! ", value)
channelsMap[value] <- testStruct
}
for _, ch := range channelsMap {
close(ch)
}
wg.Wait()
fmt.Println("Program ended")
}
func watchChannel(channelMapKey string, ch <-chan *TestStruct, wg *sync.WaitGroup) {
defer wg.Done()
fmt.Println("Watching channel: " + channelMapKey)
for channelValue := range ch {
fmt.Printf("Channel '%s' used. Passed value: '%s'\n", channelMapKey, channelValue.Test)
}
}
Playground link.
As you can see, there are things you should learn
about in much greater detail before embarking on working with
concurrency.
I'd recommend proceeding in the following order:
The Go tour will familiarise you with the bare bones of concurrency.
The Go Programming Language has two chapters dedicated to giving readers a gentle introduction to tackling concurrency, both using channels and using the types from the sync package.
Concurrency In Go goes on to present more hard-core details of how one deals with concurrency in Go, including advanced topics that approach the real-world problems concurrent programs face in production, such as ways to rate-limit incoming requests.
The shadowing in main of channelsMap mentioned above was a critical bug, but aside from that, the program was playing "Russian roulette" with the calls to time.After so that main wouldn't finish before the watcher goroutines did. This is unstable and unreliable, so I recommend the following approach using a channel to signal when all watcher goroutines are done:
package main
import (
"fmt"
)
type TestStruct struct {
Test string
}
var channelsMap map[string](chan *TestStruct)
func main() {
stringsSlice := []string{"value1", "value2", "value3"}
structsSlice := []TestStruct{
{"Hello1"},
{"Hello2"},
{"Hello3"},
}
channelsMap = make(map[string](chan *TestStruct))
// Signal channel to wait for watcher goroutines.
done := make(chan struct{})
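for _, s := range stringsSlice {
channelsMap[s] = make(chan *TestStruct)
}
// Start the watchers only after the map is fully populated, so that
// no watcher reads the map while main is still writing to it.
for _, s := range stringsSlice {
// Give watcher goroutines the signal channel.
go watchChannel(s, done)
}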
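for i := range structsSlice {
for _, s := range stringsSlice {
// Take the address of the slice element rather than of the loop
// variable, which is reused across iterations before Go 1.22.
channelsMap[s] <- &structsSlice[i]
}
}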
// Close the channels so watcher goroutines can finish.
for _, s := range stringsSlice {
close(channelsMap[s])
}
// Wait for all watcher goroutines to finish.
for range stringsSlice {
<-done
}
// Now we're really done!
fmt.Println("Program ended")
}
func watchChannel(channelMapKey string, done chan<- struct{}) {
fmt.Println("Watching channel: " + channelMapKey)
for channelValue := range channelsMap[channelMapKey] {
fmt.Printf("Channel '%s' used. Passed value: '%s'\n", channelMapKey, channelValue.Test)
}
done <- struct{}{}
}
(Go Playground link: https://play.golang.org/p/eP57Ru44-NW)
Of importance is the use of the done channel to let watcher goroutines signal to main that they're finished. Another critical part is the closing of the channels once you're done with them. If you don't close them, the range loops in the watcher goroutines will never end, waiting forever. Once you close the channel, the range loop exits and the watcher goroutine can send on the done channel, signaling that it has finished working.
Finally, back in main, you have to receive on the done channel once for each watcher goroutine you created. Since the number of watcher goroutines is equal to the number of items in stringsSlice, you simply range over stringsSlice to receive the correct number of times from the done channel. Once that's finished, the main function can exit with a guarantee that all watchers have finished.
I have a hard time understanding concurrency/parallelism. In my code I made a loop of 5 iterations. Inside the loop I call wg.Add(1), so in total I have 5 Adds. Here's the code:
package main
import (
"fmt"
"sync"
)
func main() {
var list []int
wg := sync.WaitGroup{}
for i := 0; i < 5; i++ {
wg.Add(1)
go func(c *[]int, i int) {
*c = append(*c, i)
wg.Done()
}(&list, i)
}
wg.Wait()
fmt.Println(len(list))
}
The main func waits until all the goroutines finish, but when I print the length of the slice I get random results (e.g. 1, 3, etc.). Is there something missing for it to get the expected result, i.e. 5?
Is there something missing for it to get the expected result, i.e. 5?
Yes, proper synchronization. If multiple goroutines access the same variable where at least one of them is a write, you need explicit synchronization.
Your example can be "secured" with a single mutex:
var list []int
wg := sync.WaitGroup{}
mu := &sync.Mutex{} // A mutex
for i := 0; i < 5; i++ {
wg.Add(1)
go func(c *[]int, i int) {
mu.Lock() // Must lock before accessing shared resource
*c = append(*c, i)
mu.Unlock() // Unlock when we're done with it
wg.Done()
}(&list, i)
}
wg.Wait()
fmt.Println(len(list))
This will always print 5.
Note: the same slice is read at the end to print its length, yet we are not using the mutex there. This is because the use of the waitgroup ensures that we can only get to that point after all goroutines that modify it have completed their job, so a data race cannot occur there. But in general both reads and writes have to be synchronized.
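If you would rather not share the slice at all, one alternative (a sketch, not the only way) is to let each goroutine send its value on a channel and have a single goroutine do all the appending. The function name collect is an assumption of this example. Since only the receiving side ever touches the slice, neither a mutex nor a WaitGroup is needed.

```go
package main

import (
	"fmt"
	"sort"
)

// collect (hypothetical helper) starts n goroutines that each send their
// index on a channel, and gathers the values in the calling goroutine.
func collect(n int) []int {
	ch := make(chan int)
	for i := 0; i < n; i++ {
		go func(i int) { ch <- i }(i)
	}

	// Only this goroutine appends, so the slice is never shared.
	list := make([]int, 0, n)
	for i := 0; i < n; i++ {
		list = append(list, <-ch) // exactly one receive per sender
	}
	return list
}

func main() {
	list := collect(5)
	sort.Ints(list) // arrival order is not deterministic
	fmt.Println(len(list), list) // 5 [0 1 2 3 4]
}
```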
See possible duplicates:
go routine not collecting all objects from channel
Server instances with multiple users
Why does this code cause data race?
How safe are Golang maps for concurrent Read/Write operations?
golang struct concurrent read and write without Lock is also running ok?
See related questions:
Can I concurrently write different slice elements
If I am using channels properly should I need to use mutexes?
Is it safe to read a function pointer concurrently without a lock?
Concurrent access to maps with 'range' in Go
I have two goroutines go doProcess_A() and go doProcess_B(). Both can call saveData(), a non goroutine method.
should I use go saveData() instead of saveData() ?
Which one is safe?
package main
import (
"io/ioutil"
"strconv"
"sync"
)
var waitGroup sync.WaitGroup
func main() {
for i:=0; i<4; i++{
waitGroup.Add(2)
go doProcess_A(i)
go doProcess_B(i)
}
waitGroup.Wait()
}
func doProcess_A(i int) {
// do process
// the result will be stored in data variable
data := "processed data-A as string"
uniqueFileName := "file_A_"+strconv.Itoa(i)+".txt"
saveData(uniqueFileName, data)
waitGroup.Done()
}
func doProcess_B(i int) {
// do some process
// the result will be stored in data variable
data := "processed data-B as string"
uniqueFileName := "file_B_"+strconv.Itoa(i)+".txt"
saveData(uniqueFileName, data)
waitGroup.Done()
}
// write text file
func saveData(fileName ,dataStr string) {
// file name will be unique.
// there is no chance to be same file name
err := ioutil.WriteFile("out/"+fileName, []byte(dataStr), 0644)
if err != nil {
panic(err)
}
}
Here, does one goroutine wait for the disk file operation while the other goroutine is performing one?
Or do the two goroutines each make their own copy of saveData()?
Goroutines typically don't wait for anything unless you explicitly tell them to, or unless they are blocked on a channel or another blocking operation. In your code there would be a race condition with unwanted results if multiple goroutines called saveData() with the same filename. It appears that the two goroutines write to different files; as long as the filenames are unique, calling saveData() from a goroutine is safe. It doesn't make sense to start another goroutine just to call saveData(); don't complicate your life unnecessarily, just call it directly in the doProcess_X functions.
Read more about goroutines, and make sure you use them only where they are really needed: https://gobyexample.com/goroutines
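As a rough sketch of that advice, the version below calls saveData directly from each worker goroutine and reports errors instead of panicking. Several things here are my own assumptions rather than part of your code: the extra dir parameter, the runAll and countWritten helpers, the use of a temporary directory in place of out/, and os.WriteFile (available since Go 1.16) in place of ioutil.WriteFile.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"sync"
)

// saveData is an ordinary function; each goroutine calling it gets its own
// arguments and local variables, so nothing is shared between calls.
func saveData(dir, fileName, dataStr string) error {
	return os.WriteFile(filepath.Join(dir, fileName), []byte(dataStr), 0o644)
}

// runAll starts two goroutines (A and B) per index, each writing a
// uniquely named file, and returns the first error encountered, if any.
func runAll(dir string, n int) error {
	var wg sync.WaitGroup
	errs := make(chan error, 2*n) // one slot per goroutine, sends never block
	for i := 0; i < n; i++ {
		for _, kind := range []string{"A", "B"} {
			wg.Add(1)
			go func(kind string, i int) {
				defer wg.Done()
				name := fmt.Sprintf("file_%s_%d.txt", kind, i)
				errs <- saveData(dir, name, "processed data-"+kind) // direct call
			}(kind, i)
		}
	}
	wg.Wait()
	close(errs)
	for err := range errs {
		if err != nil {
			return err
		}
	}
	return nil
}

// countWritten (hypothetical helper) runs everything in a temporary
// directory and reports how many files were created.
func countWritten(n int) (int, error) {
	dir, err := os.MkdirTemp("", "out")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)
	if err := runAll(dir, n); err != nil {
		return 0, err
	}
	entries, err := os.ReadDir(dir)
	if err != nil {
		return 0, err
	}
	return len(entries), nil
}

func main() {
	n, err := countWritten(4)
	if err != nil {
		panic(err)
	}
	fmt.Println(n, "files written") // 8 files written
}
```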
Note: Just because you are writing a Go application doesn't mean you
should litter it with goroutines. Read and understand what problem it
solves so as to know the best time to use it.