Suppose I have a slice like:
stu = [{"id":"001","name":"A"} {"id":"002", "name":"B"}] and maybe more elements like this. inside of the slice is a long string, I want to use json.unmarshal to parse it.
type Student struct {
Id string `json:"id"`
Name string `json:"name"`
}
studentList := make([]Student,len(stu))
for i, st := range stu {
go func(st string){
studentList[i], err = getList(st)
if err != nil {
return ... //just example
}
}(st)
}
//and a function like this
func getList(stu string)(res Student, error){
var student Student
err := json.Unmarshal(([]byte)(stu), &student)
if err != nil {
return
}
return &student,nil
}
I get a zero-value result, so I suspect the goroutines execute out of order, and I don't know whether it is safe to use studentList[i] to store the value.
Here are a few potential issues with your code:
Value of i is probably not what you expect
for i, st := range stu {
go func(st string){
studentList[i], err = getList(st)
if err != nil {
return ... //just example
}
}(st)
}
You kick off a number of goroutines and reference i within them. The issue is that i has probably changed between the time you started a goroutine and the time that goroutine reads it (the for loop runs concurrently with the goroutines it starts). It is quite possible that the for loop completes before any of the goroutines do, meaning every goroutine sees the final value of i and all output is stored in the last element of studentList (the writes overwrite each other, so you end up with a single value).
A simple solution is to pass i into the goroutine function, e.g. go func(st string, i int){ ... }(st, i); this creates a copy.
Output of studentList
You don't say in the question, but I suspect you are running fmt.Println(studentList[1]) (or similar) immediately after the for loop completes. As mentioned above, it's quite possible that none of the goroutines have completed at that point (or they may have; you don't know). Using a WaitGroup is a fairly easy way around this:
var wg sync.WaitGroup
wg.Add(len(stu))
for i, st := range stu {
go func(st string, i int) {
var err error
studentList[i], err = getList(st)
if err != nil {
panic(err)
}
wg.Done()
}(st, i)
}
wg.Wait()
I have corrected these issues in the playground.
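For reference, here is a minimal, self-contained sketch of the corrected approach; the sample JSON strings and the simplified getList are stand-ins for the original data, not the asker's exact code:
package main

import (
    "encoding/json"
    "fmt"
    "sync"
)

type Student struct {
    Id   string `json:"id"`
    Name string `json:"name"`
}

func getList(stu string) (Student, error) {
    var student Student
    err := json.Unmarshal([]byte(stu), &student)
    return student, err
}

func main() {
    stu := []string{`{"id":"001","name":"A"}`, `{"id":"002","name":"B"}`}
    studentList := make([]Student, len(stu))

    var wg sync.WaitGroup
    wg.Add(len(stu))
    for i, st := range stu {
        go func(i int, st string) {
            defer wg.Done()
            var err error // local err avoids a data race on a shared variable
            studentList[i], err = getList(st)
            if err != nil {
                fmt.Println("unmarshal failed:", err)
            }
        }(i, st)
    }
    wg.Wait()

    fmt.Println(studentList) // e.g. [{001 A} {002 B}]
}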
It is not because "the goroutine is out-of-order to execute".
There are at least two issues here:
you should not use the for loop variable i inside the goroutine.
Multiple goroutines read i while the for loop modifies it, so it's a race condition. To make i work as expected, change the code to:
for i, st := range stu {
go func(i int, st string){
studentList[i], err = getList(st)
if err != nil {
return ... //just example
}
}(i, st)
}
What's more, use sync.WaitGroup to wait for all the goroutines:
var wg sync.WaitGroup
for i, st := range stu {
wg.Add(1)
go func(i int, st string){
defer wg.Done()
studentList[i], err = getList(st)
if err != nil {
return ... //just example
}
}(i, st)
}
wg.Wait()
P.S. (warning: maybe not always true):
the line studentList[i], err = getList(st) may not cause a data race on the slice itself (each goroutine writes a different element), but it is not friendly to the CPU cache line (false sharing), so it is better to avoid writing code like this. Note, however, that assigning to the shared err variable from several goroutines is a data race; declare err inside the goroutine, as in the first answer.
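If you want to avoid having several goroutines write into adjacent slice elements at all, one common alternative is to send each result over a channel and let one goroutine do all the writes. A rough sketch, assuming the Student type and a corrected getList(string) (Student, error) as above:
type result struct {
    index   int
    student Student
    err     error
}

func parseAll(stu []string) ([]Student, error) {
    results := make(chan result, len(stu)) // buffered, so no goroutine blocks if we bail out early
    for i, st := range stu {
        go func(i int, st string) {
            s, err := getList(st)
            results <- result{index: i, student: s, err: err}
        }(i, st)
    }

    studentList := make([]Student, len(stu))
    for range stu {
        r := <-results
        if r.err != nil {
            return nil, r.err // remaining sends still complete thanks to the buffer
        }
        studentList[r.index] = r.student
    }
    return studentList, nil
}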
I'm writing a program that reads a list of order numbers in a file called orders.csv and compares it with the other csv files that are present in the folder.
The problem is that it goes into a deadlock even though I'm using a WaitGroup, and I don't know why.
package main
import (
"bufio"
"fmt"
"log"
"os"
"path/filepath"
"strings"
"sync"
)
type Files struct {
filenames []string
}
type Orders struct {
ID []string
}
var ordersFilename string = "orders.csv"
func main() {
var (
ordersFile *os.File
files Files
orders Orders
err error
)
mu := new(sync.Mutex)
wg := &sync.WaitGroup{}
wg.Add(1)
if ordersFile, err = os.Open(ordersFilename); err != nil {
log.Fatalln("Could not open file: " + ordersFilename)
}
orders = getOrderIDs(ordersFile)
files.filenames = getCSVsFromCurrentDir()
var filenamesSize = len(files.filenames)
var ch = make(chan map[string][]string, filenamesSize)
var done = make(chan bool)
for i, filename := range files.filenames {
go func(currentFilename string, ch chan<- map[string][]string, i int, orders Orders, wg *sync.WaitGroup, filenamesSize *int, mu *sync.Mutex, done chan<- bool) {
wg.Add(1)
defer wg.Done()
checkFile(currentFilename, orders, ch)
mu.Lock()
*filenamesSize--
mu.Unlock()
if i == *filenamesSize {
done <- true
close(done)
}
}(filename, ch, i, orders, wg, &filenamesSize, mu, done)
}
select {
case str := <-ch:
fmt.Printf("%+v\n", str)
case <-done:
wg.Done()
break
}
wg.Wait()
close(ch)
}
// getCSVsFromCurrentDir returns a string slice
// with the filenames of csv files inside the
// current directory that are not "orders.csv"
func getCSVsFromCurrentDir() []string {
var filenames []string
err := filepath.Walk(".", func(path string, info os.FileInfo, err error) error {
if path != "." && strings.HasSuffix(path, ".csv") && path != ordersFilename {
filenames = append(filenames, path)
}
return nil
})
if err != nil {
log.Fatalln("Could not read file names in current dir")
}
return filenames
}
// getOrderIDs returns an Orders struct filled
// with order IDs retrieved from the file
func getOrderIDs(file *os.File) Orders {
var (
orders Orders
err error
fileContent string
)
reader := bufio.NewReader(file)
if fileContent, err = readLine(reader); err != nil {
log.Fatalln("Could not read file: " + ordersFilename)
}
for err == nil {
orders.ID = append(orders.ID, fileContent)
fileContent, err = readLine(reader)
}
return orders
}
func checkFile(filename string, orders Orders, ch chan<- map[string][]string) {
var (
err error
file *os.File
fileContent string
orderFilesMap map[string][]string
counter int
)
orderFilesMap = make(map[string][]string)
if file, err = os.Open(filename); err != nil {
log.Fatalln("Could not read file: " + filename)
}
reader := bufio.NewReader(file)
if fileContent, err = readLine(reader); err != nil {
log.Fatalln("Could not read file: " + filename)
}
for err == nil {
if containedInSlice(fileContent, orders.ID) && !containedInSlice(fileContent, orderFilesMap[filename]) {
orderFilesMap[filename] = append(orderFilesMap[filename], fileContent)
// fmt.Println("Found: ", fileContent, " in ", filename)
} else {
// fmt.Printf("Could not find: '%s' in '%s'\n", fileContent, filename)
}
counter++
fileContent, err = readLine(reader)
}
ch <- orderFilesMap
}
// containedInSlice returns true or false
// based on whether the string is contained
// in the slice
func containedInSlice(str string, slice []string) bool {
for _, ID := range slice {
if ID == str {
return true
}
}
return false
}
// readLine returns a line from the passed reader
func readLine(r *bufio.Reader) (string, error) {
var (
isPrefix bool = true
err error = nil
line, ln []byte
)
for isPrefix && err == nil {
line, isPrefix, err = r.ReadLine()
ln = append(ln, line...)
}
return string(ln), err
}
The first issue is that wg.Add must always be called outside of the goroutine(s) it stands for. If it isn't, the
wg.Wait call might run before the goroutine(s) have actually started (and called wg.Add) and will therefore "think"
that there is nothing to wait for.
The second issue with the code is that there are multiple ways it waits for the routines to be done. There is
the WaitGroup and there is the done channel. Use only one of them. Which one depends also on how the results of the
goroutines are used. Here we come to the next problem.
The third issue is with gathering the results. Currently the code only prints / uses a single result from the goroutines.
Put a for { ... } loop around the select and use return to break out of the loop if the done channel is closed.
(Note that you don't need to send anything on the done channel, closing it is enough.)
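As a small illustration of that last point, closing the channel by itself is enough to unblock a receiver; nothing ever needs to be sent on it:
package main

import "fmt"

func main() {
    done := make(chan struct{})
    go func() {
        fmt.Println("working...")
        close(done) // closing is enough; no value needs to be sent
    }()
    <-done // the receive unblocks as soon as done is closed
    fmt.Println("done channel closed, main can continue")
}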
Improved Version 0.0.1
So here the first version (including some other "code cleanup") with a done channel used for closing and the WaitGroup removed:
func main() {
ordersFile, err := os.Open(ordersFilename)
if err != nil {
log.Fatalln("Could not open file: " + ordersFilename)
}
orders := getOrderIDs(ordersFile)
files := Files{
filenames: getCSVsFromCurrentDir(),
}
var (
mu = new(sync.Mutex)
filenamesSize = len(files.filenames)
ch = make(chan map[string][]string, filenamesSize)
done = make(chan bool)
)
for i, filename := range files.filenames {
go func(currentFilename string, ch chan<- map[string][]string, i int, orders Orders, filenamesSize *int, mu *sync.Mutex, done chan<- bool) {
checkFile(currentFilename, orders, ch)
mu.Lock()
*filenamesSize--
mu.Unlock()
// TODO: This also accesses filenamesSize, so it also needs to be protected with the mutex:
if i == *filenamesSize {
done <- true
close(done)
}
}(filename, ch, i, orders, &filenamesSize, mu, done)
}
// Note: closing a channel is not really needed, so you can omit this:
defer close(ch)
for {
select {
case str := <-ch:
fmt.Printf("%+v\n", str)
case <-done:
return
}
}
}
Improved Version 0.0.2
In your case we have some advantage however. We know exactly how many goroutines we started and therefore also how
many results we expect. (Of course if each goroutine returns a result which currently this code does.) That gives
us another option as we can collect the results with another for loop having the same amount of iterations:
func main() {
ordersFile, err := os.Open(ordersFilename)
if err != nil {
log.Fatalln("Could not open file: " + ordersFilename)
}
orders := getOrderIDs(ordersFile)
files := Files{
filenames: getCSVsFromCurrentDir(),
}
var (
// Note: a buffered channel helps speed things up. The size does not need to match the size of the items that will
// be passed through the channel. A fixed, small size is perfect here.
ch = make(chan map[string][]string, 5)
)
for _, filename := range files.filenames {
go func(filename string) {
// orders and channel are not variables of the loop and can be used without copying
checkFile(filename, orders, ch)
}(filename)
}
for range files.filenames {
str := <-ch
fmt.Printf("%+v\n", str)
}
}
A lot simpler, isn't it? Hope that helps!
There is a lot wrong with this code.
You're using the WaitGroup wrong. Add has to be called in the main goroutine, else there is a chance that Wait is called before all Add calls complete.
There's an extraneous Add(1) call right after initializing the WaitGroup that isn't matched by a Done() call, so Wait will never return (assuming the point above is fixed).
You're using both a WaitGroup and a done channel to signal completion. This is redundant at best.
You're reading filenamesSize while not holding the lock (in the if i == *filenamesSize statement). This is a race condition.
The i == *filenamesSize condition makes no sense in the first place. Goroutines execute in an arbitrary order, so you can't be sure that the goroutine with i == 0 is the last one to decrement filenamesSize.
This can all be simplified by getting rid of most of the synchronization primitives and simply closing the ch channel when all goroutines are done:
func main() {
ch := make(chan map[string][]string)
var wg sync.WaitGroup
for _, filename := range getCSVsFromCurrentDir() {
filename := filename // capture loop var
wg.Add(1)
go func() {
checkFile(filename, orders, ch)
wg.Done()
}()
}
go func() {
wg.Wait() // after all goroutines are done...
close(ch) // let range loop below exit
}()
for str := range ch {
// ...
}
}
Not an answer, but some comments that do not fit the comment box.
In this part of the code
func main() {
var (
ordersFile *os.File
files Files
orders Orders
err error
)
mu := new(sync.Mutex)
wg := &sync.WaitGroup{}
wg.Add(1)
The last statement is a call to wg.Add that appears dangling. By that I mean it is hard to see what will trigger the required wg.Done counterpart. It is a mistake to call wg.Add without a matching wg.Done; it is error-prone not to write them in such a way that they can immediately be found in pairs.
This part of the code is clearly wrong:
go func(currentFilename string, ch chan<- map[string][]string, i int, orders Orders, wg *sync.WaitGroup, filenamesSize *int, mu *sync.Mutex, done chan<- bool) {
wg.Add(1)
defer wg.Done()
Consider that by the time the goroutine executes and adds 1 to the WaitGroup, the parent routine has already continued executing. See this example: https://play.golang.org/p/N9Chaqkv4bd
The main routine does not wait for the WaitGroup because the counter has not been incremented in time.
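For readers who don't follow the playground link, here is a minimal illustration of that problem (the sleep is only there to make the effect visible):
package main

import (
    "fmt"
    "sync"
    "time"
)

func main() {
    var wg sync.WaitGroup
    go func() {
        wg.Add(1) // too late: main has most likely reached Wait already
        defer wg.Done()
        time.Sleep(100 * time.Millisecond)
        fmt.Println("goroutine finished")
    }()
    wg.Wait() // the counter is still 0, so this returns immediately
    fmt.Println("main exits before the goroutine has run")
}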
There is more to say, but I find it hard to understand the purpose of your code, so I am not sure how to help you further without basically rewriting it.
I'm using Go routines to send queries to PostgreSQL master and slave nodes in parallel. The first host that returns a valid result wins. Error cases are outside the scope of this question.
The caller is the only one that cares about the contents of a *sql.Rows object, so intentionally my function doesn't do any operations on those. I use buffered channels to retrieve return objects from the Go routines, so there should be no Go routine leak. Garbage collection should take care of the rest.
There is a problem I haven't thought about properly: the Rows objects that remain behind in the channel are never closed. When I call this function from a (read only) transaction, tx.Rollback() returns an error for every instance of a non-closed Rows object: "unexpected command tag SELECT".
This function is called from higher level objects:
func multiQuery(ctx context.Context, xs []executor, query string, args ...interface{}) (*sql.Rows, error) {
rc := make(chan *sql.Rows, len(xs))
ec := make(chan error, len(xs))
for _, x := range xs {
go func(x executor) {
rows, err := x.QueryContext(ctx, query, args...)
switch { // Make sure only one of them is returned
case err != nil:
ec <- err
case rows != nil:
rc <- rows
}
}(x)
}
var me MultiError
for i := 0; i < len(xs); i++ {
select {
case err := <-ec:
me.append(err)
case rows := <-rc: // Return on the first success
return rows, nil
}
}
return nil, me.check()
}
Executors can be *sql.DB, *sql.Tx or anything that complies with the interface:
type executor interface {
ExecContext(ctx context.Context, query string, args ...interface{}) (sql.Result, error)
QueryContext(ctx context.Context, query string, args ...interface{}) (*sql.Rows, error)
QueryRowContext(ctx context.Context, query string, args ...interface{}) *sql.Row
}
Rollback logic:
func (mtx MultiTx) Rollback() error {
ec := make(chan error, len(mtx))
for _, tx := range mtx {
go func(tx *Tx) {
err := tx.Rollback()
ec <- err
}(tx)
}
var me MultiError
for i := 0; i < len(mtx); i++ {
if err := <-ec; err != nil {
me.append(err)
}
}
return me.check()
}
MultiTx is a collection of open transactions on multiple nodes. It is a higher-level object that calls multiQuery.
What would be the best approach to "clean up" unused rows? Options I've considered, and why I'm hesitant to use each:
Cancel the context: I believe it will work inconsistently, multiple queries might already have returned by the time cancel() is called
Create a deferred Go routine which continues to drain the channels and close the rows objects: If a DB node is slow to respond, Rollback() is still called before rows.Close()
Use a sync.WaitGroup somewhere in the MultiTx type, maybe in combination with (2): This can cause Rollback to hang if one of the nodes is unresponsive. Also, I wouldn't be sure how I would implement that.
Ignore the Rollback errors: Ignoring errors never sounds like a good idea, they are there for a reason.
What would be the recommended way of approaching this?
Edit:
As suggested by @Peter, I've tried cancelling the context, but it seems this also invalidates all the Rows returned by the query. On rows.Scan I'm getting a "context canceled" error at the higher-level caller.
This is what I've done so far:
func multiQuery(ctx context.Context, xs []executor, query string, args ...interface{}) (*sql.Rows, error) {
ctx, cancel := context.WithCancel(ctx)
defer cancel()
rc := make(chan *sql.Rows, len(xs))
ec := make(chan error, len(xs))
for _, x := range xs {
go func(x executor) {
rows, err := x.QueryContext(ctx, query, args...)
switch { // Make sure only one of them is returned
case err != nil:
ec <- err
case rows != nil:
rc <- rows
cancel() // Cancel on success
}
}(x)
}
var (
me MultiError
rows *sql.Rows
)
for i := 0; i < len(xs); i++ {
select {
case err := <-ec:
me.append(err)
case r := <-rc:
if rows == nil { // Only use the first rows
rows = r
} else {
r.Close() // Cleanup remaining rows, if there are any
}
}
}
if rows != nil {
return rows, nil
}
return nil, me.check()
}
Edit 2:
@Adrian mentioned:
we can't see the code that's actually using any of this.
This code is reused by type methods. First there is the transaction type. The issues in this question are appearing on the Rollback() method above.
// MultiTx holds a slice of open transactions to multiple nodes.
// All methods on this type run their sql.Tx variant in one Go routine per Node.
type MultiTx []*Tx
// QueryContext runs sql.Tx.QueryContext on the transactions in separate Go routines.
// The first non-error result is returned immediately
// and errors from the other Nodes will be ignored.
//
// If all nodes respond with the same error, that exact error is returned as-is.
// If there is a variety of errors, they will be embedded in a MultiError return.
//
// Implements boil.ContextExecutor.
func (mtx MultiTx) QueryContext(ctx context.Context, query string, args ...interface{}) (*sql.Rows, error) {
return multiQuery(ctx, mtx2Exec(mtx), query, args...)
}
Then there is:
// MultiNode holds a slice of Nodes.
// All methods on this type run their sql.DB variant in one Go routine per Node.
type MultiNode []*Node
// QueryContext runs sql.DB.QueryContext on the Nodes in separate Go routines.
// The first non-error result is returned immediately
// and errors from the other Nodes will be ignored.
//
// If all nodes respond with the same error, that exact error is returned as-is.
// If there is a variety of errors, they will be embedded in a MultiError return.
//
// Implements boil.ContextExecutor.
func (mn MultiNode) QueryContext(ctx context.Context, query string, args ...interface{}) (*sql.Rows, error) {
return multiQuery(ctx, nodes2Exec(mn), query, args...)
}
These methods are the public wrappers around the multiQuery() function. Now I realize that just sending the *Rows into a buffered channel to die is actually a memory leak. In the transaction cases it becomes clear, as Rollback() starts to complain. But in the non-transaction variant, the *Rows inside the channel will never be garbage collected, as the driver might hold a reference to it until rows.Close() is called.
I've written this package to be used by an ORM, sqlboiler. My higher-level logic passes a MultiTx object to the ORM. From that point, I don't have any explicit control over the returned Rows. A simplistic approach would be for my higher-level code to cancel the context before Rollback(), but I don't like that:
It gives a non-intuitive API. This (idiomatic) approach would break:
ctx, cancel = context.WithCancel(context.Background())
defer cancel()
tx, _ := db.BeginTx(ctx)
defer tx.Rollback()
The ORM's interfaces also specify the regular, non-context aware Query() variants, which in my package's case will run against context.Background().
I'm starting to worry that this is broken by design... Anyway, I will start by implementing a Go routine that drains the channel and closes the *Rows. After that I will see if I can implement some reasonable waiting / cancellation mechanism that won't affect the returned *Rows.
I think that the function below will do what you require, with the one proviso being that the context passed in should be cancelled when you are done with the results (otherwise one context.WithCancel will leak; I cannot see a way around that, as cancelling it within the function would invalidate the returned sql.Rows).
Note that I have not had time to test this (it would need a database set up, your interfaces implemented, etc.), so there may well be a bug hidden in the code, but I believe the basic algorithm is sound.
// queryResult holds the goroutine number and the result from that goroutine (we need both so we can avoid cancelling the relevant context)
type queryResult struct {
no int
rows *sql.Rows
}
// multiQuery - Executes multiple queries and returns either the first to return a result or, if all fail, a MultiError summarising the errors.
// Important: This should be used for READ ONLY queries only (it is possible that more than one will complete)
// Note: The ctx passed in must be cancelled to avoid leaking a context (this routine cannot cancel the context used for the winning query)
func multiQuery(ctx context.Context, xs []executor, query string, args ...interface{}) (*sql.Rows, error) {
noOfQueries := len(xs)
rc := make(chan queryResult) // Channel for results; unbuffered because we only want one, and only one, result
ec := make(chan error) // errors get sent here - goroutines must send a result or 1 error
defer close(ec) // Ensure the error consolidation go routine will complete
// We need a way to cancel individual goroutines as we do not know which one will succeed
cancelFns := make([]context.CancelFunc, noOfQueries)
// All goroutines must terminate before we exit (otherwise the transaction may be rolled back before they are cancelled, leading to "unexpected command tag SELECT")
var wg sync.WaitGroup
wg.Add(noOfQueries)
for i, x := range xs {
var queryCtx context.Context
queryCtx, cancelFns[i] = context.WithCancel(ctx)
go func(ctx context.Context, queryNo int, x executor) {
defer wg.Done()
rows, err := x.QueryContext(ctx, query, args...)
if err != nil {
ec <- err // Error collection go routine guaranteed to run until all query goroutines complete
return
}
select {
case rc <- queryResult{queryNo, rows}:
return
case <-ctx.Done(): // If another query has already transmitted its results these should be thrown away
rows.Close() // not strictly required because closed context should tidy up
return
}
}(queryCtx, i, x)
}
// Start go routine that will send a MultiError to a channel if all queries fail
mec := make(chan MultiError)
go func() {
var me MultiError
errCount := 0
for err := range ec {
me.append(err)
errCount += 1
if errCount == noOfQueries {
mec <- me
return
}
}
}()
// Wait for one query to succeed or all queries to fail
select {
case me := <-mec:
for _, cancelFn := range cancelFns { // not strictly required so long as ctx is eventually cancelled
cancelFn()
}
wg.Wait()
return nil, me.check()
case result := <-rc:
for i, cancelFn := range cancelFns { // not strictly required so long as ctx is eventually cancelled
if i != result.no { // do not cancel the query that returned a result
cancelFn()
}
}
wg.Wait()
return result.rows, nil
}
}
Thanks to the comments from @Peter and the answer from @Brits, I got fresh ideas on how to approach this.
Blueprint
Three of the four proposals from the question needed to be implemented.
1. Cancel the Context
mtx.QueryContext() creates a descendant context and sets the CancelFunc in the MultiTx object.
The cancelWait() helper cancels an old context and waits on MultiTx.done if it's not nil. It is called in Rollback() and before every new query.
2. Drain the channel
In multiQuery(), upon obtaining the first successful Rows, a Go routine is launched to drain and close the remaining Rows. The rows channel no longer needs to be buffered.
An additional Go routine and a WaitGroup are used to close the error and rows channels.
3. Return a done channel
Instead of the proposed WaitGroup, multiQuery() returns a done channel. The channel is closed once the drain & close routine has finished. mtx.QueryContext() sets the done channel on the MultiTx object.
Errors
Instead of the select block, the error channel is only drained if there are no Rows. The error channel needs to remain buffered for this reason.
Code
// MultiTx holds a slice of open transactions to multiple nodes.
// All methods on this type run their sql.Tx variant in one Go routine per Node.
type MultiTx struct {
tx []*Tx
done chan struct{}
cancel context.CancelFunc
}
func (m *MultiTx) cancelWait() {
if m.cancel != nil {
m.cancel()
}
if m.done != nil {
<-m.done
}
// reset
m.done, m.cancel = nil, nil
}
// Context creates a child context and appends CancelFunc in MultiTx
func (m *MultiTx) context(ctx context.Context) context.Context {
m.cancelWait()
ctx, m.cancel = context.WithCancel(ctx)
return ctx
}
// QueryContext runs sql.Tx.QueryContext on the transactions in separate Go routines.
func (m *MultiTx) QueryContext(ctx context.Context, query string, args ...interface{}) (rows *sql.Rows, err error) {
rows, m.done, err = multiQuery(m.context(ctx), mtx2Exec(m.tx), query, args...)
return rows, err
}
func (m *MultiTx) Rollback() error {
m.cancelWait()
ec := make(chan error, len(m.tx))
for _, tx := range m.tx {
go func(tx *Tx) {
err := tx.Rollback()
ec <- err
}(tx)
}
var me MultiError
for i := 0; i < len(m.tx); i++ {
if err := <-ec; err != nil {
me.append(err)
}
}
return me.check()
}
func multiQuery(ctx context.Context, xs []executor, query string, args ...interface{}) (*sql.Rows, chan struct{}, error) {
rc := make(chan *sql.Rows)
ec := make(chan error, len(xs))
var wg sync.WaitGroup
wg.Add(len(xs))
for _, x := range xs {
go func(x executor) {
rows, err := x.QueryContext(ctx, query, args...)
switch { // Make sure only one of them is returned
case err != nil:
ec <- err
case rows != nil:
rc <- rows
}
wg.Done()
}(x)
}
// Close channels when all query routines completed
go func() {
wg.Wait()
close(ec)
close(rc)
}()
rows, ok := <-rc
if ok { // ok will be false if channel closed before any rows
done := make(chan struct{}) // Done signals the caller that all remaining rows are properly closed
go func() {
for rows := range rc { // Drain channel and close unused Rows
rows.Close()
}
close(done)
}()
return rows, done, nil
}
// no rows, build error return
var me MultiError
for err := range ec {
me.append(err)
}
return nil, nil, me.check()
}
Edit: cancel and wait for old contexts before every query; since *sql.Tx is not goroutine safe, all previous queries have to be done before the next call.
I'm using goroutines in my project and I want to assign values to the struct fields, but I don't know how to assign the values returned by the MongoDB queries to those fields. Here are my struct and the queries:
type AppLoadNew struct{
StripeTestKey string `json:"stripe_test_key" bson:"stripe_test_key,omitempty"`
Locations []Locations `json:"location" bson:"location,omitempty"`
}
type Locations struct{
Id int `json:"_id" bson:"_id"`
Location string `json:"location" bson:"location"`
}
func GoRoutine(){
values := AppLoadNew{}
go func() {
data, err := GetStripeTestKey(bson.M{"is_default": true})
if err == nil {
values.StripeTestKey := data.TestStripePublishKey
}
}()
go func() {
location, err := GetFormLocation(bson.M{"is_default": true})
if err == nil {
values.Locations := location
}
}()
fmt.Println(values) // Prints an empty (zero-value) struct here
}
Can you please help me assign all the values to the AppLoadNew struct?
In Go no value is safe for concurrent read and write (from multiple goroutines). You must synchronize access.
Reading and writing variables from multiple goroutines can be protected using sync.Mutex or sync.RWMutex, but in your case there is something else involved: you should wait for the 2 launched goroutines to complete. For that, the go-to solution is sync.WaitGroup.
And since the 2 goroutines write 2 different fields of a struct (which act as 2 distinct variables), they don't have to be synchronized to each other (see more on this here: Can I concurrently write different slice elements). Which means using a sync.WaitGroup is sufficient.
This is how you can make it safe and correct:
func GoRoutine() {
values := AppLoadNew{}
wg := &sync.WaitGroup{}
wg.Add(1)
go func() {
defer wg.Done()
data, err := GetStripeTestKey(bson.M{"is_default": true})
if err == nil {
values.StripeTestKey = data.StripeTestKey
}
}()
wg.Add(1)
go func() {
defer wg.Done()
location, err := GetFormLocation(bson.M{"is_default": true})
if err == nil {
values.Locations = location
}
}()
wg.Wait()
fmt.Println(values)
}
See a (slightly modified) working example on the Go Playground.
See related / similar questions:
Reading values from a different thread
golang struct concurrent read and write without Lock is also running ok?
How to make a variable thread-safe
You can use sync package with WaitGroup, here is an example:
package main
import (
"fmt"
"sync"
"time"
)
type Foo struct {
One string
Two string
}
func main() {
f := Foo{}
var wg sync.WaitGroup
wg.Add(1)
go func() {
defer wg.Done()
// Perform long calculations
<-time.After(time.Second * 1)
f.One = "foo"
}()
wg.Add(1)
go func() {
defer wg.Done()
// Perform long calculations
<-time.After(time.Second * 2)
f.Two = "bar"
}()
fmt.Printf("Before %+v\n", f)
wg.Wait()
fmt.Printf("After %+v\n", f)
}
The output:
Before {One: Two:}
After {One:foo Two:bar}
The following is the code that is giving me a problem. What I want to achieve is to create all of those tables in parallel. After all the tables are created I want to exit the function.
func someFunction(){
....
gos := 5
proc := make(chan bool, gos)
allDone := make(chan bool)
for i:=0; i<gos; i++ {
go func() {
for j:=i; j<len(tables); j+=gos {
r, err := db.Exec(tables[j])
fmt.Println(r)
if err != nil {
methods.CheckErr(err, err.Error())
}
}
proc <- true
}()
}
go func() {
for i:=0; i<gos; i++{
<-proc
}
allDone <- true
}()
for {
select {
case <-allDone:
return
}
}
}
I'm creating two channels: one to keep track of the number of tables created (proc) and another (allDone) to see if all are done.
When I run this code, the goroutines that create the tables start executing, but someFunction returns before they complete.
However, there is no problem if I run the code sequentially.
What is the mistake in my design, and how do I correct it?
The usual pattern for what you're trying to achieve uses WaitGroup.
I think the problem you're facing is that i is captured by each goroutine and keeps getting incremented by the outer loop. Your inner loop starts at i, and since the outer loop has likely already finished, each goroutine starts at 5 (the final value of i).
Try passing the iterator as parameter to the goroutine so that you get a new copy each time.
func someFunction(){
....
gos := 5
var wg sync.WaitGroup
wg.Add(gos)
for i:=0; i< gos; i++ {
go func(n int) {
defer wg.Done()
for j:=n; j<len(tables); j+=gos {
r, err := db.Exec(tables[j])
fmt.Println(r)
if err != nil {
methods.CheckErr(err, err.Error())
}
}
}(i)
}
wg.Wait()
}
I'm not sure exactly what you're trying to achieve here: each goroutine runs db.Exec on every gos-th table starting from its own index, so the tables are divided between the goroutines in a strided fashion. Is this what you intended?
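If the intent was simply to spread the table-creation statements evenly across gos workers, a channel-based work queue is a common alternative to index striding. A rough sketch, with db and tables assumed to exist as in the question:
import (
    "database/sql"
    "log"
    "sync"
)

// createTables spreads the statements in tables across a fixed number of workers.
func createTables(db *sql.DB, tables []string, workers int) {
    jobs := make(chan string)
    var wg sync.WaitGroup
    wg.Add(workers)
    for i := 0; i < workers; i++ {
        go func() {
            defer wg.Done()
            for stmt := range jobs { // each worker picks up the next pending statement
                if _, err := db.Exec(stmt); err != nil {
                    log.Println("create table failed:", err)
                }
            }
        }()
    }
    for _, stmt := range tables {
        jobs <- stmt
    }
    close(jobs) // lets the workers' range loops finish
    wg.Wait()
}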
I'm trying to understand the difference in Go between creating an anonymous function which takes a parameter, versus having that function act as a closure. Here is an example of the difference.
With parameter:
func main() {
done := make(chan bool, 1)
go func(c chan bool) {
time.Sleep(50 * time.Millisecond)
c <- true
}(done)
<-done
}
As closure:
func main() {
done := make(chan bool, 1)
go func() {
time.Sleep(50 * time.Millisecond)
done <- true
}()
<-done
}
My question is, when is the first form better than the second? Would you ever use a parameter for this kind of thing? The only time I can see the first form being useful is when returning a func(x, y) from another function.
The difference between using a closure vs using a function parameter has to do with sharing the same variable vs getting a copy of the value. Consider these two examples below.
In the Closure example all function calls will use the value stored in i. This value will most likely already have reached 3 before any of the goroutines has had time to print its value.
In the Parameter example each function call will get passed a copy of the value of i when the call was made, thus giving us the result we more likely wanted:
Closure:
for i := 0; i < 3; i++ {
go func() {
fmt.Println(i)
}()
}
Result:
3
3
3
Parameter:
for i := 0; i < 3; i++ {
go func(v int) {
fmt.Println(v)
}(i)
}
Result:
0
1
2
Playground: http://play.golang.org/p/T5rHrIKrQv
When to use parameters
The first form is definitely preferred if you plan to change the value of the variable and you don't want the function to observe that change.
This is the typical case when the anonymous function is inside a for loop and you intend to use the loop's variables, for example:
for i := 0; i < 10; i++ {
go func(i int) {
fmt.Println(i)
}(i)
}
Without passing the variable i you might observe printing 10 ten times. With passing i, you will observe numbers printed from 0 to 9.
When not to use parameters
If you don't want to change the value of the variable, it is cheaper not to pass it and thus not create another copy of it. This is especially true for large structs. Although if you later alter the code and modify the variable, you may easily forget to check its effect on the closure and get unexpected results.
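To make the copying cost concrete, here is a small illustrative sketch; the bigConfig type is made up purely for this example:
package main

import "fmt"

type bigConfig struct {
    payload [1 << 20]byte // pretend this struct is expensive to copy
    name    string
}

func main() {
    cfg := bigConfig{name: "example"}
    done := make(chan struct{}, 2)

    // Closure: reads cfg in place; no copy of the large payload is made.
    go func() {
        fmt.Println("closure sees:", cfg.name)
        done <- struct{}{}
    }()

    // Parameter: the whole struct is copied into c when the goroutine is started.
    go func(c bigConfig) {
        fmt.Println("parameter copy sees:", c.name)
        done <- struct{}{}
    }(cfg)

    <-done
    <-done
}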
Also there might be cases when you do want to observe changes made to "outer" variables, such as:
func GetRes(name string) (Res, error) {
res, err := somepack.OpenRes(name)
if err != nil {
return nil, err
}
closeres := true
defer func() {
if closeres {
res.Close()
}
}()
// Do other stuff
if err = otherStuff(); err != nil {
return nil, err // res will be closed
}
// Everything went well, return res, but
// res must not be closed, it will be the responsibility of the caller
closeres = false
return res, nil // res will not be closed
}
In this case the GetRes() is to open some resource. But before returning it other things have to be done which might also fail. If those fail, res must be closed and not returned. If everything goes well, res must not be closed and returned.
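A short usage sketch of the pattern above (assuming the Res and GetRes definitions from the example and a log import): once GetRes returns successfully, the caller owns the resource and is responsible for closing it.
res, err := GetRes("accounts") // "accounts" is just an illustrative name
if err != nil {
    log.Fatal(err) // on error, GetRes has already closed the resource internally
}
defer res.Close() // on success, closing is now the caller's responsibility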
This is an example of passing a parameter, taken from the net.Listen documentation:
package main
import (
"io"
"log"
"net"
)
func main() {
// Listen on TCP port 2000 on all available unicast and
// anycast IP addresses of the local system.
l, err := net.Listen("tcp", ":2000")
if err != nil {
log.Fatal(err)
}
defer l.Close()
for {
// Wait for a connection.
conn, err := l.Accept()
if err != nil {
log.Fatal(err)
}
// Handle the connection in a new goroutine.
// The loop then returns to accepting, so that
// multiple connections may be served concurrently.
go func(c net.Conn) {
// Echo all incoming data.
io.Copy(c, c)
// Shut down the connection.
c.Close()
}(conn)
}
}