Data passed by reference to "defer" - go

There is a passage in the "Mastering Concurrency in Go" book which made me think I might be missing something about "defer" functions.
You should also take note that any data passed by reference may be in an unexpected state.
func main() {
aValue := new(int)
defer fmt.Println(*aValue)
for i := 0; i < 100; i++ {
*aValue++
}
}
This prints 0, I thought, because according to spec:
Each time a "defer" statement executes, the function value and parameters to the call are evaluated as usual and saved anew
That is, *aValue is 0 when defer is called and that's why at the end it prints 0. Whether or not a pointer is passed to the differ function in this case is irrelevant.
Is my understanding correct or am I missing something?

Consider a situation using structs.
type User struct {
Name string
}
func main() {
user := User{}
defer fmt.Printf("%#v\n", user)
user.Name = "AJ"
}
You know defer should run at the end, so you might expect to see User{Name: "AJ"} but instead you get User{Name: ""} because defer binds parameters.
If you use a pointer it works.
user := &User{}
If you use a closure, it works.
defer func() {
fmt.Printf("%#v\n", user)
}()

The defer statement is "evaluating" the parameters and saving the result, and the result of evaluating *aValue is 0 at the time of the defer call. Something like this may be what you're looking for:
func main() {
aValue := new(int)
defer func() { fmt.Println(*aValue) }()
for i := 0; i < 100; i++ {
*aValue++
}
}

Related

Check named return error using defer function

Hi I want to write a generic function to trace error message when a function returns error. So I wrote this:
func TraceError1(err *error) {
if err != nil && *err != nil {
pc := make([]uintptr, 15)
n := runtime.Callers(2, pc)
frames := runtime.CallersFrames(pc[:n])
frame, _ := frames.Next()
fmt.Printf("%s:%d %s\n", frame.File, frame.Line, frame.Function)
}
}
func TraceError2(err error) {
if err != nil {
pc := make([]uintptr, 15)
n := runtime.Callers(2, pc)
frames := runtime.CallersFrames(pc[:n])
frame, _ := frames.Next()
fmt.Printf("%s:%d %s\n", frame.File, frame.Line, frame.Function)
}
}
func foo() (err error) {
defer TraceError1(&err)
defer TraceError2(err)
fmt.Println("do something")
return fmt.Errorf("haha")
}
TraceError1 works but TraceError2 didn't. In my understanding, error is an interface so it is a pointer/address, why do I need to pass its address? Why TraceError2 cannot work? Thanks.
In case of TraceError1 you are passing a pointer to the named return value err. The pointer is non-nil, but the value it points at (err) is nil (at the time of defer). However, it is not yet evaluated (dereferenced) because TraceError1 has not yet been called. By the time the function does run (after foo returns) and the pointer gets dereferenced, the value of err has been updated (by the return statement inside foo).
However, in case of TraceError2, a nil interface value is passed, which will stay nil even when TraceError2 executes eventually.
Here is a simpler example:
package main
import "fmt"
func intByValue(i int) {
fmt.Printf("i = %d\n", i)
// ^--- `i` is an integer value
// --- whatever i was passed to the function, gets printed
}
func intByRef(i *int) {
var v int = *i // i is a pointer to an int, which gets dereferenced here
// the *address* where the actual value resides was passed
// while the address stays the same, its value can change before
// i is dereferenced, and its value stored in v.
fmt.Printf("i = %d\n", v)
}
func main() {
var i int
defer intByValue(i) // passed the *value* of i, which is 0 right now
defer intByRef(&i) // passed a *pointer* to i, which contains 0 right now
i = 100 // before intByRef could "dereference" its argument, the value that it
// contained has been updated
// intByRef gets called, dereferences the value, finds 100, prints it.
// intByValue gets called, finds 0, prints it
// result should be:
// i = 100
// i = 0
}
So unfortunately, if you want the ability to update the error (e.g. by returning a named return value) before it gets used by the deferred function, you are going to have to pass around pointers to the variable.
In other words, TraceError2 is simply not suited for your use case.
Edit: use correct terminology and (questionably) improve example code.
As go blog explained
The behavior of defer statements is straightforward and predictable.
There are three simple rules:
A deferred function's arguments are evaluated when the defer statement is evaluated.
Deferred function calls are executed in Last In First Out order after the surrounding function returns.
Deferred functions may read and assign to the returning function's named return values.
According to first point, when you call defer TraceError2(err) , that err = nil and that is the value pass to the TraceError2 function.
TraceError1(err *error) works because it is getting a pointer to err, and that pointer value is assigned before defer func TraceError1 is executed.
Simple example code to explain the behaviour.
package main
import (
"fmt"
"runtime"
)
func main() {
i := 0
defer func(i int) {
fmt.Printf("%d\n",i) //Output: 0
}(i)
defer func(i *int) {
defer fmt.Printf("%d\n",*i) //Output: 1
}(&i)
i++
}

How can I use goroutine to execute for loop?

Suppose I have a slice like:
stu = [{"id":"001","name":"A"} {"id":"002", "name":"B"}] and maybe more elements like this. inside of the slice is a long string, I want to use json.unmarshal to parse it.
type Student struct {
Id string `json:"id"`
Name string `json:"name"`
}
studentList := make([]Student,len(stu))
for i, st := range stu {
go func(st string){
studentList[i], err = getList(st)
if err != nil {
return ... //just example
}
}(st)
}
//and a function like this
func getList(stu string)(res Student, error){
var student Student
err := json.Unmarshal(([]byte)(stu), &student)
if err != nil {
return
}
return &student,nil
}
I got the nil result, so I would say the goroutine is out-of-order to execute, so I don't know if it can use studentList[i] to get value.
Here are a few potential issues with your code:
Value of i is probably not what you expect
for i, st := range stu {
go func(st string){
studentList[i], err = getList(st)
if err != nil {
return ... //just example
}
}(st)
}
You kick off a number of goroutines and, within them, reference i. The issue is that i is likely to have changed between the time you started the goroutine and the time the goroutine references it (the for loop runs concurrently to the goroutines it starts). It is quite possible that the for completes before any of the goroutines do meaning that all output will be stored in the last element of studentList (they will overwrite each other so you will end up with one value).
A simple solution is to pass i into the goroutine function (e.g. go func(st string, i int){}(st, i) (this creates a copy). See this for more info.
Output of studentList
You don't say in the question but I suspect you are running fmt.Println(studentList[1] (or similar) immediately after the for loop completes. As mentioned above it's quite possible that none of the goroutines have completed at that point (or they may of, you don't know). Using a WaitGroup is a fairly easy way around this:
var wg sync.WaitGroup
wg.Add(len(stu))
for i, st := range stu {
go func(st string, i int) {
var err error
studentList[i], err = getList(st)
if err != nil {
panic(err)
}
wg.Done()
}(st, i)
}
wg.Wait()
I have corrected these issues in the playground.
not because of this
the goroutine is out-of-order to execute
There are at least two issues here:
you should not use the for loop variable i in goroutine.
multiple goroutines read i, for loop modify i, it's race condition here. to make i works as expected, change the code to:
for i, st := range stu {
go func(i int, st string){
studentList[i], err = getList(st)
if err != nil {
return ... //just example
}
}(i, st)
}
what's more, use sync.WaitGroup to wait for all goroutine.
var wg sync.WaitGroup
for i, st := range stu {
wg.Add(1)
go func(i int, st string){
defer wg.Done()
studentList[i], err = getList(st)
if err != nil {
return ... //just example
}
}(i, st)
}
wg.Wait()
P.S.: (WARNING: maybe not always true)
this line studentList[i], err = getList(st) ,
although it may not cause data race, but it's somehow not friendly to cpu cache line. better avoid writing code like this.

How to assign the values to struct while go routines are running?

I'm using goroutines in my project and I want to to assign the values to the struct fields but I don't know that how I will assign the values get by using mongodb quires to the struct fields I'm showing my struct and the query too.
type AppLoadNew struct{
StripeTestKey string `json:"stripe_test_key" bson:"stripe_test_key,omitempty"`
Locations []Locations `json:"location" bson:"location,omitempty"`
}
type Locations struct{
Id int `json:"_id" bson:"_id"`
Location string `json:"location" bson:"location"`
}
func GoRoutine(){
values := AppLoadNew{}
go func() {
data, err := GetStripeTestKey(bson.M{"is_default": true})
if err == nil {
values.StripeTestKey := data.TestStripePublishKey
}
}()
go func() {
location, err := GetFormLocation(bson.M{"is_default": true})
if err == nil {
values.Locations := location
}
}()
fmt.Println(values) // Here it will nothing
// empty
}
Can you please help me that I will assign all the values to the AppLoadNew struct.
In Go no value is safe for concurrent read and write (from multiple goroutines). You must synchronize access.
Reading and writing variables from multiple goroutines can be protected using sync.Mutex or sync.RWMutex, but in your case there is something else involved: you should wait for the 2 launched goroutines to complete. For that, the go-to solution is sync.WaitGroup.
And since the 2 goroutines write 2 different fields of a struct (which act as 2 distinct variables), they don't have to be synchronized to each other (see more on this here: Can I concurrently write different slice elements). Which means using a sync.WaitGroup is sufficient.
This is how you can make it safe and correct:
func GoRoutine() {
values := AppLoadNew{}
wg := &sync.WaitGroup{}
wg.Add(1)
go func() {
defer wg.Done()
data, err := GetStripeTestKey(bson.M{"is_default": true})
if err == nil {
values.StripeTestKey = data.StripeTestKey
}
}()
wg.Add(1)
go func() {
defer wg.Done()
location, err := GetFormLocation(bson.M{"is_default": true})
if err == nil {
values.Locations = location
}
}()
wg.Wait()
fmt.Println(values)
}
See a (slightly modified) working example on the Go Playground.
See related / similar questions:
Reading values from a different thread
golang struct concurrent read and write without Lock is also running ok?
How to make a variable thread-safe
You can use sync package with WaitGroup, here is an example:
package main
import (
"fmt"
"sync"
"time"
)
type Foo struct {
One string
Two string
}
func main() {
f := Foo{}
var wg sync.WaitGroup
wg.Add(1)
go func() {
defer wg.Done()
// Perform long calculations
<-time.After(time.Second * 1)
f.One = "foo"
}()
wg.Add(1)
go func() {
defer wg.Done()
// Perform long calculations
<-time.After(time.Second * 2)
f.Two = "bar"
}()
fmt.Printf("Before %+v\n", f)
wg.Wait()
fmt.Printf("After %+v\n", f)
}
The output:
Before {One: Two:}
After {One:foo Two:bar}

Passing parameters to function closure

I'm trying to understand the difference in Go between creating an anonymous function which takes a parameter, versus having that function act as a closure. Here is an example of the difference.
With parameter:
func main() {
done := make(chan bool, 1)
go func(c chan bool) {
time.Sleep(50 * time.Millisecond)
c <- true
}(done)
<-done
}
As closure:
func main() {
done := make(chan bool, 1)
go func() {
time.Sleep(50 * time.Millisecond)
done <- true
}()
<-done
}
My question is, when is the first form better than the second? Would you ever use a parameter for this kind of thing? The only time I can see the first form being useful is when returning a func(x, y) from another function.
The difference between using a closure vs using a function parameter has to do with sharing the same variable vs getting a copy of the value. Consider these two examples below.
In the Closure all function calls will use the value stored in i. This value will most likely already reach 3 before any of the goroutines has had time to print it's value.
In the Parameter example each function call will get passed a copy of the value of i when the call was made, thus giving us the result we more likely wanted:
Closure:
for i := 0; i < 3; i++ {
go func() {
fmt.Println(i)
}()
}
Result:
3
3
3
Parameter:
for i := 0; i < 3; i++ {
go func(v int) {
fmt.Println(v)
}(i)
}
Result:
0
1
2
Playground: http://play.golang.org/p/T5rHrIKrQv
When to use parameters
Definitely the first form is preferred if you plan to change the value of the variable which you don't want to observe in the function.
This is the typical case when the anonymous function is inside a for loop and you intend to use the loop's variables, for example:
for i := 0; i < 10; i++ {
go func(i int) {
fmt.Println(i)
}(i)
}
Without passing the variable i you might observe printing 10 ten times. With passing i, you will observe numbers printed from 0 to 9.
When not to use parameters
If you don't want to change the value of the variable, it is cheaper not to pass it and thus not create another copy of it. This is especially true for large structs. Although if you later alter the code and modify the variable, you may easily forget to check its effect on the closure and get unexpected results.
Also there might be cases when you do want to observe changes made to "outer" variables, such as:
func GetRes(name string) (Res, error) {
res, err := somepack.OpenRes(name)
if err != nil {
return nil, err
}
closeres := true
defer func() {
if closeres {
res.Close()
}
}()
// Do other stuff
if err = otherStuff(); err != nil {
return nil, err // res will be closed
}
// Everything went well, return res, but
// res must not be closed, it will be the responsibility of the caller
closeres = false
return res, nil // res will not be closed
}
In this case the GetRes() is to open some resource. But before returning it other things have to be done which might also fail. If those fail, res must be closed and not returned. If everything goes well, res must not be closed and returned.
This is a example of parameter from net/Listen
package main
import (
"io"
"log"
"net"
)
func main() {
// Listen on TCP port 2000 on all available unicast and
// anycast IP addresses of the local system.
l, err := net.Listen("tcp", ":2000")
if err != nil {
log.Fatal(err)
}
defer l.Close()
for {
// Wait for a connection.
conn, err := l.Accept()
if err != nil {
log.Fatal(err)
}
// Handle the connection in a new goroutine.
// The loop then returns to accepting, so that
// multiple connections may be served concurrently.
go func(c net.Conn) {
// Echo all incoming data.
io.Copy(c, c)
// Shut down the connection.
c.Close()
}(conn)
}
}

What happens when defer is called twice on same variable?

What happened when defer called twice when the struct of that method has been changed?
For example:
rows := Query(`SELECT FROM whatever`)
defer rows.Close()
for rows.Next() {
// do something
}
rows = Query(`SELECT FROM another`)
defer rows.Close()
for rows.Next() {
// do something else
}
which rows when the last rows.Close() called?
It depends on the method receiver and on the type of the variable.
Short answer: if you're using the database/sql package, your deferred Rows.Close() methods will properly close both of your Rows instances because Rows.Close() has pointer receiver and because DB.Query() returns a pointer (and so rows is a pointer). See reasoning and explanation below.
To avoid confusion, I recommend using different variables and it will be clear what you want and what will be closed:
rows := Query(`SELECT FROM whatever`)
defer rows.Close()
// ...
rows2 := Query(`SELECT FROM whatever`)
defer rows2.Close()
I'd like to point out an important fact that comes from the deferred function and its parameters being evaluated immedately which is stated in the Effective Go blog post and in the Language Spec: Deferred statements too:
Each time a "defer" statement executes, the function value and parameters to the call are evaluated as usual and saved anew but the actual function is not invoked. Instead, deferred functions are invoked immediately before the surrounding function returns, in the reverse order they were deferred.
If variable is not a pointer: You will observe different results when calling a method deferred, depending if the method has a pointer receiver.
If variable is a pointer, you will see always the "desired" result.
See this example:
type X struct {
S string
}
func (x X) Close() {
fmt.Println("Value-Closing", x.S)
}
func (x *X) CloseP() {
fmt.Println("Pointer-Closing", x.S)
}
func main() {
x := X{"Value-X First"}
defer x.Close()
x = X{"Value-X Second"}
defer x.Close()
x2 := X{"Value-X2 First"}
defer x2.CloseP()
x2 = X{"Value-X2 Second"}
defer x2.CloseP()
xp := &X{"Pointer-X First"}
defer xp.Close()
xp = &X{"Pointer-X Second"}
defer xp.Close()
xp2 := &X{"Pointer-X2 First"}
defer xp2.CloseP()
xp2 = &X{"Pointer-X2 Second"}
defer xp2.CloseP()
}
Output:
Pointer-Closing Pointer-X2 Second
Pointer-Closing Pointer-X2 First
Value-Closing Pointer-X Second
Value-Closing Pointer-X First
Pointer-Closing Value-X2 Second
Pointer-Closing Value-X2 Second
Value-Closing Value-X Second
Value-Closing Value-X First
Try it on the Go Playground.
Using a pointer variable the result is always good (as expected).
Using a non-pointer variable and using pointer receiver we see the same printed results (the latest) but if we have value receiver, it prints 2 different results.
Explanation for non-pointer variable:
As stated, deferred function including the receiver is evaluated when the defer executes. In case of a pointer receiver it will be the address of the local variable. So when you assign a new value to it and call another defer, the pointer receiver will be again the same address of the local variable (just the pointed value is different). So later when the function is executed, both will use the same address twice but the pointed value will be the same, the one assigned later.
In case of value receiver, the receiver is a copy which is made when the defer executed, so if you assign a new value to the variable and call another defer, another copy will be made which is different from the previous one.
Effective Go mentions:
The arguments to the deferred function (which include the receiver if the function is a method) are evaluated when the defer executes, not when the call executes.
Besides avoiding worries about variables changing values as the function executes, this means that a single deferred call site can defer multiple function executions
In your case, the defer would reference the second rows instance.
The two deferred functions are executed in LIFO order (as mentioned also in "Defer, Panic, and Recover").
As icza mentions in his answer and in the comments:
The 2 deferred Close() methods will refer to the 2 distinct Rows values and both will be properly closed because rows is a pointer, not a value type.
Ah I see, the rows always refer to the last one, http://play.golang.org/p/_xzxHnbFSz
package main
import "fmt"
type X struct {
A string
}
func (x *X) Close() {
fmt.Println(x.A)
}
func main() {
rows := X{`1`}
defer rows.Close()
rows = X{`2`}
defer rows.Close()
}
Output:
2
2
So maybe the best way to preserve the object is to pass it to a function: http://play.golang.org/p/TIMCliUn60
package main
import "fmt"
type X struct {
A string
}
func (x *X) Close() {
fmt.Println(x.A)
}
func main() {
rows := X{`1`}
defer func(r X) { r.Close() }(rows)
rows = X{`2`}
defer func(r X) { r.Close() }(rows)
}
Output:
2
1
Most of the time, you should be able to just add a block, that way you don't have to worry about thinking of a new variable name, and you don't have to worry about any of the items not being closed:
rows := Query(`SELECT FROM whatever`)
defer rows.Close()
for rows.Next() {
// do something
}
{
rows := Query(`SELECT FROM another`)
defer rows.Close()
for rows.Next() {
// do something else
}
}
https://golang.org/ref/spec#Blocks

Resources