golang race condition - marshaling to XML in 2 goroutines

I am getting a data race when I try to marshal a struct to XML from 2 or more goroutines.
Sample main program: http://play.golang.org/p/YhkWXWL8C0
I believe xml:"members>member" is causing this; if I change it to a plain field name then everything works fine. Any thoughts on why the Go 1.4.x version is doing that?
type Family struct {
    XMLName xml.Name `xml:"family"`
    Name    string   `xml:"famil_name"`
    Members []Person `xml:"members>member"`
    //Members []Person `xml:"members"`
}
Running go run -race data_race.go gives me:
2015/02/06 13:53:43 Total GoRoutine Channels Created 2
2015/02/06 13:53:43 <family><famil_name></famil_name><members><person><name>ABCD</name><age>0</age></person><person><name>dummy</name><age>0</age></person></members></family>
==================
WARNING: DATA RACE
Write by goroutine 6:
runtime.slicecopy()
/usr/local/go/src/runtime/slice.go:94 +0x0
encoding/xml.(*parentStack).push()
/usr/local/go/src/encoding/xml/marshal.go:908 +0x2fb
encoding/xml.(*printer).marshalStruct()
/usr/local/go/src/encoding/xml/marshal.go:826 +0x628
encoding/xml.(*printer).marshalValue()
/usr/local/go/src/encoding/xml/marshal.go:531 +0x1499
encoding/xml.(*Encoder).Encode()
/usr/local/go/src/encoding/xml/marshal.go:153 +0xb8
encoding/xml.Marshal()
/usr/local/go/src/encoding/xml/marshal.go:72 +0xfb
main.ToXml()
/Users/kadalamittai/selfie/go/src/github.com/ivam/goal/command/data_race.go:51 +0x227
main.func·001()
/Users/kadalamittai/selfie/go/src/github.com/ivam/goal/command/data_race.go:61 +0x74
Previous read by goroutine 5:
encoding/xml.(*parentStack).trim()
/usr/local/go/src/encoding/xml/marshal.go:893 +0x2ae
encoding/xml.(*printer).marshalStruct()
/usr/local/go/src/encoding/xml/marshal.go:836 +0x203
encoding/xml.(*printer).marshalValue()
/usr/local/go/src/encoding/xml/marshal.go:531 +0x1499
encoding/xml.(*Encoder).Encode()
/usr/local/go/src/encoding/xml/marshal.go:153 +0xb8
encoding/xml.Marshal()
/usr/local/go/src/encoding/xml/marshal.go:72 +0xfb
main.ToXml()
/Users/kadalamittai/selfie/go/src/github.com/ivam/goal/command/data_race.go:51 +0x227
main.func·001()
/Users/kadalamittai/selfie/go/src/github.com/ivam/goal/command/data_race.go:61 +0x74
Goroutine 6 (running) created at:
main.AsyncExecute()
/Users/kadalamittai/selfie/go/src/github.com/ivam/goal/command/data_race.go:67 +0x15d
main.main()
/Users/kadalamittai/selfie/go/src/github.com/ivam/goal/command/data_race.go:80 +0x2bf
Goroutine 5 (finished) created at:
main.AsyncExecute()
/Users/kadalamittai/selfie/go/src/github.com/ivam/goal/command/data_race.go:67 +0x15d
main.main()
/Users/kadalamittai/selfie/go/src/github.com/ivam/goal/command/data_race.go:80 +0x2bf
==================

This looks like a bug in the Go 1.4.x standard library. I've reported it as a bug; hopefully it will get fixed. I'll leave the analysis below for reference.
What's happening is that there's an implicit shared value due to the use of getTypeInfo() which returns a type description of the struct. For efficiency, it appears to be globally cached state. Other parts of the XML encoder take components of this state and pass it around. It appears that there's an inadvertent mutation happening due to a slice append on a component of the shared value.
The p.stack attribute that is reported as the source of the data race originates from part of that shared typeInfo value: a slice of tinfo.parents gets injected on line 821. That is ultimately where the sharing happens, with the potential for concurrent reads and writes, because later appends on that slice can mutate the underlying shared array.
What should probably happen instead is that the slice should be capacity-restricted so that any potential append won't do a write on the shared array value.
That is, line 897 of the encoder library could probably be changed from:
897 s.stack = parents[:split]
to:
897 s.stack = parents[:split:split]
to correct the issue.
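For illustration, here is a minimal standalone sketch (not the encoder code itself) of why the three-index slice expression matters: capping the capacity at the length forces append to allocate a fresh backing array instead of writing into the shared one.

package main

import "fmt"

func main() {
    shared := []string{"a", "b", "c", "d"}

    // Two-index slice: the capacity still reaches into the shared array,
    // so the append overwrites shared[2].
    s1 := shared[:2]
    s1 = append(s1, "X")

    // Three-index slice: capacity is capped at the length, so append
    // must allocate a new backing array and shared stays untouched.
    s2 := shared[:2:2]
    s2 = append(s2, "Y")

    fmt.Println(shared) // [a b X d]: only the first append mutated it
}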

Related

Why doesn't panic show all running goroutines?

Page 253 of The Go Programming Language states:
... if instead of returning from main in the event of cancellation, we execute a call to panic, then the runtime will dump the stack of every goroutine in the program.
This code deliberately leaks a goroutine by waiting on a channel that never has anything to receive:
package main

import (
    "fmt"
    "time"
)

func main() {
    never := make(chan struct{})
    go func() {
        defer fmt.Println("End of child")
        <-never
    }()
    time.Sleep(10 * time.Second)
    panic("End of main")
}
However, the runtime only lists the main goroutine when panic is called:
panic: End of main
goroutine 1 [running]:
main.main()
/home/simon/panic/main.go:15 +0x7f
exit status 2
If I press Ctrl-\ to send SIGQUIT during the ten seconds before main panics, I do see the child goroutine listed in the output:
goroutine 1 [sleep]:
time.Sleep(0x2540be400)
/usr/lib/go-1.17/src/runtime/time.go:193 +0x12e
main.main()
/home/simon/panic/main.go:14 +0x6c
goroutine 18 [chan receive]:
main.main.func1()
/home/simon/panic/main.go:12 +0x76
created by main.main
/home/simon/panic/main.go:10 +0x5d
I thought maybe the channel was getting closed as panic runs (which still wouldn't guarantee the deferred fmt.Println had time to execute), but I get the same behaviour if the child goroutine does a time.Sleep instead of waiting on a channel.
I know there are ways to dump goroutine stacktraces myself, but my question is why doesn't panic behave as described in the book? The language spec only says that a panic will terminate the program, so is the book simply describing implementation-dependent behaviour?
Thanks to kostix for pointing me to the GOTRACEBACK runtime environment variable. Setting it to all instead of leaving it at the default of single restores the behaviour described in TGPL. Note that this variable is significant to the runtime, but you can't manipulate it with go env.
The default of only listing the panicking goroutine is a change made in Go 1.6; my edition of the book is copyrighted 2016 and gives Go 1.5 as the prerequisite for its example code, so it must predate the change. It's interesting reading the change discussion that there was concern about hiding useful information (as the recipient of many an incomplete error report, I can sympathise with this), but nobody called out the issue of scaling to large production systems that kostix mentioned.
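If it helps, the same traceback level can also be requested from inside the program via runtime/debug.SetTraceback, which accepts the same values as GOTRACEBACK. A sketch based on the example above:

package main

import (
    "fmt"
    "runtime/debug"
    "time"
)

func main() {
    // Equivalent to running with GOTRACEBACK=all: on an unrecovered panic
    // the runtime dumps every goroutine's stack, not just the panicking one.
    debug.SetTraceback("all")

    never := make(chan struct{})
    go func() {
        defer fmt.Println("End of child")
        <-never
    }()
    time.Sleep(10 * time.Second)
    panic("End of main")
}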

Why code in loop not executed when I have two go-routines

I'm facing a problem in Go:
package main

import (
    "fmt"
    "time"
)

var a = 0

func main() {
    go func() {
        for {
            a = a + 1
        }
    }()
    time.Sleep(time.Second)
    fmt.Printf("result=%d\n", a)
}
expected: result=(a big int number)
result: result=0
You have a race condition. Run your program with the -race flag:
go run -race main.go
==================
WARNING: DATA RACE
Read at 0x0000005e9600 by main goroutine:
main.main()
/home/jack/Project/GoProject/src/gitlab.com/hooshyar/GoNetworkLab/StackOVerflow/race/main.go:17 +0x6c
Previous write at 0x0000005e9600 by goroutine 6:
main.main.func1()
/home/jack/Project/GoProject/src/gitlab.com/hooshyar/GoNetworkLab/StackOVerflow/race/main.go:13 +0x56
Goroutine 6 (running) created at:
main.main()
/home/jack/Project/GoProject/src/gitlab.com/hooshyar/GoNetworkLab/StackOVerflow/race/main.go:11 +0x46
==================
result=119657339
Found 1 data race(s)
exit status 66
What is the solution?
There are several solutions; one is to use a mutex:
package main

import (
    "fmt"
    "sync"
    "time"
)

var a = 0

func main() {
    var mu sync.Mutex
    go func() {
        for {
            mu.Lock()
            a = a + 1
            mu.Unlock()
        }
    }()
    time.Sleep(3 * time.Second)
    mu.Lock()
    fmt.Printf("result=%d\n", a)
    mu.Unlock()
}
Before any read or write, lock the mutex and unlock it afterwards; now you don't have any race and the result will be a big int at the end.
For more information read these topics:
Data races in Go(Golang) and how to fix them
Golang concurrency - data races
As other writers have mentioned, you have a data race, but if you are comparing this behavior to, say, a program written in C using pthreads, you are missing some important data. Your problem is not just about timing, it's about the very language definition. Because concurrency primitives are baked into the language itself, the Go language memory model (https://golang.org/ref/mem) describes exactly when and how changes in one goroutine -- think of goroutines as "super-lightweight user-space threads" and you won't be too far off -- are guaranteed to be visible to code running in another goroutine.
Without any synchronizing actions, like channel sends/receives or sync.Mutex locks/unlocks, the Go memory model says that any changes you make to 'a' inside that goroutine don't ever have to be visible to the main goroutine. And, since the compiler knows that, it is free to optimize away pretty much everything in your for loop. Or not.
It's a similar situation to when you have, say, a local int variable in C set to 1, and maybe you have a while loop reading that variable in a loop waiting for it to be set to 0 by an ISR, but then your compiler gets too clever and decides to optimize away the test for zero because it thinks your variable can't ever change within the loop and you really just wanted an infinite loop, and so you have to declare the variable as volatile to fix the 'bug'.
If you are going to be working in Go, (my current favorite language, FWIW,) take time to read and thoroughly grok the Go memory model linked above, and it will really pay off in the future.
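To make "synchronizing actions" concrete, here is a sketch of the same counter using sync/atomic instead of a mutex; the atomic operations are synchronizing, so the race detector stays quiet and the final read is guaranteed to observe the increments:

package main

import (
    "fmt"
    "sync/atomic"
    "time"
)

var a int64

func main() {
    go func() {
        for {
            atomic.AddInt64(&a, 1) // synchronized write
        }
    }()
    time.Sleep(time.Second)
    fmt.Printf("result=%d\n", atomic.LoadInt64(&a)) // synchronized read
}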
Your program is running into a race condition, and Go can detect such scenarios.
Try running your program with go run -race main.go (assuming your file is named main.go). It will show how the race occurred:
an attempted write inside the goroutine,
a simultaneous read by the main goroutine.
It will also print a large int number, as you expected.

Cannot understand go test -race : RACE: DATA WARNING stack trace

I ran into a DATA RACE warning while testing my project, and was wondering if anyone would be kind enough to help me decipher the problem. I have never attempted to test goroutines before and am finding it hard to wrap my head around data races.
I have provided a link to the open issue below, with the full trace in the issue description.
I would really appreciate some help, if only to learn how to debug similar issues and write better tests for goroutines in the future.
https://github.com/nitishm/vegeta-server/issues/52
A snippet of the trace is provided below as well
=== RUN Test_dispatcher_Cancel_Error_completed
INFO[0000] creating new dispatcher component=dispatcher
INFO[0000] starting dispatcher component=dispatcher
INFO[0000] dispatching new attack ID=d63a79ac-6f51-486e-845d-077c8c76168a Status=scheduled component=dispatcher
==================
WARNING: DATA RACE
Read at 0x00c0000f8d68 by goroutine 8:
vegeta-server/internal/dispatcher.(*task).Complete()
/Users/nitishm/vegeta-server/internal/dispatcher/task.go:116 +0x61
vegeta-server/internal/dispatcher.run()
/Users/nitishm/vegeta-server/internal/dispatcher/task.go:213 +0x17a
Previous write at 0x00c0000f8d68 by goroutine 7:
vegeta-server/internal/dispatcher.(*task).Run()
/Users/nitishm/vegeta-server/internal/dispatcher/task.go:107 +0x12a
vegeta-server/internal/dispatcher.(*dispatcher).Run()
/Users/nitishm/vegeta-server/internal/dispatcher/dispatcher.go:109 +0xb5f
Goroutine 8 (running) created at:
vegeta-server/internal/dispatcher.(*task).Run()
/Users/nitishm/vegeta-server/internal/dispatcher/task.go:105 +0x11c
vegeta-server/internal/dispatcher.(*dispatcher).Run()
/Users/nitishm/vegeta-server/internal/dispatcher/dispatcher.go:109 +0xb5f
Goroutine 7 (running) created at:
vegeta-server/internal/dispatcher.Test_dispatcher_Cancel_Error_completed()
/Users/nitishm/vegeta-server/internal/dispatcher/dispatcher_test.go:249 +0x545
testing.tRunner()
/usr/local/go/src/testing/testing.go:827 +0x162
==================
==================
WARNING: DATA RACE
Write at 0x00c0000f8d98 by goroutine 8:
vegeta-server/internal/dispatcher.(*task).SendUpdate()
/Users/nitishm/vegeta-server/internal/dispatcher/task.go:164 +0x70
vegeta-server/internal/dispatcher.(*task).Complete()
/Users/nitishm/vegeta-server/internal/dispatcher/task.go:128 +0x20e
vegeta-server/internal/dispatcher.run()
/Users/nitishm/vegeta-server/internal/dispatcher/task.go:213 +0x17a
Previous write at 0x00c0000f8d98 by goroutine 7:
vegeta-server/internal/dispatcher.(*task).SendUpdate()
/Users/nitishm/vegeta-server/internal/dispatcher/task.go:164 +0x70
vegeta-server/internal/dispatcher.(*task).Run()
/Users/nitishm/vegeta-server/internal/dispatcher/task.go:109 +0x15d
vegeta-server/internal/dispatcher.(*dispatcher).Run()
/Users/nitishm/vegeta-server/internal/dispatcher/dispatcher.go:109 +0xb5f
Goroutine 8 (running) created at:
vegeta-server/internal/dispatcher.(*task).Run()
/Users/nitishm/vegeta-server/internal/dispatcher/task.go:105 +0x11c
vegeta-server/internal/dispatcher.(*dispatcher).Run()
/Users/nitishm/vegeta-server/internal/dispatcher/dispatcher.go:109 +0xb5f
Goroutine 7 (running) created at:
vegeta-server/internal/dispatcher.Test_dispatcher_Cancel_Error_completed()
/Users/nitishm/vegeta-server/internal/dispatcher/dispatcher_test.go:249 +0x545
testing.tRunner()
/usr/local/go/src/testing/testing.go:827 +0x162
==================
INFO[0002] canceling attack ID=d63a79ac-6f51-486e-845d-077c8c76168a ToCancel=true component=dispatcher
ERRO[0002] failed to cancel task ID=d63a79ac-6f51-486e-845d-077c8c76168a ToCancel=true component=dispatcher error="cannot cancel task d63a79ac-6f51-486e-845d-077c8c76168a with status completed"
WARN[0002] gracefully shutting down the dispatcher component=dispatcher
--- FAIL: Test_dispatcher_Cancel_Error_completed (2.01s)
testing.go:771: race detected during execution of test
As far as I can understand it:
Read at 0x00c0000f8d68 by goroutine 8: and Previous write at 0x00c0000f8d68 by goroutine 7
means that both goroutines 8 and 7 are reading from and writing to the same location. If you look at the lines pointed to by the error:
goroutine 8 on 116:
if t.status != models.AttackResponseStatusRunning {
goroutine 7 on 107:
t.status = models.AttackResponseStatusRunning
You can see that the goroutines are accessing the task's state without any synchronization and that, as you already know, can cause a race condition.
So if your program allows access to a single task by multiple goroutines you need to ensure that no data race occurs by using a mutex lock for example.
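For example, a minimal sketch of that approach; the package, type, and field names here are illustrative, not the actual vegeta-server code:

package dispatcher

import "sync"

// task is a stand-in for the real task type: every access to status
// goes through the same mutex, so the read in Complete and the write
// in Run can no longer race.
type task struct {
    mu     sync.Mutex
    status string
}

func (t *task) setStatus(s string) {
    t.mu.Lock()
    defer t.mu.Unlock()
    t.status = s
}

func (t *task) Status() string {
    t.mu.Lock()
    defer t.mu.Unlock()
    return t.status
}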

What happens if concurrent processes write to a global variable the same value?

I'm just wondering if there is potential for corruption as a result of writing the same value to a global variable at the same time. My brain is telling me there is nothing wrong with this because it's just a location in memory, but I figure I should double check that assumption.
I have concurrent processes writing to a global map var linksToVisit map[string]bool. The map tracks which links on a website need to be crawled further.
However, concurrent processes may encounter the same link on their respective pages, and therefore each will mark that same link as true concurrently. There's nothing wrong with NOT using locks in this case, right? NOTE: I never change the value back to false, so either the key exists and its value is true, or it doesn't exist.
I.e.
var linksToVisit = map[string]bool{}
...
// somewhere later a goroutine finds a link and marks it as true
// it is never marked as false anywhere
linksToVisit[someLink] = true
What happens if concurrent processes write to a global variable the
same value?
The results of a data race are undefined.
Run the Go data race detector.
References:
Wikipedia: Race condition
Benign Data Races: What Could Possibly Go Wrong?
The Go Blog: Introducing the Go Race Detector
Go: Data Race Detector
Go 1.8 Release Notes
Concurrent Map Misuse
In Go 1.6, the runtime added lightweight, best-effort detection of
concurrent misuse of maps. This release improves that detector with
support for detecting programs that concurrently write to and iterate
over a map.
As always, if one goroutine is writing to a map, no other goroutine
should be reading (which includes iterating) or writing the map
concurrently. If the runtime detects this condition, it prints a
diagnosis and crashes the program. The best way to find out more about
the problem is to run the program under the race detector, which will
more reliably identify the race and give more detail.
For example,
package main

import "time"

var linksToVisit = map[string]bool{}

func main() {
    someLink := "someLink"
    go func() {
        for {
            linksToVisit[someLink] = true
        }
    }()
    go func() {
        for {
            linksToVisit[someLink] = true
        }
    }()
    time.Sleep(100 * time.Millisecond)
}
Output:
$ go run racer.go
fatal error: concurrent map writes
$
$ go run -race racer.go
==================
WARNING: DATA RACE
Write at 0x00c000078060 by goroutine 6:
runtime.mapassign_faststr()
/home/peter/go/src/runtime/map_faststr.go:190 +0x0
main.main.func2()
/home/peter/gopath/src/racer.go:16 +0x6a
Previous write at 0x00c000078060 by goroutine 5:
runtime.mapassign_faststr()
/home/peter/go/src/runtime/map_faststr.go:190 +0x0
main.main.func1()
/home/peter/gopath/src/racer.go:11 +0x6a
Goroutine 6 (running) created at:
main.main()
/home/peter/gopath/src/racer.go:14 +0x88
Goroutine 5 (running) created at:
main.main()
/home/peter/gopath/src/racer.go:9 +0x5b
==================
fatal error: concurrent map writes
$
It is better to use locks if you are changing the same value concurrently from multiple goroutines. Mutexes and locks protect a value from being accessed while another function is changing it, much like locking a database table while it is being written to.
As for your question about concurrent use of maps, it is not recommended in Go, even with different keys:
The typical use of maps did not require safe access from multiple
goroutines, and in those cases where it did, the map was probably part
of some larger data structure or computation that was already
synchronized. Therefore requiring that all map operations grab a mutex
would slow down most programs and add safety to few.
Map access is unsafe only when updates are occurring. As long as all
goroutines are only reading—looking up elements in the map, including
iterating through it using a for range loop—and not changing the map
by assigning to elements or doing deletions, it is safe for them to
access the map concurrently without synchronization.
So updating maps from multiple goroutines is not recommended without synchronization. For more information, check the FAQ entry on why map operations are not defined to be atomic.
Also note that if you really want concurrent access, there has to be some way to synchronize it:
Maps are not safe for concurrent use: it's not defined what happens
when you read and write to them simultaneously. If you need to read
from and write to a map from concurrently executing goroutines, the
accesses must be mediated by some kind of synchronization mechanism.
One common way to protect maps is with sync.RWMutex.
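A minimal sketch of that sync.RWMutex pattern applied to the crawler's map; the type and method names are just illustrative:

package crawler

import "sync"

// visited wraps the map with a sync.RWMutex so that concurrent
// crawler goroutines can mark and check links safely.
type visited struct {
    mu    sync.RWMutex
    links map[string]bool
}

func newVisited() *visited {
    return &visited{links: make(map[string]bool)}
}

func (v *visited) Mark(link string) {
    v.mu.Lock()
    defer v.mu.Unlock()
    v.links[link] = true
}

func (v *visited) Seen(link string) bool {
    v.mu.RLock()
    defer v.mu.RUnlock()
    return v.links[link]
}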
Concurrent map writes are not OK, so you will most likely get a fatal error; a lock should be used.
Since Go 1.6, the runtime detects simultaneous map writes and crashes the program with a fatal error. Use a sync.Map to synchronize access.
See the map value assign implementation:
https://github.com/golang/go/blob/fe8a0d12b14108cbe2408b417afcaab722b0727c/src/runtime/hashmap.go#L519
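Alternatively, a sketch using sync.Map (added in Go 1.9), which is safe for concurrent use without an explicit lock; the key is just a placeholder:

package main

import (
    "fmt"
    "sync"
)

func main() {
    var linksToVisit sync.Map // safe for concurrent use by multiple goroutines

    var wg sync.WaitGroup
    for i := 0; i < 2; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            linksToVisit.Store("someLink", true) // concurrent writers are fine
        }()
    }
    wg.Wait()

    if v, ok := linksToVisit.Load("someLink"); ok {
        fmt.Println("someLink marked:", v)
    }
}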

Different output in race detector for println and fmt.Println

I get different output from println and fmt.Println under the race detector, which I can't explain. I expected both to report a race, or at least both to report no race.
package main

var a int

func f() {
    a = 1
}

func main() {
    go f()
    println(a)
}
And it finds the race condition, as expected.
0
==================
WARNING: DATA RACE
Write by goroutine 5:
main.f()
/home/felmas/test.go:6 +0x30
Previous read by main goroutine:
main.main()
/home/felmas/test.go:11 +0x4d
Goroutine 5 (running) created at:
main.main()
/home/felmas/test.go:10 +0x38
==================
Found 1 data race(s)
However, this one runs without any detected race.
package main

import "fmt"

var a int

func f() {
    a = 1
}

func main() {
    go f()
    fmt.Println(a)
}
To my knowledge, "no race detected" doesn't mean there is no race. So is this one of those deficiencies, or is there a deeper explanation, given that println is a builtin and quite special?
The race detector is a dynamic testing tool, not a static analysis. To get reliable results from it, you should strive for high test coverage of your program, preferably by writing lots of benchmarks that exercise multiple processors (GOMAXPROCS > 1; GOMAXPROCS=NumCPU has been the default since Go 1.5) and by using a continuous integration tool that executes those tests regularly.
The race detector does not report false positives, so you should take every report seriously. On the other hand, it might not detect every race on every run, depending on the order in which goroutines happen to be scheduled.
In your example, wrapping everything in a tight loop and re-executing the tests reports the race correctly in both cases, as in the sketch below.
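For example, a sketch of the fmt.Println variant wrapped in a tight loop; running it with go run -race should report the race reliably:

package main

import "fmt"

var a int

func f() {
    a = 1
}

func main() {
    // Repeating the racy write/read pair gives the detector many more
    // chances to observe the conflicting accesses.
    for i := 0; i < 1000; i++ {
        go f()
        fmt.Println(a)
    }
}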

Resources