Data race occurs with xml.Marshal - go

I'm getting a strange data race reported when I use xml.Marshal on a struct. I am 99% sure that I am not sending the same variable or something like that - instead I get intermittent errors in which the marshal function believes that it is causing a data race.
Here's the code (simplified a bit but all functional elements are there):
// this is run prior to any calls being sent to the below functions
func Setup() (descriptor *serviceDescriptor) {
	descriptor = new(serviceDescriptor)
	wpChan := make(chan *callwrapper)
	for i := 1; i < 100; i++ {
		go serviceWorker(wpChan)
	}
	descriptor.wpChan = wpChan
	return
}
// called externally to initiate a call
func (s *serviceDescriptor) Add(c *Call) {
	go s.makeCall(c)
}
// sends a call to the worker pool set up in the Setup() function
func (s *serviceDescriptor) makeCall(c *Call) {
	cw := new(callwrapper)
	cw.internal = new(etFullCall)
	cw.internal.Calldata = c
	// this is a channel that the worker function below listens to
	s.wpChan <- cw
	// the result is sent back on a channel and processed here
}
// this is the worker function started by Setup()
func serviceWorker(wpChan chan *callwrapper) {
	for cw := range wpChan {
		v := new(Result)
		// this is where the data race occurs.
		byt, err := xml.Marshal(cw.internal)
		// do stuff with the result down here
	}
}
Here's the error:
WARNING: DATA RACE
Write by goroutine 115:
runtime.copy()
/usr/lib/go/src/pkg/runtime/slice.c:120 +0x0
encoding/xml.(*parentStack).push()
/usr/lib/go/src/pkg/encoding/xml/marshal.go:901 +0x2bd
encoding/xml.(*printer).marshalStruct()
/usr/lib/go/src/pkg/encoding/xml/marshal.go:819 +0x58c
encoding/xml.(*printer).marshalValue()
/usr/lib/go/src/pkg/encoding/xml/marshal.go:524 +0x12a8
encoding/xml.(*Encoder).Encode()
/usr/lib/go/src/pkg/encoding/xml/marshal.go:153 +0x83
encoding/xml.Marshal()
/usr/lib/go/src/pkg/encoding/xml/marshal.go:72 +0x9d
exacttarget.serviceWorker()
/service.go:94 +0x20e
Previous write by goroutine 114:
runtime.copy()
/usr/lib/go/src/pkg/runtime/slice.c:120 +0x0
encoding/xml.(*parentStack).push()
/usr/lib/go/src/pkg/encoding/xml/marshal.go:901 +0x2bd
encoding/xml.(*printer).marshalStruct()
/usr/lib/go/src/pkg/encoding/xml/marshal.go:819 +0x58c
encoding/xml.(*printer).marshalValue()
/usr/lib/go/src/pkg/encoding/xml/marshal.go:524 +0x12a8
encoding/xml.(*Encoder).Encode()
/usr/lib/go/src/pkg/encoding/xml/marshal.go:153 +0x83
encoding/xml.Marshal()
/usr/lib/go/src/pkg/encoding/xml/marshal.go:72 +0x9d
exacttarget.serviceWorker()
/service.go:94 +0x20e
Goroutine 115 (running) created at:
service.Setup()
/service.go:39 +0x112
/src.processSingleUser()
/main_test.go:405 +0xdf
/src.testAddLists()
/main_test.go:306 +0x1f2
Goroutine 114 (running) created at:
service.Setup()
/service.go:39 +0x112
/src.processSingleUser()
/main_test.go:405 +0xdf
src.testAddLists()
/main_test.go:306 +0x1f2

Related

Counter that rotates that is safe for concurrent use

I want to write something like this:
type RoundRobinList struct {
lst []string
idx uint32
}
// rr_list is a RoundRobinList
func loop(rr_list) {
start = rr_list.idx
rr_list.idx = (rr_list.idx + 1)%len(rr_list.lst)
print(rr_list.lst[idx])
}
If rr_list.lst = ["a", "b", "c"], and loop is called over and over, I would expect the following printed:
"a"
"b"
"c"
"a"
"b"
"c" ...
Is this safe? rr_list.idx = (rr_list.idx + 1)%len(rr_list.lst)
Whenever you have a read and write of a value that's not protected in some way (using a mutex, or some of the other things in the sync package), you have a race, and that means you can't use it safely in a concurrent setting.
Here, you're reading and writing the idx field of your RoundRobinList structure without any protection, so you'll have a race if you use it concurrently.
As a first line of defense you should understand the rules for memory safety and follow them carefully, and not write code unless you're pretty sure it's safe. As a second line of defense, you can use the race detector to find a lot of problems with lack of safety of concurrent access.
Here's a simple test case, in file a_test.go. I had to also fix some bugs in the code to get it to compile. It starts two goroutines which call loop 1000 times each on a shared RoundRobinList.
package main
import (
	"sync"
	"testing"
)
type RoundRobinList struct {
	lst []string
	idx uint32
}
func loop(rr *RoundRobinList) {
	rr.idx = (rr.idx + 1) % uint32(len(rr.lst))
}
func TestLoop(t *testing.T) {
	rr := RoundRobinList{[]string{"a", "b", "c"}, 0}
	var wg sync.WaitGroup
	wg.Add(2)
	for n := 0; n < 2; n++ {
		go func() {
			for i := 0; i < 1000; i++ {
				loop(&rr)
			}
			wg.Done()
		}()
	}
	wg.Wait()
}
Running with go test -race ./a_test.go results in:
==================
WARNING: DATA RACE
Read at 0x00c0000b0078 by goroutine 9:
command-line-arguments.loop()
/mnt/c/Users/paul/Desktop/a_test.go:14 +0xa1
command-line-arguments.TestLoop.func1()
/mnt/c/Users/paul/Desktop/a_test.go:24 +0x78
Previous write at 0x00c0000b0078 by goroutine 8:
command-line-arguments.loop()
/mnt/c/Users/paul/Desktop/a_test.go:14 +0x4e
command-line-arguments.TestLoop.func1()
/mnt/c/Users/paul/Desktop/a_test.go:24 +0x78
Goroutine 9 (running) created at:
command-line-arguments.TestLoop()
/mnt/c/Users/paul/Desktop/a_test.go:22 +0x1cc
testing.tRunner()
/usr/local/go/src/testing/testing.go:992 +0x1eb
Goroutine 8 (finished) created at:
command-line-arguments.TestLoop()
/mnt/c/Users/paul/Desktop/a_test.go:22 +0x1cc
testing.tRunner()
/usr/local/go/src/testing/testing.go:992 +0x1eb
==================
--- FAIL: TestLoop (0.00s)
testing.go:906: race detected during execution of test
FAIL
FAIL command-line-arguments 0.006s
FAIL
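One way to remove the race reported above is to guard idx with a sync.Mutex. The following is a minimal sketch, not part of the original answer; the mu field and the Next method are names assumed here for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// RoundRobinList hands out its elements in rotation.
// The mu field is an addition for this sketch; it guards idx.
type RoundRobinList struct {
	mu  sync.Mutex
	lst []string
	idx uint32
}

// Next returns the current element and advances the index.
// The read and the write of idx happen under the same lock.
func (rr *RoundRobinList) Next() string {
	rr.mu.Lock()
	defer rr.mu.Unlock()
	s := rr.lst[rr.idx]
	rr.idx = (rr.idx + 1) % uint32(len(rr.lst))
	return s
}

func main() {
	rr := &RoundRobinList{lst: []string{"a", "b", "c"}}
	var wg sync.WaitGroup
	for n := 0; n < 2; n++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < 1000; i++ {
				rr.Next()
			}
		}()
	}
	wg.Wait()
	// 2000 calls were made in total and 2000 % 3 == 2,
	// so the next element handed out is "c".
	fmt.Println(rr.Next())
}
```

Running this with go run -race should report nothing, because every access to idx now happens while the mutex is held.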

How to make sure code has no data races in Go?

I'm writing a microservice that calls other microservices, for data that rarely updates (once in a day or once in a month). So I decided to create cache, and implemented this interface:
type StringCache interface {
	Get(string) (string, bool)
	Put(string, string)
}
Internally it's just a map[string]cacheItem, where
type cacheItem struct {
	data string
	expire_at time.Time
}
My coworker says that it's unsafe and that I need to add mutex locks in my methods, because the cache will be used in parallel by different instances of HTTP handler functions. I have a test for it, but it detects no data races, because it uses the cache in a single goroutine:
func TestStringCache(t *testing.T) {
	testDuration := time.Millisecond * 10
	cache := NewStringCache(testDuration / 2)
	cache.Put("here", "this")
	// Value put in cache should be in cache
	res, ok := cache.Get("here")
	assert.Equal(t, res, "this")
	assert.True(t, ok)
	// Values put in cache will eventually expire
	time.Sleep(testDuration)
	res, ok = cache.Get("here")
	assert.Equal(t, res, "")
	assert.False(t, ok)
}
So, my question is: how do I rewrite this test so that it detects a data race (if one is present) when running with go test -race?
First things first: the data race detector in Go is not some sort of formal prover which uses static code analysis, but rather a dynamic tool which instruments the compiled code in a special way to try to detect data races at runtime.
What this means is that if the race detector is lucky and spots a data race, you can be sure there is a data race at the reported spot. But this also means that if the actual program flow did not trigger an existing data race, the race detector won't spot and report it.
In other words, the race detector does not have false positives, but it is merely a best-effort tool.
So, in order to write race-free code you really have to rethink your approach.
It's best to start with this classic essay on the topic written by the author of the Go race detector. Once you have absorbed that there are no benign data races, you basically train yourself to think about concurrently running threads of execution accessing your data each time you design the data and the algorithms that manipulate it.
For instance, you know (at least you should know if you have read the docs) that each incoming request to an HTTP server implemented using net/http is handled by a separate goroutine.
This means, that if you have a central (shared) data structure such as a cache which is to be accessed by the code which processes client requests, you do have multiple goroutines potentially accessing that shared data concurrently.
Now if you have another goroutine which updates that data, you do have a potential for a classic data race: while one goroutine is updating the data, another may read it.
As to the question at hand, two things:
First, never ever rely on timers to test concurrent code. This does not work reliably.
Second, for such a simple case as yours, using merely two goroutines completely suffices:
package main
import (
	"testing"
	"time"
)
type cacheItem struct {
	data string
	expire_at time.Time
}
type stringCache struct {
	m map[string]cacheItem
	exp time.Duration
}
func (sc *stringCache) Get(key string) (string, bool) {
	if item, ok := sc.m[key]; !ok {
		return "", false
	} else {
		return item.data, true
	}
}
func (sc *stringCache) Put(key, data string) {
	sc.m[key] = cacheItem{
		data: data,
		expire_at: time.Now().Add(sc.exp),
	}
}
func NewStringCache(d time.Duration) *stringCache {
	return &stringCache{
		m: make(map[string]cacheItem),
		exp: d,
	}
}
func TestStringCache(t *testing.T) {
	cache := NewStringCache(time.Minute)
	ch := make(chan struct{})
	go func() {
		cache.Put("here", "this")
		close(ch)
	}()
	_, _ = cache.Get("here")
	<-ch
}
Save this as sc_test.go and then
tmp$ go test -race -c -o sc_test ./sc_test.go
tmp$ ./sc_test
==================
WARNING: DATA RACE
Write at 0x00c00009e270 by goroutine 8:
runtime.mapassign_faststr()
/home/kostix/devel/golang-1.13.6/src/runtime/map_faststr.go:202 +0x0
command-line-arguments.(*stringCache).Put()
/home/kostix/tmp/sc_test.go:27 +0x144
command-line-arguments.TestStringCache.func1()
/home/kostix/tmp/sc_test.go:46 +0x62
Previous read at 0x00c00009e270 by goroutine 7:
runtime.mapaccess2_faststr()
/home/kostix/devel/golang-1.13.6/src/runtime/map_faststr.go:107 +0x0
command-line-arguments.TestStringCache()
/home/kostix/tmp/sc_test.go:19 +0x125
testing.tRunner()
/home/kostix/devel/golang-1.13.6/src/testing/testing.go:909 +0x199
Goroutine 8 (running) created at:
command-line-arguments.TestStringCache()
/home/kostix/tmp/sc_test.go:45 +0xe4
testing.tRunner()
/home/kostix/devel/golang-1.13.6/src/testing/testing.go:909 +0x199
Goroutine 7 (running) created at:
testing.(*T).Run()
/home/kostix/devel/golang-1.13.6/src/testing/testing.go:960 +0x651
testing.runTests.func1()
/home/kostix/devel/golang-1.13.6/src/testing/testing.go:1202 +0xa6
testing.tRunner()
/home/kostix/devel/golang-1.13.6/src/testing/testing.go:909 +0x199
testing.runTests()
/home/kostix/devel/golang-1.13.6/src/testing/testing.go:1200 +0x521
testing.(*M).Run()
/home/kostix/devel/golang-1.13.6/src/testing/testing.go:1117 +0x2ff
main.main()
_testmain.go:44 +0x223
==================
--- FAIL: TestStringCache (0.00s)
testing.go:853: race detected during execution of test
FAIL

What is causing this data race?

Why does this code cause data race?
I have already used atomic add.
package main
import (
	"sync/atomic"
	"time"
)
var a int64
func main() {
	for {
		if a < 100 {
			atomic.AddInt64(&a, 1)
			go run()
		}
	}
}
func run() {
	<-time.After(5 * time.Second)
	atomic.AddInt64(&a, -1)
}
I run command go run --race with this code and get:
==================
WARNING: DATA RACE
Write at 0x000001150f30 by goroutine 8:
sync/atomic.AddInt64()
/usr/local/Cellar/go/1.11.2/libexec/src/runtime/race_amd64.s:276 +0xb
main.run()
/Users/flask/test.go:22 +0x6d
Previous read at 0x000001150f30 by main goroutine:
main.main()
/Users/flask/test.go:12 +0x3a
Goroutine 8 (running) created at:
main.main()
/Users/flask/test.go:15 +0x75
==================
Could you help me explain this?
And how to fix this warning?
Thanks!
You didn't use the atomic package in all the places where you access the variable. Every access to a variable that is shared between concurrently running goroutines must be synchronized, including reads:
for {
	if value := atomic.LoadInt64(&a); value < 100 {
		atomic.AddInt64(&a, 1)
		go run()
	}
}
With that change, the data race goes away.
If you just want to inspect the value, you don't even need to store it in a variable, so you may simply do:
for {
	if atomic.LoadInt64(&a) < 100 {
		atomic.AddInt64(&a, 1)
		go run()
	}
}

Why does the method of a struct that does not read/write its contents still cause a race case?

From the Dave Cheney Blog, the following code apparently causes a race case that can be resolved merely by changing func (RPC) version() int to func (*RPC) version() int :
package main
import (
	"fmt"
	"time"
)
type RPC struct {
	result int
	done chan struct{}
}
func (rpc *RPC) compute() {
	time.Sleep(time.Second) // strenuous computation intensifies
	rpc.result = 42
	close(rpc.done)
}
func (RPC) version() int {
	return 1 // never going to need to change this
}
func main() {
	rpc := &RPC{done: make(chan struct{})}
	go rpc.compute() // kick off computation in the background
	version := rpc.version() // grab some other information while we're waiting
	<-rpc.done // wait for computation to finish
	result := rpc.result
	fmt.Printf("RPC computation complete, result: %d, version: %d\n", result, version)
}
After looking over the code a few times, I was having a hard time believing that the code had a race case. However, when running with --race, it claims that there was a write at rpc.result=42 and a previous read at version := rpc.version(). I understand the write, since the goroutine changes the value of rpc.result, but what about the read? Where in the version() method does the read occur? It does not touch any of the values of rpc, just returning 1.
I would like to understand the following:
1) Why is that particular line considered a read on the rpc struct?
2) Why would changing RPC to *RPC resolve the race case?
When you have a method with a value receiver like this:
func (RPC) version() int {
	return 1 // never going to need to change this
}
And you call this method:
version := rpc.version() // grab some other information while we're waiting
Since rpc is a pointer (*RPC), this is shorthand for
version := (*rpc).version()
The pointer is dereferenced and a copy of the whole struct is made, which is passed to the method as the receiver value.
So while the goroutine started by go rpc.compute() is modifying the struct (rpc.result = 42), the main goroutine is reading the whole struct in order to copy it. That is the race.
When you change the receiver type to a pointer:
func (*RPC) version() int {
	return 1 // never going to need to change this
}
And you call this method:
version := rpc.version() // grab some other information while we're waiting
rpc is already a *RPC, so the pointer itself is passed as the receiver and no copy of the struct is made. And since RPC.version() does not read anything from the struct, there is no race.
Note:
If RPC.version() read the RPC.result field, it would also be a race, as one goroutine modifies the field while the main goroutine reads it:
func (rpc *RPC) version() int {
	return rpc.result // RACE!
}
Note #2:
If RPC.version() read another field of RPC which is not modified in RPC.compute(), that would not be a race, e.g.:
type RPC struct {
	result int
	done chan struct{}
	dummy int
}
func (rpc *RPC) version() int {
	return rpc.dummy // Not a race
}
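Putting the pointer-receiver fix together gives a program that can be run with the race detector. This is a sketch rather than the blog's code: the call helper is a hypothetical name introduced here so the logic can be exercised on its own, and the sleep is shortened.

```go
package main

import (
	"fmt"
	"time"
)

type RPC struct {
	result int
	done   chan struct{}
}

func (rpc *RPC) compute() {
	time.Sleep(10 * time.Millisecond) // stand-in for real work
	rpc.result = 42
	close(rpc.done)
}

// Pointer receiver: calling version passes only the pointer,
// so the struct itself is never read here.
func (*RPC) version() int {
	return 1
}

// call (hypothetical helper) runs the computation in the background
// and collects both values.
func call() (result, version int) {
	rpc := &RPC{done: make(chan struct{})}
	go rpc.compute()
	version = rpc.version() // no read of *rpc happens here
	<-rpc.done              // the close happens after the write to result
	return rpc.result, version
}

func main() {
	result, version := call()
	fmt.Printf("RPC computation complete, result: %d, version: %d\n", result, version)
}
```

The read of rpc.result is safe because it happens after receiving from the closed done channel, and the close happens after the write.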

Go - Error when compiling a group of functions

I'm trying to implement a very simple test function to verify results coming from my solutions for Euler problems.
In the following code I've created a map of slices where index 0 holds the result of calling the function, which returns an integer, and index 1 holds the result I expect from that function.
package euler
import "testing"
func TestEulers(t *testing.T) {
	tests := map[string][]int{
		"Euler1": {Euler1(), 233168},
		"Euler2": {Euler2(), 4613732},
		"Euler3": {Euler3(), 6857},
		"Euler4": {Euler4(), 906609},
		"Euler5": {Euler5(), 232792560},
		"Euler6": {Euler6(), 25164150},
	}
	for key, value := range tests {
		// value[0] is the actual result, value[1] is the expected one
		if value[0] != value[1] {
			t.Errorf("%s\nExpected: %d\nGot:%d",
				key, value[1], value[0])
		}
	}
}
For that map, every function works fine and returns the result I expect if I run them one by one, or if I comment out, say, half of those keys/values.
For example, if I run the test above with these lines commented out, the test will PASS.
tests := map[string][]int{
	"Euler1": {Euler1(), 233168},
	// "Euler2": {Euler2(), 4613732},
	"Euler3": {Euler3(), 6857},
	"Euler4": {Euler4(), 906609},
	// "Euler5": {Euler5(), 232792560},
	// "Euler6": {Euler6(), 25164150},
}
But if I arrange the comments as follows, for example, the test fails.
tests := map[string][]int{
	// "Euler1": {Euler1(), 233168},
	"Euler2": {Euler2(), 4613732},
	"Euler3": {Euler3(), 6857},
	"Euler4": {Euler4(), 906609},
	// "Euler5": {Euler5(), 232792560},
	// "Euler6": {Euler6(), 25164150},
}
The test will give me an error:
WARNING: DATA RACE
Write by goroutine 6:
runtime.closechan()
/private/var/folders/q8/bf_4b1ts2zj0l7b0p1dv36lr0000gp/T/workdir/go/src/runtime/chan.go:295 +0x0
github.com/alesr/project-euler.Euler2()
/Users/Alessandro/GO/src/github.com/alesr/project-euler/euler.go:40 +0xd7
github.com/alesr/project-euler.TestEulers()
/Users/Alessandro/GO/src/github.com/alesr/project-euler/euler_test.go:9 +0x46
testing.tRunner()
/private/var/folders/q8/bf_4b1ts2zj0l7b0p1dv36lr0000gp/T/workdir/go/src/testing/testing.go:456 +0xdc
Previous read by goroutine 7:
runtime.chansend()
/private/var/folders/q8/bf_4b1ts2zj0l7b0p1dv36lr0000gp/T/workdir/go/src/runtime/chan.go:107 +0x0
github.com/alesr/numbers.FibonacciGen.func1()
/Users/Alessandro/GO/src/github.com/alesr/numbers/numbers.go:103 +0x59
Goroutine 6 (running) created at:
testing.RunTests()
/private/var/folders/q8/bf_4b1ts2zj0l7b0p1dv36lr0000gp/T/workdir/go/src/testing/testing.go:561 +0xaa3
testing.(*M).Run()
/private/var/folders/q8/bf_4b1ts2zj0l7b0p1dv36lr0000gp/T/workdir/go/src/testing/testing.go:494 +0xe4
main.main()
github.com/alesr/project-euler/_test/_testmain.go:54 +0x20f
Goroutine 7 (running) created at:
github.com/alesr/numbers.FibonacciGen()
/Users/Alessandro/GO/src/github.com/alesr/numbers/numbers.go:105 +0x60
github.com/alesr/project-euler.Euler2()
/Users/Alessandro/GO/src/github.com/alesr/project-euler/euler.go:27 +0x32
github.com/alesr/project-euler.TestEulers()
/Users/Alessandro/GO/src/github.com/alesr/project-euler/euler_test.go:9 +0x46
testing.tRunner()
/private/var/folders/q8/bf_4b1ts2zj0l7b0p1dv36lr0000gp/T/workdir/go/src/testing/testing.go:456 +0xdc
==================
panic: send on closed channel
goroutine 36 [running]:
github.com/alesr/numbers.FibonacciGen.func1(0xc8200a01e0)
/Users/Alessandro/GO/src/github.com/alesr/numbers/numbers.go:103 +0x5a
created by github.com/alesr/numbers.FibonacciGen
/Users/Alessandro/GO/src/github.com/alesr/numbers/numbers.go:105 +0x61
goroutine 1 [chan receive]:
testing.RunTests(0x24d038, 0x2f7340, 0x1, 0x1, 0xf78401)
/private/var/folders/q8/bf_4b1ts2zj0l7b0p1dv36lr0000gp/T/workdir/go/src/testing/testing.go:562 +0xafa
testing.(*M).Run(0xc82004df00, 0x1ff0e8)
/private/var/folders/q8/bf_4b1ts2zj0l7b0p1dv36lr0000gp/T/workdir/go/src/testing/testing.go:494 +0xe5
main.main()
github.com/alesr/project-euler/_test/_testmain.go:54 +0x210
goroutine 17 [syscall, locked to thread]:
runtime.goexit()
/private/var/folders/q8/bf_4b1ts2zj0l7b0p1dv36lr0000gp/T/workdir/go/src/runtime/asm_amd64.s:1696 +0x1
goroutine 35 [runnable]:
github.com/alesr/strings.Flip(0xc8200727a0, 0x6, 0x0, 0x0)
/Users/Alessandro/GO/src/github.com/alesr/strings/strings.go:33 +0x17e
github.com/alesr/project-euler.Euler4(0x1ac9)
/Users/Alessandro/GO/src/github.com/alesr/project-euler/euler.go:73 +0x95
github.com/alesr/project-euler.TestEulers(0xc8200b6000)
/Users/Alessandro/GO/src/github.com/alesr/project-euler/euler_test.go:11 +0x63
testing.tRunner(0xc8200b6000, 0x2f7340)
/private/var/folders/q8/bf_4b1ts2zj0l7b0p1dv36lr0000gp/T/workdir/go/src/testing/testing.go:456 +0xdd
created by testing.RunTests
/private/var/folders/q8/bf_4b1ts2zj0l7b0p1dv36lr0000gp/T/workdir/go/src/testing/testing.go:561 +0xaa4
exit status 2
FAIL github.com/alesr/project-euler 0.022s
But still, I checked every single function and they work just as expected.
You can access the Euler source code or the packages numbers and strings if you want.
In the Euler2 function I have a defer statement to close the channel which receives from FibonacciGen.
And in FibonacciGen I have another defer statement to close the same channel.
It seems that's my first error. I should have just one statement closing the channel, not two, since they are trying to close the same thing. Is that correct?
Second (and here I'm even less sure), the defer statement will prevent the function from being called until the main goroutine returns, right? Regardless of whether I call it in package main or not?
Also, the data flows through the channel from FibonacciGen to the main function. It seems to me that if I close the channel in FibonacciGen I don't need to notify the main function, but if I close the channel in the main function I do have to notify FibonacciGen to stop trying to send on this channel.
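Your Euler2() closes the channel, but the generator goroutine doesn't check whether the channel has been closed. Once the channel is closed, the goroutine blocked on the send is unblocked, so it tries to send a value on a now-closed channel, which panics.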
If you only run Euler2() your program might just exit before you send the value to the closed channel.
Thank you all. With your help I understood that I was closing the channel in the wrong way.
It now works correctly.
func Euler2() int {
	c := make(chan int)
	done := make(chan bool)
	go numbers.FibonacciGen(c, done)
	sum := 0
	var f int
	for {
		f = <-c
		if f < 4000000 {
			if f%2 == 0 {
				sum += f
			}
		} else {
			close(done)
			return sum
		}
	}
}
func FibonacciGen(c chan int, done chan bool) {
	for {
		select {
		case <-done:
			return
		default:
			for i, j := 0, 1; ; i, j = i+j, i {
				c <- i
			}
		}
	}
}
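One caveat about the generator as posted: once it enters the inner for loop it never returns to the select, so after Euler2 closes done the goroutine stays blocked on c <- i instead of exiting. A possible variant, sketched here with both functions in a single package for brevity, selects between the send and done on every iteration.

```go
package main

import "fmt"

// FibonacciGen sends Fibonacci numbers on c until done is closed.
func FibonacciGen(c chan<- int, done <-chan bool) {
	for i, j := 0, 1; ; i, j = i+j, i {
		select {
		case c <- i: // the receiver took the value; continue with the next one
		case <-done: // the receiver is finished; stop generating
			return
		}
	}
}

// Euler2 sums the even Fibonacci numbers below four million.
func Euler2() int {
	c := make(chan int)
	done := make(chan bool)
	defer close(done) // tells the generator to stop when Euler2 returns
	go FibonacciGen(c, done)

	sum := 0
	for {
		f := <-c
		if f >= 4000000 {
			return sum
		}
		if f%2 == 0 {
			sum += f
		}
	}
}

func main() {
	fmt.Println(Euler2()) // 4613732
}
```

Here only the receiver closes done and nobody closes c, so there is no send on a closed channel and the generator goroutine exits once the sum has been computed.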

Resources