Getting data race condition with zerolog - go

I am using zerolog configured with diodes to prevent a race condition when writing to stdout. This is my log setup:
consoleWriter := zerolog.ConsoleWriter{Out: os.Stdout, NoColor: *logNoColor, TimeFormat: *logDateTimeFormat}
return diode.NewWriter(consoleWriter, 1000, 10*time.Millisecond, onMissedMessages)
I am following the example from the zerolog documentation.
Finally, I set the global logger with the writer returned from above (f):
log.Logger = zerolog.New(f).With().Timestamp().Logger()
I have a shutdown hook (listens for CTRL+C) which basically just writes a log message when it is called and cancels the root context. I am also writing a log message after 1 second has elapsed using the time.After function.
When I don't press CTRL+C, the application runs as expected; however, when I press CTRL+C prior to the 1 second delayed execution of a method, I get a data race condition. When I remove the log statements, the problem goes away leading me to believe that the diode setup from above isn't helping prevent a race condition.
Just to make sure I'm understanding how golang works:
Canceling a context releases resources tied to that context, i.e. if you have a child context with a channel and cancel the child context, only that channel would be closed / released.
os.Stdout, os.Stderr, etc. are not thread-safe and operations on them must be controlled by the developer.
I can check if a context is Done() from numerous goroutines without a synchronization problem. However, I can and should only read from a channel from a single thread.

Related

Does Context.Done() unblock when context variable goes out of scope in golang?

Will context.Done() unblock when a context variable goes out of scope and cancel is not explicitly called?
Let's say I have the following code:
func DoStuff() {
	ctx, _ := context.WithCancel(context.Background())
	go DoWork(ctx)
	return
}
Will ctx.Done() unblock in DoWork after the return in DoStuff()?
I found this thread, https://groups.google.com/forum/#!topic/golang-nuts/BbvTlaQwhjw, where the person asking how to use Context.Done() claims that context.Done() will unblock when the context variable leaves scope but no one validated this, and I didn't see anything in the docs.
No, it doesn't cancel automatically when the context leaves scope. Typically one calls defer cancel() (using the cancel function returned by context.WithCancel()) oneself to make sure that the context is cancelled.
https://blog.golang.org/context provides a good overview of how to use contexts correctly (including the defer pattern above). Also, the source code https://golang.org/src/context/context.go is quite readable and you can see there's no magic that would provide automatic cancellation.
"Unblocking" is not the clearest terminology. Done() returns a channel (or nil) that is closed when the context is "cancelled"; the standard implementations never send a value on it. When exactly it is closed is up to the individual implementation: at some fixed time as with WithDeadline, or manually as with WithCancel.
The key, though, is that this is never "automatic" or guaranteed to happen. If you make a context with WithCancel and read from the Done() channel, that read will block indefinitely until the cancel function returned by WithCancel is called. If that never happens, then you have a wasted goroutine and your application's memory will increase each time you do it.
Once the context is completely out of scope (no executing goroutine is listening to it or has a reference to the parent context), it will get garbage collected and everything will go away.
EDIT: After reading the source though, it looks like WithCancel and friends spawn goroutines to propagate the cancellation. Therefore you must make sure cancel gets called at some point to avoid goroutine leaks.

Discrepancies between Go Playground and Go on my machine?

To settle some misunderstandings I have about goroutines, I went to the Go playground and ran this code:
package main

import (
	"fmt"
)

func other(done chan bool) {
	done <- true
	go func() {
		for {
			fmt.Println("Here")
		}
	}()
}

func main() {
	fmt.Println("Hello, playground")
	done := make(chan bool)
	go other(done)
	<-done
	fmt.Println("Finished.")
}
As I expected, Go playground came back with an error: Process took too long.
This seems to imply that the goroutine created within other runs forever.
But when I run the same code on my own machine, I get this output almost instantaneously:
Hello, playground
Finished.
This seems to imply that the goroutine within other exits when the main goroutine finishes. Is this true? Or does the main goroutine finish, while the other goroutine continues to run in the background?
Edit: Default GOMAXPROCS has changed on the Go Playground, it now defaults to 8. In the "old" days it defaulted to 1. To get the behavior described in the question, set it to 1 explicitly with runtime.GOMAXPROCS(1).
Explanation of what you see:
On the Go Playground, GOMAXPROCS is 1 (proof).
This means one goroutine is executed at a time, and if that goroutine does not block, the scheduler is not forced to switch to other goroutines.
Your code (like every Go app) starts with a goroutine executing the main() function (the main goroutine). It starts another goroutine that executes the other() function, then it receives from the done channel - which blocks. So the scheduler must switch to the other goroutine (executing other() function).
In your other() function when you send a value on the done channel, that makes both the current (other()) and the main goroutine runnable. The scheduler chooses to continue to run other(), and since GOMAXPROCS=1, main() is not continued. Now other() launches another goroutine executing an endless loop. The scheduler chooses to execute this goroutine which takes forever to get to a blocked state, so main() is not continued.
And then the timeout of the Go Playground's sandbox comes as an absolution:
process took too long
Note that the Go Memory Model only guarantees that certain events happen before other events; you have no guarantee how 2 concurrent goroutines are executed, which makes the output non-deterministic.
You are not to question any execution order that does not violate the Go Memory Model. If you want the execution to reach certain points in your code (to execute certain statements), you need explicit synchronization (you need to synchronize your goroutines).
Also note that the output on the Go Playground is cached, so if you run the app again, it won't be run again, but instead the cached output will be presented immediately. If you change anything in the code (e.g. insert a space or a comment) and then you run it again, it then will be compiled and run again. You will notice it by the increased response time. Using the current version (Go 1.6) you will see the same output every time though.
Running locally (on your machine):
When you run it locally, most likely GOMAXPROCS will be greater than 1 as it defaults to the number of CPU cores available (since Go 1.5). So it doesn't matter if you have a goroutine executing an endless loop, another goroutine will be executed simultaneously, which will be the main(), and when main() returns, your program terminates; it does not wait for other non-main goroutines to complete (see Spec: Program execution).
Also note that even if you set GOMAXPROCS to 1, your app will most likely exit in a "short" time as the scheduler implementation will switch to other goroutines and not just execute the endless loop forever (however, as stated above, this is non-deterministic). And when it does, it will be the main() goroutine, and so when main() finishes and returns, your app terminates.
Playing with your app on the Go Playground:
As mentioned, by default GOMAXPROCS is 1 on the Go Playground. However it is allowed to set it to a higher value, e.g.:
runtime.GOMAXPROCS(2)
Without explicit synchronization, execution still remains non-deterministic, however you will observe a different execution order and a termination without running into a timeout:
Hello, playground
Here
Here
Here
...
<Here is printed 996 times, then:>
Finished.
Try this variant on the Go Playground.
What you will see on screen is nondeterministic. Or more precisely if by any chance the true value you pass to channel is delayed you would see some "Here".
But stdout is usually buffered: output is not printed instantaneously but accumulates until the buffer is full, and only then is it printed. In your case, the main function has already finished before "Here" is printed, so the process exits.
The rule of thumb is: the main function must stay alive, otherwise all other goroutines are killed.

How to use multiple sessions per connection in a multi-threaded application?

Suppose I have one connection c and many session objects s1, s2 .. sn, each working in different threads t1, t2 ... tn.
                           c
                           |
   -------------------------------------------------
   |               |               |               |
(t1,s1)         (t2,s2)         (t3,s3) ......  (tn,sn)
Now suppose one of the thread t3 wants to send a message to a particular queue q3 and then listen to the reply asynchronously. So it does the following:
c.stop();
auto producer = s3.createProducer(s3.createQueue(q3));
auto text = s3.createTextMessage(message);
auto replyQueue = s3.createTemporaryQueue();
text.setJMSReplyTo(replyQueue);
producer.send(text);
auto consumer = s3.createConsumer(replyQueue);
consumer.setMessageListener(myListener);
c.start();
I call c.stop() at the beginning and c.start() at the end because I'm not sure whether any of the other threads has called start on the connection (which would make all the sessions asynchronous; is that right?) and, as per the documentation:
"If synchronous calls, such as creation of a consumer or producer, must be made on an asynchronous session, the Connection.Stop must be called. A session can be resumed by calling the Connection.Start method to start delivery of messages."
So calling stop in the beginning of the steps and then start in the end seems reasonable and thus the code seems correct (at least to me). However, when I thought about it more, I think the code is buggy, as it doesn't make sure no other threads call start before t3 finishes all the steps.
So my questions are:
Do I need to use a mutex to ensure it? Or does XMS handle it automatically (which means my reasoning is wrong)?
How should I design my application so that I don't have to call stop and start every time I want to send a message and listen for the reply asynchronously?
As per the quoted text above, I cannot call createProducer() and createConsumer() if the connection is in asynchronous mode. What are other methods I cannot call? The documentation doesn't categorise the methods in this way:
Also, the documentation doesn't say clearly what makes a session asynchronous. It says this:
"A session is not made asynchronous by assigning a message listener to a consumer. A session becomes asynchronous only when the Connection.Start method is called."
I see two problems here:
Calling c.start() makes all sessions asynchronous, not just one.
If I call c.start() but don't assign any message listener to a consumer, are the session(s) still asynchronous?
It seems I've lots of questions, so it'd be great if anyone could provide me with links to the parts or sections of the documentation which explains XMS objects with such minute details.
This says,
"According to the specification, calling stop(), close() on a Connection, setMessageListener() on a Session etc. must wait till all message processing finishes, that is till all onMessage() calls which have already been entered exit. So if anyone attempts to do that operation inside onMessage() there will be a deadlock by design."
But I'm not sure if that information is authentic, as I didn't find this info on IBM documentation.
I prefer the KISS rule. Why don't you use 1 connection per thread? Then the code would not have to worry about conflicts between threads.

What is correct way to close persistent connection?

My case is a long-running server with a connection to Redis. The server waits for a SIGTERM signal to terminate. What is the right way to guarantee that the connection is released when my application terminates?
I know about defer, which is really great, but it does not seem to fit a persistent connection, because I do not want to open a connection to Redis for each operation.
Thanks!
You would still use defer if you want to ensure some block of code executes before exit. The difference is in its scope. The scope of your connection and defer statement should be the same. I have no idea what your app is, but to give a concrete example: you need to defer the connection close in the main of your command-line app, not in the methods that read and write.
You said "because I do not want to open connection to Redis for each operation", but that only makes defer problematic if you defer the close in the scope of some method that does a single IO operation. If you instead do the defer in the scope above a single operation (where all operations occur), then it will do what you want:
init connection
defer connectionClose
begin execution of code that does db IO
block here if above is async
program is exiting, my defer is called here
EDIT: As pointed out in the comments, the execution of deferred statements is not guaranteed. I just want to make it clear that you can defer the connection closing at the top level of the application.

Why is this Go code blocking?

I wrote the following program:
package main

import (
	"fmt"
)

func processevents(list chan func()) {
	for {
		//a := <-list
		//a()
	}
}

func test() {
	fmt.Println("Ho!")
}

func main() {
	eventlist := make(chan func(), 100)
	go processevents(eventlist)
	for {
		eventlist <- test
		fmt.Println("Hey!")
	}
}
Since the channel eventlist is a buffered channel, I think I should get the output "Hey!" exactly 100 times, but it is displayed only once. Where is my mistake?
Update (Go version 1.2+)
As of Go 1.2, the scheduler is partially pre-emptive: it can be invoked on entry to a function.
This means that the problem in the original question (and the solution presented below) is no longer relevant for loops that contain function calls.
From the Go 1.2 release notes
Pre-emption in the scheduler
In prior releases, a goroutine that was looping forever could starve out other goroutines
on the same thread, a serious problem when GOMAXPROCS provided only one user thread.
In Go 1.2, this is partially addressed: The scheduler is invoked occasionally upon
entry to a function. This means that any loop that includes a (non-inlined) function
call can be pre-empted, allowing other goroutines to run on the same thread.
Short answer
It is not blocking on the writes. It is stuck in the infinite loop of processevents.
This loop never yields to the scheduler, causing all goroutines to lock indefinitely.
If you comment out the call to processevents, you will get results as expected, right up to the 100th write. The next write blocks and the program crashes with a deadlock error, because nobody reads from the channel.
Another solution is to put a call to runtime.Gosched() in the loop.
Long answer
With Go1.0.2, Go's scheduler works on the principle of Cooperative multitasking.
This means that it allocates CPU time to the various goroutines running within a given OS thread by having these routines interact with the scheduler in certain conditions.
These 'interactions' occur when certain types of code are executed in a goroutine.
In Go's case this involves doing some kind of I/O, syscalls or memory allocation (in certain conditions).
In the case of an empty loop, no such conditions are ever encountered. The scheduler is therefore never allowed to run its scheduling algorithms for as long as that loop is running. This consequently prevents it from allotting CPU time to other goroutines waiting to be run and the result you observed ensues: You effectively created a deadlock that can not be detected or broken out of by the scheduler.
The empty loop is usually never desired in Go and will, in most cases, indicate a bug in the program. If you do need it for whatever reason, you have to manually yield to the scheduler by calling runtime.Gosched() in every iteration.
for {
	runtime.Gosched()
}
Setting GOMAXPROCS to a value > 1 was mentioned as a solution. While this will get rid of the immediate problem you observed, it will effectively move the problem to a different OS thread, if the scheduler decides to move the looping goroutine to its own OS thread that is. There is no guarantee of this, unless you call runtime.LockOSThread() at the start of the processevents function. Even then, I would still not rely on this approach to be a good solution. Simply calling runtime.Gosched() in the loop itself, will solve all the issues, regardless of which OS thread the goroutine is running in.
Here is another solution - use range to read from the channel. This code will yield to the scheduler correctly and also terminate properly when the channel is closed.
func processevents(list chan func()) {
	for a := range list {
		a()
	}
}
Good news: since Go 1.2 (December 2013), the original program works as expected.
You may try it on Playground.
This is explained in the Go 1.2 release notes, section "Pre-emption in the scheduler" :
In prior releases, a goroutine that was looping forever could starve
out other goroutines on the same thread, a serious problem when
GOMAXPROCS provided only one user thread. In Go 1.2, this is partially
addressed: The scheduler is invoked occasionally upon entry to a
function.

Resources