I have a requirements where I need to do multiple things (irrelevant here) at some regular intervals. I achieved it using the code block mentioned below -
func (processor *Processor) process() {
defaultTicker := time.NewTicker(time.Second*2)
updateTicker := time.NewTicker(time.Second*5)
heartbeatTicker := time.NewTicker(time.Second*5)
timeoutTicker := time.NewTicker(30*time.Second)
refreshTicker := time.NewTicker(2*time.Minute)
defer func() {
logger.Info("processor for ", processor.id, " exited")
defaultTicker.Stop()
timeoutTicker.Stop()
updateTicker.Stop()
refreshTicker.Stop()
heartbeatTicker.Stop()
}()
for {
select {
case <-defaultTicker.C:
// spawn some go routines
case <-updateTicker.C:
// do something
case <-timeoutTicker.C:
// do something else
case <-refreshTicker.C:
// log
case <-heartbeatTicker.C:
// push metrics to redis
}
}
}
But I noticed that every once in a while, my for select loop gets stuck somewhere and I cannot seem to find where or why. By stuck I mean I stop receiving refresh ticker logs. But it starts working again normally in some time (5-10 mins)
I have made sure that all operations within each ticker completes within very little amount of time (~0ms, checked by putting logs).
My questions:
Is using multiple tickers in single select a good/normal practice (honestly I did not find many examples using multiple tickers online)
Anyone aware of any known issues/pitfalls where tickers can block the loop for longer duration.
Any help is appreciated. Thanks
Go does not provide any smart draining behavior for multiple channels, e.g., that older messages in one channel would get processed earlier than more recent messages in other channels. Anytime the loop enters the select statement a random channel is chosen.
Also see this answer and read the part about GOMAXPROCS=1. This could be related to your issue. The issue could also be in your logging package. Maybe the logs are just delayed.
In general, I think the issue must be in your case statements. Either you have a blocking function or some dysfunctional code. (Note: confirmed by the OP)
But to answer your questions:
1. Is using multiple tickers in single select a good/normal practice?
It is common to read from multiple channels randomly in a blocking way, one message at a time, e.g., to sort incoming data from multiple channels into a slice or map and avoid concurrent data access.
It is also common to add one or more tickers, e.g., to flush data and for logging or reporting. Usually the non-ticker code paths will do most of the work.
In your case, you use tickers that will run code paths that should block each other, which is a very specific use case but may be required in some scenarios. This uncommon, but not bad practice I think.
As the commenters suggested, you could also schedule different recurring tasks in separate goroutines.
2. Is anyone aware of any known issues/pitfalls where tickers can block the loop for longer duration?
The tickers themselves will not block the loop in any hidden way. The fastest ticker will always ensure the loop is looping at the speed of this ticker at least.
Note that the docs of time.NewTicker say:
The ticker will adjust the time interval or drop ticks to make up for
slow receivers
This just means, internally no new ticks are scheduled until you have consumed the last one from the single-element ticker channel.
In your example, the main pitfall is that any code in the case statements will block and thus delay the other cases.
If this is intended, everything is fine.
There may be other pitfalls if you have Microsecond or Nanosecond tickers where you may see some measurable runtime overhead or if you have hundreds of tickers and case blocks. But then you should have chose another scheduling pattern from the beginning.
Related
So I a attempting to shut down a long running function if something takes too long, maybe is just a solution to treating the symptoms rather than cause, but in any case for my situation it didn't really worked out.
I did it like this:
func foo(abort <- chan struct{}) {
for {
select{
case <-abort:
return
default:
///long running code
}
}
}
And in separate function I have which after some time closes the passed chain, which it does, if I cut the body returns the function. However if there is some long running code, it does not affect the outcome it simply continues the work as if nothing has happened.
I am pretty new to GO, but it feels like it should work, but it does not. Is there anything I am missing. After all routers frameworks have timeout function, after which whatever is running is terminated. So maybe this is just out of curiosity, but I would really want how to od it.
your code only checks whether the channel was closed once per iteration, before executing the long running code. There's no opportunity to check the abort chan after the long running code starts, so it will run to completion.
You need to occasionally check whether to exit early in the body of the long running code, and this is more idiomatically accomplished using context.Context and WithTimeout for example: https://pkg.go.dev/context#example-WithTimeout
In your "long running code" you have to periodically check that abort channel.
The usual approach to implement that "periodically" is to split the code into chunks each of which completes in a reasonably short time frame (given that the system the process runs on is not overloaded).
After executing each such chunk you check whether the termination condition holds and then terminate execution if it is.
The idiomatic approach to perform such a check is "select with default":
select {
case <-channel:
// terminate processing
default:
}
Here, the default no-op branch is immediately taken if channel is not ready to be received from (or closed).
Some alogrithms make such chunking easier because they employ a loop where each iteration takes roughly the same time to execute.
If your algorithm is not like this, you'd have to chunk it manually; in this case, it's best to create a separate function (or a method) for each chunk.
Further points.
Consider using contexts: they provide a useful framework to solve the style of problems like the one you're solving.
What's better, the fact they can "inherit" one another allow one to easily implement two neat things:
You can combine various ways to cancel contexts: say, it's possible to create a context which is cancelled either when some timeout passes or explicitly by some other code.
They make it possible to create "cancellation trees" — when cancelling the root context propagates this signal to all the inheriting contexts — making them cancel what other goroutines are doing.
Sometimes, when people say "long-running code" they do not mean code actually crunching numbers on a CPU all that time, but rather the code which performs requests to slow entities — such as databases, HTTP servers etc, — in which case the code is not actually running but sleeping on the I/O to deliver some data to be processed.
If this is your case, note that all well-written Go packages (of course, this includes all the packages of the Go standard library which deal with networked services) accept contexts in those functions of their APIs which actually make calls to such slow entities, and this means that if you make your function to accept a context, you can (actually should) pass this context down the stack of calls where applicable — so that all the code you call can be cancelled in the same way as yours.
Further reading:
https://go.dev/blog/pipelines
https://blog.golang.org/advanced-go-concurrency-patterns
I am learning Go by example. I've just implemented a select to await multiple channels as follows:
for i := 0; i < 2; i++ {
select {
case msg1 := <-c1:
fmt.Println("received", msg1)
case msg2 := <-c2:
fmt.Println("received", msg2)
}
}
With a little experimentation I've found that I can naively introduce runtime errors as follows:
If I reduce i to 1, the first message is received, but the second is silently lost (there is no indication that I unwittingly ignored it).
If I increase i to 3, both messages are received but I get fatal error: all goroutines are asleep - deadlock!
Reading ahead and searching for that error message on StackOverflow I can see that WaitGroups account for these types of issues. But they don't seem to apply to select, so I feel like I must be missing something.
Is there a language construct (like if/then/else) or software pattern that I can use to prevent or mitigate against these errors in real-world code?
Conceptually you mitigate against this by designing software correctly. If you have two channels, and each channel will receive at most one message, don't try to read from them 3 times. This is no different than trying to put three items in a two element array, or trying to divide two numbers where the divisor is 0. In all these cases languages offer ways of discovering and recovering from the error, but if you're actually producing these errors, it indicates a logic or design flaw.
You need to make sure that your channels have a balanced number of reads and writes, and that the sending end closes the channel when it has nothing else to send so receivers can stop waiting for messages that won't come. Otherwise you'll eventually have something stuck waiting, or messages in a buffer that are ignored.
In this very specific case, if you want to read from both channels but only if a message is ready, you can add a default case which will be invoked if no channel is ready for reading, but that's for situations where your channels are not ready yet but will eventually become ready. Providing a default is not a good solution to cover over bugs where channels will never become ready yet you're still trying to read from them; that indicates a logic-level flaw that needs to be fixed.
I need several functions to have the same channel as a parameter and take the same data, simultaneously.
Each of these functions has an independent task from each other, but they start from the same value.
For example, given a slice of integers, one function calculates the sum of its values and another calculates the average, at the same time. They would be goroutines.
One solution would be to create multiple channels from one value, but I want to avoid that. I might have to add or remove functions and for this, I would have to add or remove channels.
I think I understand that the Fan Out pattern could be an option, but I can't quite understand its implementation.
The question is against the rules of SO—as it does not present any concrete problem to be helped with but rather requests a tutoring session.
Anyway, two pointers for further research: basically—given the property of channel that each receive consumes a value sent to it, so it's impossible to read a once sent value multiple times,—such problems have two approaches to their solutions.
The first approach, which is what called a "fan-out", is to have all the consumers have a "personal" dedicated channel, copy the value to be broadcast as many times as there are consumers and send each copy to each of those dedicated channels.
The ostensibly most natural way to implement this is to have a single channel to which the producer sends its units of work—not caring how much consumers are to read them—and then have a dedicated goroutine receive those units of work, copy each of them and send the copies out to the dedicated channels of the consumers.
The second approach is to go lower level and implement basically the same scheme using stuff from the sync package.
One can think of the following scheme:
Have a custom struct type which has a sync.Mutex protecting the type's state.
Have a field which keeps the value multiple consumers have to read.
Have a counter in that type.
Have a sync.Cond in that type as well.
Have a channel with capacity there 1 as well.
Communicating a new value to the consumers looks like this:
Lock the mutex.
Verify the counter is 0, panic otherwise.
Write the new value into the respective field.
Set the counter to the number of consumers.
Unlock the mutex.
Pulse the sync.Cond.
The consumers are supposed to sleep in a wait call on that sync.Cond.
Once the sender pulses it, the goroutines running the code of consumers get woken up and try to read the value.
Reading of the value rolls like this:
Lock the mutex.
Verify the counter is greater than zero, panic otherwise.
Read the value.
Decrement the counter by one.
If the counter becomes 0, send on that special channel.
Unlock the mutex.
The channel is needed to communicate to the sender that all the consumers are done with their reads: before attempting to send the new value the consumer has to read from that channel.
As you can probably see, the second approach is way more involved and hard to get right, so I'd recommend to go with the first one.
I would also note that you seem to lack certain background knowledge on how to go around implementing concurrently running and communicating tasks.
I hereby recommend reading The Book and at least these chapters of The Blog:
Go Concurrency Patterns: Pipelines and cancellation.
Go Concurrency Patterns: Timing out, moving on
Advanced Go Concurrency Patterns
I have some question regarding difference between plain go func and for loop in go func:
Plain go Func:
func asyncTask(){
//...something
}
in order to trigger asyncTask, we can simply:
func main(){
go asyncTask()
}
make a for loop to monitor channel:
func (c *Container) asyncTask(){
go func(){
for {
select {
case <- c.someChan:
//...do something
case <-c.ctx.Done():
//...prevent leaking
}
}
}()
}
to trigger:
func (c *Container) trigger(){
c.someChan <- val
}
My questions are:
I understand second scenario most fit the case when we wish to manage async task in a queue.
But speaking for performance out of frequently triggered async task (which cannot be block), which method is better?
Is there any best practice in general to handle async task in GoLang?
In nearly any case, performance is not the thing to think about in choosing which pattern to use (both will be fine), but which usage makes sense in your specific use case. If you use pattern (1), then you are not guaranteed the sequential processing of (2). That is the essential difference between your two examples. So for an http server for example, you would use the former pattern (go handleRequest(r HttpRequest) say) to process requests in parallel, but use the latter to ensure that certain operations are processed sequentially. I hope this is answering your question!
You can use model #1 with WaitGroups when you have goroutines for which you need to account for and are bothered only about their exit and as such otherwise don't need to manage etc.
You can use model #2 when you need explicit management / control / communication. Channel communication is NOT free - sending and receiving routines need synchronization/channels need locking when values are sent, lot of things will have to happen under the hood.
Unless the need be, definitely option #1 is the way to go. See what's the simplest possible solution for your problem - I know it's easy to preach, but simplicity may take some time to come by.
In short, from that what i know, 2 pattern you mentioned above is not something to really compare which one to use or which one is better. Both of them just have different use case with different necessity.
From what i know, it is not about
plain go func and for loop in go func
It is more to different usage.
Before answering your question, i like to try give short explanation about two pattern you mentioned.
The first pattern is a very basic go statement usage. Which just will execute function outside its main thread. As basic usage of concurrency in go, this pattern design doesn't have a way to get data from executed function with go statement. Can't be from main() thread or any other function. In order to communicate with any other function / thread its needs channel. You already mention one pattern form several go with channel pattern available.
Just like what i mentioned earlier, this second pattern is just one of several go with channel pattern in Golang in usage with go statement. Actually this one is quite complex pattern which main usage is for selecting from multiple channels and will do further things with those channels. I will give some slight explanation about this pattern as folow:
The for loop there has no conditional statement which will work similarly like while loop at any other language like C or Java. It is mean an endless loop.
Since it is endless loop, it is need a condition which usually check from the available channels to check. For example, something like when a channel is closed it will be end.
Regarding select and case statement, if two or more communication cases happen to be ready at the same time, one will be selected at random
Even you need to communicate between concurrent/asynchronous functions running, i guess you not need it in general. Generally there is more simple pattern to communicate the threads by using channel.
In summary to answer your questions:
Which method is better to do asynchronous task is really depend on your necessity. There are several pattern which not limited to you have mentioned above. If you need just do execute function asynchronously first pattern will be fine otherwise you need one from channel pattern way available. But again, not limited to 2nd pattern you mentioned above
Both pattern you mentioned looks as common practices for me. But i guess usually we often need at least a channel in order to communicate an asynchronous task with main() thread or any other thread. And the pattern it self really depend on how you will communicate (send/receive) the data/values sources (Database, slices variables etc.) and more other aspect. I suggest you learn more about the usage of channel there are lot patterns to do with that. I suggest to check this first https://gobyexample.com/goroutines. Start from there you see at the bottom of page the "Next Example" which will getting deeper about go concurrency things.
As addition:
go statement is simple, the complex things is about the usage with channel. Here is i make list you better to learn in order to have better understanding about concurrency communication.
goroutine
Channel direction ( Send / Receive / unidirectional )
Channel concept / behavior which is communicating sequential
processes (CSP) . It is some kind about "block" and "proceed" behavior of send/receive behavior.
Buffered channel
Unbuffered channel
And more about channel :)
Hope this helps you or some one to start with goroutine and channel to works with concurrency in Golang. Please feel free if some one like to give corrections to my answer or ask further explanation about it. Thank you.
My Golang code gets different records from the database using goroutines, and increments the value in a determinated field in the record.
I can avoid the race condition If I use Mutex or Channels, but I have a bottleneck because every access to the database waits until the previous access is done.
I think I should do something like one Mutex for every different record, instead one only Mutex for all.
How could I do it?
Thanks for any reply.
In the comments you said you are using Couchbase. If the record you wish to update consists of only an integer, you can use the built in atomic increment functionality with Bucket.Incr.
If the value is part of a larger document, you can use the database's "Check and Set" functionality. In essence, you want to create a loop that does the following:
Retrieve the record to be updated along with its CAS value using Bucket.Gets
Modify the document returned by (1) as needed.
Store the modified document using Bucket.Cas, passing the CAS value retrieved in (1).
If (4) succeeds, break out of the loop. Otherwise, start again at (1).
Note that it is important to retrieve the document fresh each time in the loop. If the update fails due to an incorrect CAS value, it means the document was updated between the read and the write.
If the database has a way of handling that (i.e. an atomic counter built in) you should use that.
If not, or if you want to do this in go, you can use buffered channels. Inserting to a buffered channel is not blocking unless the buffer is full.
then, to handle the increments one at a time, in a goroutine you could have something like
for{
select{
value, _ := <-CounterChan
incrementCounterBy(value)
}
}