Difference between plain go func and for loop in go func - go

I have some question regarding difference between plain go func and for loop in go func:
Plain go Func:
func asyncTask(){
//...something
}
in order to trigger asyncTask, we can simply:
func main(){
go asyncTask()
}
make a for loop to monitor channel:
func (c *Container) asyncTask(){
go func(){
for {
select {
case <- c.someChan:
//...do something
case <-c.ctx.Done():
//...prevent leaking
}
}
}()
}
to trigger:
func (c *Container) trigger(){
c.someChan <- val
}
My questions are:
I understand second scenario most fit the case when we wish to manage async task in a queue.
But speaking for performance out of frequently triggered async task (which cannot be block), which method is better?
Is there any best practice in general to handle async task in GoLang?

In nearly any case, performance is not the thing to think about in choosing which pattern to use (both will be fine), but which usage makes sense in your specific use case. If you use pattern (1), then you are not guaranteed the sequential processing of (2). That is the essential difference between your two examples. So for an http server for example, you would use the former pattern (go handleRequest(r HttpRequest) say) to process requests in parallel, but use the latter to ensure that certain operations are processed sequentially. I hope this is answering your question!

You can use model #1 with WaitGroups when you have goroutines for which you need to account for and are bothered only about their exit and as such otherwise don't need to manage etc.
You can use model #2 when you need explicit management / control / communication. Channel communication is NOT free - sending and receiving routines need synchronization/channels need locking when values are sent, lot of things will have to happen under the hood.
Unless the need be, definitely option #1 is the way to go. See what's the simplest possible solution for your problem - I know it's easy to preach, but simplicity may take some time to come by.

In short, from that what i know, 2 pattern you mentioned above is not something to really compare which one to use or which one is better. Both of them just have different use case with different necessity.
From what i know, it is not about
plain go func and for loop in go func
It is more to different usage.
Before answering your question, i like to try give short explanation about two pattern you mentioned.
The first pattern is a very basic go statement usage. Which just will execute function outside its main thread. As basic usage of concurrency in go, this pattern design doesn't have a way to get data from executed function with go statement. Can't be from main() thread or any other function. In order to communicate with any other function / thread its needs channel. You already mention one pattern form several go with channel pattern available.
Just like what i mentioned earlier, this second pattern is just one of several go with channel pattern in Golang in usage with go statement. Actually this one is quite complex pattern which main usage is for selecting from multiple channels and will do further things with those channels. I will give some slight explanation about this pattern as folow:
The for loop there has no conditional statement which will work similarly like while loop at any other language like C or Java. It is mean an endless loop.
Since it is endless loop, it is need a condition which usually check from the available channels to check. For example, something like when a channel is closed it will be end.
Regarding select and case statement, if two or more communication cases happen to be ready at the same time, one will be selected at random
Even you need to communicate between concurrent/asynchronous functions running, i guess you not need it in general. Generally there is more simple pattern to communicate the threads by using channel.
In summary to answer your questions:
Which method is better to do asynchronous task is really depend on your necessity. There are several pattern which not limited to you have mentioned above. If you need just do execute function asynchronously first pattern will be fine otherwise you need one from channel pattern way available. But again, not limited to 2nd pattern you mentioned above
Both pattern you mentioned looks as common practices for me. But i guess usually we often need at least a channel in order to communicate an asynchronous task with main() thread or any other thread. And the pattern it self really depend on how you will communicate (send/receive) the data/values sources (Database, slices variables etc.) and more other aspect. I suggest you learn more about the usage of channel there are lot patterns to do with that. I suggest to check this first https://gobyexample.com/goroutines. Start from there you see at the bottom of page the "Next Example" which will getting deeper about go concurrency things.
As addition:
go statement is simple, the complex things is about the usage with channel. Here is i make list you better to learn in order to have better understanding about concurrency communication.
goroutine
Channel direction ( Send / Receive / unidirectional )
Channel concept / behavior which is communicating sequential
processes (CSP) . It is some kind about "block" and "proceed" behavior of send/receive behavior.
Buffered channel
Unbuffered channel
And more about channel :)
Hope this helps you or some one to start with goroutine and channel to works with concurrency in Golang. Please feel free if some one like to give corrections to my answer or ask further explanation about it. Thank you.

Related

How to terminate long running function after a timeout

So I a attempting to shut down a long running function if something takes too long, maybe is just a solution to treating the symptoms rather than cause, but in any case for my situation it didn't really worked out.
I did it like this:
func foo(abort <- chan struct{}) {
for {
select{
case <-abort:
return
default:
///long running code
}
}
}
And in separate function I have which after some time closes the passed chain, which it does, if I cut the body returns the function. However if there is some long running code, it does not affect the outcome it simply continues the work as if nothing has happened.
I am pretty new to GO, but it feels like it should work, but it does not. Is there anything I am missing. After all routers frameworks have timeout function, after which whatever is running is terminated. So maybe this is just out of curiosity, but I would really want how to od it.
your code only checks whether the channel was closed once per iteration, before executing the long running code. There's no opportunity to check the abort chan after the long running code starts, so it will run to completion.
You need to occasionally check whether to exit early in the body of the long running code, and this is more idiomatically accomplished using context.Context and WithTimeout for example: https://pkg.go.dev/context#example-WithTimeout
In your "long running code" you have to periodically check that abort channel.
The usual approach to implement that "periodically" is to split the code into chunks each of which completes in a reasonably short time frame (given that the system the process runs on is not overloaded).
After executing each such chunk you check whether the termination condition holds and then terminate execution if it is.
The idiomatic approach to perform such a check is "select with default":
select {
case <-channel:
// terminate processing
default:
}
Here, the default no-op branch is immediately taken if channel is not ready to be received from (or closed).
Some alogrithms make such chunking easier because they employ a loop where each iteration takes roughly the same time to execute.
If your algorithm is not like this, you'd have to chunk it manually; in this case, it's best to create a separate function (or a method) for each chunk.
Further points.
Consider using contexts: they provide a useful framework to solve the style of problems like the one you're solving.
What's better, the fact they can "inherit" one another allow one to easily implement two neat things:
You can combine various ways to cancel contexts: say, it's possible to create a context which is cancelled either when some timeout passes or explicitly by some other code.
They make it possible to create "cancellation trees" — when cancelling the root context propagates this signal to all the inheriting contexts — making them cancel what other goroutines are doing.
Sometimes, when people say "long-running code" they do not mean code actually crunching numbers on a CPU all that time, but rather the code which performs requests to slow entities — such as databases, HTTP servers etc, — in which case the code is not actually running but sleeping on the I/O to deliver some data to be processed.
If this is your case, note that all well-written Go packages (of course, this includes all the packages of the Go standard library which deal with networked services) accept contexts in those functions of their APIs which actually make calls to such slow entities, and this means that if you make your function to accept a context, you can (actually should) pass this context down the stack of calls where applicable — so that all the code you call can be cancelled in the same way as yours.
Further reading:
https://go.dev/blog/pipelines
https://blog.golang.org/advanced-go-concurrency-patterns

Golang: using multiple tickers cases in single select blocks entire loop

I have a requirements where I need to do multiple things (irrelevant here) at some regular intervals. I achieved it using the code block mentioned below -
func (processor *Processor) process() {
defaultTicker := time.NewTicker(time.Second*2)
updateTicker := time.NewTicker(time.Second*5)
heartbeatTicker := time.NewTicker(time.Second*5)
timeoutTicker := time.NewTicker(30*time.Second)
refreshTicker := time.NewTicker(2*time.Minute)
defer func() {
logger.Info("processor for ", processor.id, " exited")
defaultTicker.Stop()
timeoutTicker.Stop()
updateTicker.Stop()
refreshTicker.Stop()
heartbeatTicker.Stop()
}()
for {
select {
case <-defaultTicker.C:
// spawn some go routines
case <-updateTicker.C:
// do something
case <-timeoutTicker.C:
// do something else
case <-refreshTicker.C:
// log
case <-heartbeatTicker.C:
// push metrics to redis
}
}
}
But I noticed that every once in a while, my for select loop gets stuck somewhere and I cannot seem to find where or why. By stuck I mean I stop receiving refresh ticker logs. But it starts working again normally in some time (5-10 mins)
I have made sure that all operations within each ticker completes within very little amount of time (~0ms, checked by putting logs).
My questions:
Is using multiple tickers in single select a good/normal practice (honestly I did not find many examples using multiple tickers online)
Anyone aware of any known issues/pitfalls where tickers can block the loop for longer duration.
Any help is appreciated. Thanks
Go does not provide any smart draining behavior for multiple channels, e.g., that older messages in one channel would get processed earlier than more recent messages in other channels. Anytime the loop enters the select statement a random channel is chosen.
Also see this answer and read the part about GOMAXPROCS=1. This could be related to your issue. The issue could also be in your logging package. Maybe the logs are just delayed.
In general, I think the issue must be in your case statements. Either you have a blocking function or some dysfunctional code. (Note: confirmed by the OP)
But to answer your questions:
1. Is using multiple tickers in single select a good/normal practice?
It is common to read from multiple channels randomly in a blocking way, one message at a time, e.g., to sort incoming data from multiple channels into a slice or map and avoid concurrent data access.
It is also common to add one or more tickers, e.g., to flush data and for logging or reporting. Usually the non-ticker code paths will do most of the work.
In your case, you use tickers that will run code paths that should block each other, which is a very specific use case but may be required in some scenarios. This uncommon, but not bad practice I think.
As the commenters suggested, you could also schedule different recurring tasks in separate goroutines.
2. Is anyone aware of any known issues/pitfalls where tickers can block the loop for longer duration?
The tickers themselves will not block the loop in any hidden way. The fastest ticker will always ensure the loop is looping at the speed of this ticker at least.
Note that the docs of time.NewTicker say:
The ticker will adjust the time interval or drop ticks to make up for
slow receivers
This just means, internally no new ticks are scheduled until you have consumed the last one from the single-element ticker channel.
In your example, the main pitfall is that any code in the case statements will block and thus delay the other cases.
If this is intended, everything is fine.
There may be other pitfalls if you have Microsecond or Nanosecond tickers where you may see some measurable runtime overhead or if you have hundreds of tickers and case blocks. But then you should have chose another scheduling pattern from the beginning.

How to implement a channel and multiple readers that read the same data at the same time?

I need several functions to have the same channel as a parameter and take the same data, simultaneously.
Each of these functions has an independent task from each other, but they start from the same value.
For example, given a slice of integers, one function calculates the sum of its values ​​and another calculates the average, at the same time. They would be goroutines.
One solution would be to create multiple channels from one value, but I want to avoid that. I might have to add or remove functions and for this, I would have to add or remove channels.
I think I understand that the Fan Out pattern could be an option, but I can't quite understand its implementation.
The question is against the rules of SO—as it does not present any concrete problem to be helped with but rather requests a tutoring session.
Anyway, two pointers for further research: basically—given the property of channel that each receive consumes a value sent to it, so it's impossible to read a once sent value multiple times,—such problems have two approaches to their solutions.
The first approach, which is what called a "fan-out", is to have all the consumers have a "personal" dedicated channel, copy the value to be broadcast as many times as there are consumers and send each copy to each of those dedicated channels.
The ostensibly most natural way to implement this is to have a single channel to which the producer sends its units of work—not caring how much consumers are to read them—and then have a dedicated goroutine receive those units of work, copy each of them and send the copies out to the dedicated channels of the consumers.
The second approach is to go lower level and implement basically the same scheme using stuff from the sync package.
One can think of the following scheme:
Have a custom struct type which has a sync.Mutex protecting the type's state.
Have a field which keeps the value multiple consumers have to read.
Have a counter in that type.
Have a sync.Cond in that type as well.
Have a channel with capacity there 1 as well.
Communicating a new value to the consumers looks like this:
Lock the mutex.
Verify the counter is 0, panic otherwise.
Write the new value into the respective field.
Set the counter to the number of consumers.
Unlock the mutex.
Pulse the sync.Cond.
The consumers are supposed to sleep in a wait call on that sync.Cond.
Once the sender pulses it, the goroutines running the code of consumers get woken up and try to read the value.
Reading of the value rolls like this:
Lock the mutex.
Verify the counter is greater than zero, panic otherwise.
Read the value.
Decrement the counter by one.
If the counter becomes 0, send on that special channel.
Unlock the mutex.
The channel is needed to communicate to the sender that all the consumers are done with their reads: before attempting to send the new value the consumer has to read from that channel.
As you can probably see, the second approach is way more involved and hard to get right, so I'd recommend to go with the first one.
I would also note that you seem to lack certain background knowledge on how to go around implementing concurrently running and communicating tasks.
I hereby recommend reading The Book and at least these chapters of The Blog:
Go Concurrency Patterns: Pipelines and cancellation.
Go Concurrency Patterns: Timing out, moving on
Advanced Go Concurrency Patterns

Run async function in specific thread

I would like to run specific long-running functions (which execute database queries) on a separate thread. However, let's assume that the underlying database engine only allows one connection at a time and the connection struct isn't Sync (I think at least the latter is true for diesel).
My solution would be to have a single separate thread (as opposed to a thread pool) where all the database-work happens and which runs as long as the main thread is alive.
I think I know how I would have to do this with passing messages over channels, but that requires quite some boilerplate code (e.g. explicitly sending the function arguments over the channel etc.).
Is there a more direct way of achieving something like this with rust (and possibly tokio and the new async/await notation that is in nightly)?
I'm hoping to do something along the lines of:
let handle = spawn_thread_with_runtime(...);
let future = run_on_thread!(handle, query_function, argument1, argument2);
where query_function would be a function that immediately returns a future and does the work on the other thread.
Rust nightly and external crates / macros would be ok.
If external crates are an option, I'd consider taking a look at actix, an Actor Framework for Rust.
This will let you spawn an Actor in a separate thread that effectively owns the connection to the DB. It can then listen for messages, execute work/queries based on those messages, and return either sync results or futures.
It takes care of most of the boilerplate for message passing, spawning, etc. at a higher level.
There's also a Diesel example in the actix documentation, which sounds quite close to the use case you had in mind.

Idiomatic Golang goroutines

In Go, if we have a type with a method that starts some looped mechanism (polling A and doing B forever) is it best to express this as:
// Run does stuff, you probably want to run this as a goroutine
func (t Type) Run() {
// Do long-running stuff
}
and document that this probably wants to be launched as a goroutine (and let the caller deal with that)
Or to hide this from the caller:
// Run does stuff concurrently
func (t Type) Run() {
go DoRunStuff()
}
I'm new to Go and unsure if convention says let the caller prefix with 'go' or do it for them when the code is designed to run async.
My current view is that we should document and give the caller a choice. My thinking is that in Go the concurrency isn't actually part of the exposed interface, but a property of using it. Is this right?
I had your opinion on this until I started writing an adapter for a web service that I want to make concurrent. I have a go routine that must be started to parse results that are returned to the channel from the web calls. There is absolutely no case in which this API would work without using it as a go routine.
I then began to look at packages like net/http. There is mandatory concurrency within that package. It is documented at the interface level that it should be able to be used concurrently, however the default implementations automatically use go routines.
Because Go's standard library commonly fires of go routines within its own packages, I think that if your package or API warrants it, you can handle them on your own.
My current view is that we should document and give the caller a choice.
I tend to agree with you.
Since Go makes it so easy to run code concurrently, you should try to avoid concurrency in your API (which forces clients to use it concurrently). Instead, create a synchronous API, and then clients have the option to run it synchronously or concurrently.
This was discussed in a talk a couple years ago: Twelve Go Best Practices
Slide 26, in particular, shows code more like your first example.
I would view the net/http package as an exception because in this case, the concurrency is almost mandatory. If the package didn't use concurrency internally, the client code would almost certainly have to. For example, http.Client doesn't (to my knowledge) start any goroutines. It is only the server that does so.
In most cases, it's going to be one line of the code for the caller either way:
go Run() or StartGoroutine()
The synchronous API is no harder to use concurrently and gives the caller more options.
There is no 'right' answer because circumstances differ.
Obviously there are cases where an API might contain utilities, simple algorithms, data collections etc that would look odd if packaged up as goroutines.
Conversely, there are cases where it is natural to expect 'under-the-hood' concurrency, such as a rich IO library (http server being the obvious example).
For a more extreme case, consider you were to produce a library of plug-n-play concurrent services. Such an API consists of modules each having a well-described interface via channels. Clearly, in this case it would inevitably involve goroutines starting as part of the API.
One clue might well be the presence or absence of channels in the function parameters. But I would expect clear documentation of what to expect either way.

Resources