I need your help. I'm trying to create a worker pool that reads jobs continuously from a channel and writes each result back to the same channel it reads from, so that the result is picked up as a new job, and so on. You get the idea: something like recursion. Is there any way to make this possible?
I would really appreciate your advice on design patterns for implementing this: goroutines as a worker pool, a channel to read jobs from, and the same worker pool writing each job's result back to that channel to keep working. Thank you.
There is no reason why something can't (from Go's perspective) write back to a channel after reading it:
func Foo(c chan int) {
	x := <-c
	// do something to x
	c <- x
}
This is weird, though, and honestly I would not recommend it. Normally I see systems composed of several channels, with data passed downstream and never entering a cycle. Think trees instead of graphs.
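If you do want the feedback loop, one way to keep it from deadlocking is to bound the number of items in flight. The following is a minimal sketch, not a definitive design. It assumes every job produces at most one follow-up job, so a buffer as large as the initial batch can never fill up. The names process, pending and steps are made up for illustration.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// process starts `workers` goroutines that read ints from a channel.
// A job n > 0 is written back to the same channel as n-1; a job of 0 ends its chain.
// It returns the total number of jobs handled.
func process(initial []int, workers int) int64 {
	// Assumption: one follow-up per job, so at most len(initial) items
	// are ever in flight and a send back to jobs never blocks.
	jobs := make(chan int, len(initial))
	var pending sync.WaitGroup // counts unfinished chains, not single jobs
	var steps int64

	pending.Add(len(initial))
	for _, n := range initial {
		jobs <- n
	}

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for n := range jobs {
				atomic.AddInt64(&steps, 1)
				if n > 0 {
					jobs <- n - 1 // feed the result back as a new job
				} else {
					pending.Done() // this chain is finished
				}
			}
		}()
	}

	pending.Wait() // every chain has reached 0, so nobody will send again
	close(jobs)    // lets the workers leave their range loops
	wg.Wait()
	return atomic.LoadInt64(&steps)
}

func main() {
	fmt.Println(process([]int{3, 0, 2}, 4)) // prints 8
}
```

If a job can fan out into several follow-ups, this bound no longer holds and the workers can all block on the send, which is one reason the acyclic layout above is usually easier to reason about.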
Related
I have a question about the difference between a plain go func and a for loop inside a go func:
Plain go func:
func asyncTask() {
	// ...something
}
In order to trigger asyncTask, we can simply:
func main() {
	go asyncTask()
}
For loop in go func, monitoring a channel:
func (c *Container) asyncTask() {
	go func() {
		for {
			select {
			case <-c.someChan:
				// ...do something
			case <-c.ctx.Done():
				// ...prevent leaking
				return // leave the loop so the goroutine can exit
			}
		}
	}()
}
To trigger:
func (c *Container) trigger() {
	c.someChan <- val
}
My questions are:
I understand that the second scenario fits best when we want to manage async tasks in a queue.
But in terms of performance for a frequently triggered async task (which must not block), which method is better?
Is there a general best practice for handling async tasks in Go?
In nearly any case, performance is not the thing to think about when choosing which pattern to use (both will be fine); what matters is which one makes sense in your specific use case. If you use pattern (1), you do not get the sequential processing that (2) guarantees. That is the essential difference between your two examples. So for an HTTP server, for example, you would use the former pattern (go handleRequest(r HttpRequest), say) to process requests in parallel, but use the latter to ensure that certain operations are processed sequentially. I hope this answers your question!
You can use model #1 with a sync.WaitGroup when you have goroutines that you need to account for but only care about their exit, and otherwise don't need to manage them.
You can use model #2 when you need explicit management, control, or communication. Channel communication is NOT free: the sending and receiving goroutines need to synchronize, and the channel has to be locked when values are sent, so a lot has to happen under the hood.
Unless you need that, option #1 is definitely the way to go. Look for the simplest possible solution to your problem. I know it's easy to preach, but simplicity may take some time to come by.
In short, as far as I know, the two patterns you mentioned are not really alternatives to compare in order to decide which one is better. They have different use cases and serve different needs.
As far as I know, it is not about
plain go func and for loop in go func
It is more about different usage.
Before answering your question, I would like to give a short explanation of the two patterns you mentioned.
The first pattern is the most basic use of the go statement: it runs the function in a new goroutine, concurrently with the caller. Used this way, there is no built-in way to get data back from the function started by the go statement, neither in main() nor in any other function. To communicate with another function or goroutine, you need a channel. The pattern you mention is one of several go-with-channel patterns available.
As I mentioned, the second pattern is just one of several patterns that combine the go statement with channels. It is actually a fairly complex one, whose main use is selecting among multiple channels and then acting on whichever is ready. Here is a brief explanation of this pattern:
The for loop there has no condition, so it behaves like a while (true) loop in other languages such as C or Java. It is an endless loop.
Since it is an endless loop, it needs an exit condition, which is usually driven by one of the channels being watched. For example, the loop ends when a channel is closed.
Regarding the select and case statements: if two or more communication cases are ready at the same time, one is selected at random.
Even if you need to communicate between concurrently running functions, I guess you do not need this pattern in general. There is usually a simpler way for goroutines to communicate over a channel.
In summary, to answer your questions:
Which method is better for an asynchronous task really depends on your needs. There are several patterns, not limited to the two you mentioned. If you just need to execute a function asynchronously, the first pattern will be fine; otherwise you need one of the available channel patterns. But again, you are not limited to the second pattern you mentioned.
Both patterns you mentioned look like common practice to me. But I guess we usually need at least one channel to let an asynchronous task communicate with main() or any other goroutine. The pattern itself really depends on how you send and receive the data, where the values come from (a database, slices, variables, etc.), and other aspects. I suggest you learn more about using channels; there are a lot of patterns built on them. I suggest checking https://gobyexample.com/goroutines first. From there, follow the "Next Example" link at the bottom of each page, which goes deeper into Go concurrency.
In addition:
The go statement is simple; the complexity lies in using it with channels. Here is a list of topics worth learning for a better understanding of communication between goroutines:
goroutine
Channel direction (send / receive / unidirectional)
Channel concept and behavior, which come from communicating sequential processes (CSP): roughly, the "block" and "proceed" behavior of send and receive
Buffered channel
Unbuffered channel
And more about channels :)
Hope this helps you or someone else get started with goroutines and channels for concurrency in Go. Please feel free to correct my answer or ask for further explanation. Thank you.
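As a starting point for the topics above, here is a small sketch of getting a value back from a function started with the go statement. The function names square and squareAsync are invented for this example.

```go
package main

import "fmt"

// square sends n*n on out. The chan<- direction means it can only send.
func square(n int, out chan<- int) {
	out <- n * n
}

// squareAsync starts square in its own goroutine and waits for the result.
func squareAsync(n int) int {
	out := make(chan int) // unbuffered: the send blocks until we receive
	go square(n, out)
	return <-out
}

func main() {
	fmt.Println(squareAsync(7)) // prints 49
}
```

The receive in squareAsync is what makes the caller wait; without it, the program could move on (or exit) before square has run.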
From my understanding of the Go scheduler, Go's scheduling algorithm is partially preemptive: goroutine switches happen when a goroutine calls a function or blocks on I/O.
Does a goroutine switch happen when sending a message to a channel?
// goroutine A
ch <- message
// some additional code without function calls
// goroutine B
message := <- ch
In the code above, I want the code after ch <- message in A to be executed before switching to B. Is this guaranteed, or does B get scheduled right after A sends a message on ch?
A's channel send can block, at which point it yields to the scheduler and you have no guarantee when A will receive control again. It might be after the code you're interested in in B. So the sample code has problems even with GOMAXPROCS=1.
Stepping back: when preemption happens is an implementation detail; it has changed in the past (there wasn't always a chance of preemption on function call) and may change in the future. In terms of the memory model, your program is incorrect if it relies on facts about when code executes that happen to be true today but aren't guaranteed. If you want to block some code in B from running until A does something, you need to figure out a way to arrange that using channels or sync primitives.
And as user JimB notes, you don't even need to consider preemption to run into problems with the sample code. A and B could be running simultaneously on different CPU cores, and the code after the receive in B could run while the code after the send in A is running.
My practical understanding of the language and runtime says that without you blocking explicitly after ch <- message and before invoking goroutine B, you have no guarantees that A will complete or run before B. I don't know how that is actually implemented but I also don't care because I accept the goroutine abstraction at face value. Don't rely on coincidental functionality in your program. Just going off your example, my recommendation would be to pass a channel into goroutine A and then block waiting to receive off it in order to serialize A and B.
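A minimal sketch of that recommendation, assuming an extra channel (called aDone here, a name chosen for this example) that A closes once its post-send work is finished:

```go
package main

import "fmt"

// run starts goroutines A and B and returns the order in which their
// work was recorded.
func run() []string {
	var events []string
	ch := make(chan string)
	aDone := make(chan struct{}) // closed by A after its extra work
	bDone := make(chan struct{}) // closed by B so run can return

	go func() { // goroutine A
		ch <- "message"
		events = append(events, "A: work after send")
		close(aDone)
	}()

	go func() { // goroutine B
		msg := <-ch
		<-aDone // explicit ordering: wait until A says it is finished
		events = append(events, "B: got "+msg)
		close(bDone)
	}()

	<-bDone
	return events
}

func main() {
	fmt.Println(run()) // prints [A: work after send B: got message]
}
```

The order no longer depends on how the scheduler behaves: the close of aDone happens before B's receive from it returns, so A's work is always recorded first.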
I have a pipeline with goroutines connected by channels so that each goroutine will trigger another one until all have run. Put even simpler, imagine two goroutines A and B so that when A is done it should tell B it can run.
It's working fine and I have tried a few variants as I have learnt more about pipelines in Go.
Currently I have a signalling channel
ch := make(chan struct{})
go A(ch)
go B(ch)
...
that B blocks on
func B(ch <-chan struct{}) {
<-ch
...
and A closes when done
func A(ch chan struct{}) {
defer close(ch)
...
}
This works fine, and I also tried sending an empty struct value, struct{}{}, in A() instead of closing.
Is there any difference between closing the channel or sending the empty struct? Is either way cheaper / faster / better?
Naturally, sending any other type in the channel takes up "some" amount of memory, but how is it with the empty struct? Close is just part of the channel so not "sent" as such even if information is passed between goroutines.
I'm well aware of premature optimization. This is only to understand things, not to optimize anything.
Maybe there's an idiomatic Go way to do this even?
Thanks for any clarification on this!
Closing a channel indicates that there will be no more sends on that channel. It's usually preferable, since you would get a panic after that point in the case of an inadvertent send or close (programming error). A close can also signal multiple receivers that there are no more messages, which you can't coordinate as easily by just sending sentinel values.
Naturally, sending any other type in the channel takes up "some" amount of memory, but how is it with the empty struct?
There's no guarantee that it does take any extra memory in an unbuffered channel (it's completely an implementation detail). The send blocks until the receive can proceed.
Close is just part of the channel so not "sent" as such even if information is passed between goroutines.
There's no optimization here; close is simply another type of message that can be sent to a channel.
Each construct has a clear meaning, and you should use the appropriate one.
Send a sentinel value if you need to signal one receiver, and keep the channel open to send more values.
Close the channel if this is the final message, possibly signal multiple receivers, and it would be an error to send or close again.
Multiple goroutines can receive from a closed channel, and none of them will ever block. That is the main advantage: it is a one-to-many pattern.
finish := make(chan struct{}) can be used in a many-to-one pattern, where many concurrent runners report that they are done by sending on it; unlike a second close, repeated sends do not panic.
It's not about memory consumption.
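The one-to-many behavior of close can be sketched as follows. This is only an illustration; wakeAll and its variable names are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// wakeAll starts n goroutines that all block on the same channel, releases
// them with a single close, and returns how many of them were released.
func wakeAll(n int) int {
	ready := make(chan struct{})
	woken := make(chan int, n) // buffered so the reports never block
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			<-ready // every receiver is released by the one close below
			woken <- id
		}(i)
	}
	close(ready)
	wg.Wait()
	return len(woken)
}

func main() {
	fmt.Println(wakeAll(5)) // prints 5

	// A receive from a closed channel returns immediately; the second
	// value reports that the channel is closed.
	done := make(chan struct{})
	close(done)
	_, ok := <-done
	fmt.Println(ok) // prints false
}
```

A single send on ready would have released only one of the waiting goroutines, which is why sending suits the one-receiver case and close suits the broadcast case.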
I am trying to write a queue and I'd need to "grow" my buffered chans, is there a way to do that without having to create a new one and moving the elements to the new one?
It is not possible with standard channels. However by using an intermediate goroutine with a few tricks you can make something that's effectively equivalent. It will, however, be somewhat slower than a native channel. This is implemented as the ResizableChannel in the channels package (disclaimer: I wrote it).
godoc: https://godoc.org/github.com/eapache/channels#ResizableChannel
github: https://github.com/eapache/channels/
Why would you want to grow the chan size? Are you looking to have a chan where you can keep writing regardless of whether there are readers or not?
If so, you should use a goroutine that owns the queue and two chans (a read chan and a write chan). The goroutine keeps a slice internally with all the written items (received via the write chan) and keeps attempting to write to the read chan, which blocks until there are readers reading from it.
Hope this helps.
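One possible shape for such a goroutine is sketched below, assuming int items. The names unboundedQueue and feedThenDrain are made up for this example, and the slice grows without limit, so memory use is only bounded by how fast readers keep up.

```go
package main

import "fmt"

// unboundedQueue owns a slice and moves items from in to out.
// Writers on in are never held up by slow readers on out.
func unboundedQueue(in <-chan int, out chan<- int) {
	var buf []int
	for in != nil || len(buf) > 0 {
		// Only offer a send when there is something to send:
		// a nil channel in a select case is never ready.
		var send chan<- int
		var head int
		if len(buf) > 0 {
			send = out
			head = buf[0]
		}
		select {
		case v, ok := <-in:
			if !ok {
				in = nil // writer closed; keep draining buf
				continue
			}
			buf = append(buf, v)
		case send <- head:
			buf = buf[1:]
		}
	}
	close(out)
}

// feedThenDrain writes n items before anyone reads, then reads them all back.
func feedThenDrain(n int) []int {
	in := make(chan int)
	out := make(chan int)
	go unboundedQueue(in, out)
	for i := 0; i < n; i++ {
		in <- i // accepted even though nobody reads from out yet
	}
	close(in)
	var got []int
	for v := range out {
		got = append(got, v)
	}
	return got
}

func main() {
	got := feedThenDrain(100)
	fmt.Println(len(got), got[0], got[99]) // prints 100 0 99
}
```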
I understand from this question "Golang - What is channel buffer size?" that if the channel is buffered it won't block.
c := make(chan int, 1)
c <- data1 // doesn't block
c <- data2 // blocks until another goroutine receives from the channel
c <- data3
c <- data4
But I don't understand what the use of it is. Suppose I have 2 goroutines: the 1st one will receive data1 and the 2nd one will receive data2, and then the sender will block until one of the goroutines is free to process data3.
I don't understand what difference it makes. It would have executed the same way without a buffer. Can you explain a possible scenario where buffering is useful?
A buffered channel allows the goroutine that is adding data to the buffered channel to keep running and doing things, even if the goroutines reading from the channel are starting to fall behind a little bit.
For example, you might have one goroutine that is receiving HTTP requests and you want it to be as fast as possible. However you also want it to queue up some background job, like sending an email, which could take a while. So the HTTP goroutine just parses the user's request and quickly adds the background job to the buffered channel. The other goroutines will process it when they have time. If you get a sudden surge in HTTP requests, the users will not notice any slowness in the HTTP if your buffer is big enough.
This site has a good explanation:
https://www.openmymind.net/Introduction-To-Go-Buffered-Channels/
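Here is a small sketch of the producer side of that scenario. It makes one extra assumption that is not in the answer above: when the buffer is full, the producer gives up immediately instead of blocking. The name tryEnqueue and the job strings are hypothetical.

```go
package main

import "fmt"

// tryEnqueue adds job to the buffered channel if there is room and reports
// whether it succeeded. It never blocks the caller.
func tryEnqueue(jobs chan<- string, job string) bool {
	select {
	case jobs <- job:
		return true
	default: // buffer full: the caller decides whether to drop or retry
		return false
	}
}

func main() {
	jobs := make(chan string, 2) // room for a burst of two jobs

	// No worker is reading yet, but the first two sends still succeed.
	fmt.Println(tryEnqueue(jobs, "email-1")) // prints true
	fmt.Println(tryEnqueue(jobs, "email-2")) // prints true
	fmt.Println(tryEnqueue(jobs, "email-3")) // prints false

	// Later, a worker drains whatever was queued.
	close(jobs)
	for job := range jobs {
		fmt.Println("processing", job)
	}
}
```

With an unbuffered channel, the first call would already return false here, because no receiver is waiting; the buffer is what absorbs the burst.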