Idiomatic Golang goroutines - go

In Go, if we have a type with a method that starts some looped mechanism (polling A and doing B forever) is it best to express this as:
// Run does stuff, you probably want to run this as a goroutine
func (t Type) Run() {
	// Do long-running stuff
}
and document that this probably wants to be launched as a goroutine (and let the caller deal with that)
Or to hide this from the caller:
// Run does stuff concurrently
func (t Type) Run() {
	go DoRunStuff()
}
I'm new to Go and unsure if convention says let the caller prefix with 'go' or do it for them when the code is designed to run async.
My current view is that we should document and give the caller a choice. My thinking is that in Go the concurrency isn't actually part of the exposed interface, but a property of using it. Is this right?

I had your opinion on this until I started writing an adapter for a web service that I want to make concurrent. I have a goroutine that must be started to parse results that are returned on a channel from the web calls. There is absolutely no case in which this API would work without using it as a goroutine.
I then began to look at packages like net/http. There is mandatory concurrency within that package. It is documented at the interface level that it should be usable concurrently, but the default implementations automatically use goroutines.
Because Go's standard library commonly fires off goroutines within its own packages, I think that if your package or API warrants it, you can handle them on your own.

My current view is that we should document and give the caller a choice.
I tend to agree with you.
Since Go makes it so easy to run code concurrently, you should try to avoid concurrency in your API (which forces clients to use it concurrently). Instead, create a synchronous API, and then clients have the option to run it synchronously or concurrently.
This was discussed in a talk a couple years ago: Twelve Go Best Practices
Slide 26, in particular, shows code more like your first example.
I would view the net/http package as an exception because in this case, the concurrency is almost mandatory. If the package didn't use concurrency internally, the client code would almost certainly have to. For example, http.Client doesn't (to my knowledge) start any goroutines. It is only the server that does so.
In most cases, it's going to be one line of the code for the caller either way:
go Run() or StartGoroutine()
The synchronous API is no harder to use concurrently and gives the caller more options.
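For illustration, here is a minimal sketch of that approach, assuming a hypothetical Poller type: Run is synchronous, and the caller opts into concurrency with a single go statement.
package main

import (
	"fmt"
	"time"
)

type Poller struct{}

// Run polls forever; it blocks the calling goroutine.
func (p Poller) Run() {
	for {
		fmt.Println("polling...")
		time.Sleep(time.Second)
	}
}

func main() {
	p := Poller{}
	go p.Run() // the caller, not the API, decides to run it concurrently
	time.Sleep(3 * time.Second)
}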

There is no 'right' answer because circumstances differ.
Obviously there are cases where an API might contain utilities, simple algorithms, data collections etc that would look odd if packaged up as goroutines.
Conversely, there are cases where it is natural to expect 'under-the-hood' concurrency, such as a rich IO library (http server being the obvious example).
For a more extreme case, consider producing a library of plug-and-play concurrent services. Such an API consists of modules, each having a well-described interface via channels. Clearly, in this case it would inevitably involve goroutines starting as part of the API.
One clue might well be the presence or absence of channels in the function parameters. But I would expect clear documentation of what to expect either way.

Related

How to terminate long running function after a timeout

So I am attempting to shut down a long-running function when something takes too long. Maybe this is just treating the symptoms rather than the cause, but in any case, for my situation it didn't really work out.
I did it like this:
func foo(abort <-chan struct{}) {
	for {
		select {
		case <-abort:
			return
		default:
			// long-running code
		}
	}
}
And in a separate function I close the passed channel after some time, which works: if I cut out the body, the function returns. However, if there is some long-running code, closing the channel does not affect the outcome; the function simply continues the work as if nothing had happened.
I am pretty new to Go, and it feels like this should work, but it does not. Is there anything I am missing? After all, router frameworks have timeout functions after which whatever is running is terminated. So maybe this is just out of curiosity, but I would really like to know how to do it.
Your code checks whether the channel was closed only once per iteration, before executing the long-running code. There's no opportunity to check the abort channel after the long-running code starts, so it will run to completion.
You need to occasionally check whether to exit early in the body of the long running code, and this is more idiomatically accomplished using context.Context and WithTimeout for example: https://pkg.go.dev/context#example-WithTimeout
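As a rough sketch of that suggestion (the names here are placeholders, not the poster's actual code), the long-running loop takes a context and checks ctx.Done() between chunks of work:
package main

import (
	"context"
	"fmt"
	"time"
)

func foo(ctx context.Context) error {
	for {
		select {
		case <-ctx.Done():
			return ctx.Err() // timeout or explicit cancellation
		default:
			// one bounded chunk of the long-running work
			time.Sleep(100 * time.Millisecond)
		}
	}
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	fmt.Println(foo(ctx)) // prints "context deadline exceeded" after about a second
}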
In your "long running code" you have to periodically check that abort channel.
The usual approach to implement that "periodically" is to split the code into chunks each of which completes in a reasonably short time frame (given that the system the process runs on is not overloaded).
After executing each such chunk you check whether the termination condition holds and then terminate execution if it is.
The idiomatic approach to perform such a check is "select with default":
select {
case <-channel:
	// terminate processing
default:
}
Here, the default no-op branch is immediately taken if channel is not ready to be received from (or closed).
Some algorithms make such chunking easier because they employ a loop where each iteration takes roughly the same time to execute.
If your algorithm is not like this, you'd have to chunk it manually; in this case, it's best to create a separate function (or a method) for each chunk.
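Here is a sketch of that manual chunking against the question's abort channel; chunkA and chunkB are hypothetical stand-ins for pieces of the original long-running body.
package main

import (
	"fmt"
	"time"
)

// chunkA and chunkB stand in for pieces of the long-running body.
func chunkA() { time.Sleep(200 * time.Millisecond) }
func chunkB() { time.Sleep(200 * time.Millisecond) }

func foo(abort <-chan struct{}) {
	for {
		chunkA()
		select { // check for cancellation between chunks
		case <-abort:
			return
		default:
		}
		chunkB()
		select {
		case <-abort:
			return
		default:
		}
	}
}

func main() {
	abort := make(chan struct{})
	go func() {
		time.Sleep(time.Second)
		close(abort)
	}()
	foo(abort)
	fmt.Println("stopped")
}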
Further points.
Consider using contexts: they provide a useful framework to solve the style of problems like the one you're solving.
What's better, the fact that they can "inherit" one another allows one to easily implement two neat things:
You can combine various ways to cancel contexts: say, it's possible to create a context which is cancelled either when some timeout passes or explicitly by some other code.
They make it possible to create "cancellation trees" — when cancelling the root context propagates this signal to all the inheriting contexts — making them cancel what other goroutines are doing.
Sometimes, when people say "long-running code" they do not mean code actually crunching numbers on a CPU all that time, but rather code which performs requests to slow entities — such as databases, HTTP servers, etc. — in which case the code is not actually running but sleeping on the I/O, waiting for data to be delivered and processed.
If this is your case, note that all well-written Go packages (of course, this includes all the packages of the Go standard library which deal with networked services) accept contexts in those functions of their APIs which actually make calls to such slow entities, and this means that if you make your function to accept a context, you can (actually should) pass this context down the stack of calls where applicable — so that all the code you call can be cancelled in the same way as yours.
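For example, a minimal sketch of threading a context down to a slow call, here an HTTP request (the URL is a placeholder):
package main

import (
	"context"
	"fmt"
	"net/http"
	"time"
)

func fetch(ctx context.Context, url string) error {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return err
	}
	resp, err := http.DefaultClient.Do(req) // aborted when ctx is cancelled or times out
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	fmt.Println(resp.Status)
	return nil
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()
	if err := fetch(ctx, "https://example.com"); err != nil {
		fmt.Println("fetch:", err)
	}
}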
Further reading:
https://go.dev/blog/pipelines
https://blog.golang.org/advanced-go-concurrency-patterns

IO callbacks on goroutine

I'm a beginner at Go. Looking at all the Go tutorials, it looks like you should create goroutines for everything. Coming from something like libuv in C, where you can define callbacks for socket read/write on a single thread, is the right way to achieve that in Go to create nested goroutines for any IO tasks needed?
As an example, take something like nginx, where a single thread will handle multiple connections. To do something like that in Go, would we need a goroutine for every connection?
Go stands out in the area of tools for writing networked services specifically because it has I/O-awareness integrated right into the runtime scheduler powering any running Go program.
The basic idea is roughly like this: a goroutine performs normal, sequential, callback-free operations on sockets — that is, plain reads and plain writes — and as soon as the next I/O operation would block (yes, the relevant syscall on a Unix-like kernel returns EWOULDBLOCK), the goroutine is suspended, its socket is handed off to a component of the runtime called the "netpoller", which is implemented using the platform-native socket I/O multiplexer such as epoll, kqueue or IOCP, and the OS thread the goroutine was running on is handed off to another goroutine which wants to run. As soon as the netpoller signals that the I/O which caused the goroutine to suspend can proceed, the scheduler queues that goroutine for execution and it continues to run exactly where it left off.
Because of this, the usual model employed when writing networking services in Go is to have one goroutine per socket. When you're writing a plain TCP server, you create the goroutine yourself (and hand it the socket returned by the listener once it has accepted a client's connection).
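A bare-bones sketch of that one-goroutine-per-connection pattern; echoing the input back is just placeholder work.
package main

import (
	"io"
	"log"
	"net"
)

func handle(conn net.Conn) {
	defer conn.Close()
	io.Copy(conn, conn) // plain, sequential reads and writes; no callbacks
}

func main() {
	ln, err := net.Listen("tcp", ":9000")
	if err != nil {
		log.Fatal(err)
	}
	for {
		conn, err := ln.Accept()
		if err != nil {
			log.Println("accept:", err)
			continue
		}
		go handle(conn) // one goroutine per accepted connection
	}
}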
net/http.Server has this behaviour built-in as it creates a goroutine to serve each incoming client request (actually, for HTTP/1.x, two or even three goroutines are created per connection, but it's invisible to HTTP request handlers).
Now, we've just covered the basics. Of course, there might exist legitimate reasons to have extra goroutines to handle tasks needed to complete a request, and that's what @Volker referred to.
More info:
"What color is your function?" — a classical essay dealing with I/O multiplexing implemented as a library vs it being implemented in the core.
"Go's work-stealing scheduler"; also see this and this and this design doc.
State threads library which implements the approach quite similar to that of Go, just on much lower level. Its documentation is quite insightful on the approach implemented in Go.
libtask is a much more recent stab at the same problem, by one of Go's creators.

Run async function in specific thread

I would like to run specific long-running functions (which execute database queries) on a separate thread. However, let's assume that the underlying database engine only allows one connection at a time and the connection struct isn't Sync (I think at least the latter is true for diesel).
My solution would be to have a single separate thread (as opposed to a thread pool) where all the database-work happens and which runs as long as the main thread is alive.
I think I know how I would have to do this with passing messages over channels, but that requires quite some boilerplate code (e.g. explicitly sending the function arguments over the channel etc.).
Is there a more direct way of achieving something like this with rust (and possibly tokio and the new async/await notation that is in nightly)?
I'm hoping to do something along the lines of:
let handle = spawn_thread_with_runtime(...);
let future = run_on_thread!(handle, query_function, argument1, argument2);
where query_function would be a function that immediately returns a future and does the work on the other thread.
Rust nightly and external crates / macros would be ok.
If external crates are an option, I'd consider taking a look at actix, an Actor Framework for Rust.
This will let you spawn an Actor in a separate thread that effectively owns the connection to the DB. It can then listen for messages, execute work/queries based on those messages, and return either sync results or futures.
It takes care of most of the boilerplate for message passing, spawning, etc. at a higher level.
There's also a Diesel example in the actix documentation, which sounds quite close to the use case you had in mind.

Difference between plain go func and for loop in go func

I have a question regarding the difference between a plain go func and a for loop in a go func:
Plain go Func:
func asyncTask() {
	// ...something
}
In order to trigger asyncTask, we can simply:
func main() {
	go asyncTask()
}
Make a for loop to monitor a channel:
func (c *Container) asyncTask() {
	go func() {
		for {
			select {
			case <-c.someChan:
				// ...do something
			case <-c.ctx.Done():
				// ...prevent leaking
			}
		}
	}()
}
to trigger:
func (c *Container) trigger() {
	c.someChan <- val
}
My questions are:
I understand the second scenario best fits the case where we wish to manage async tasks in a queue.
But in terms of performance for frequently triggered async tasks (which cannot block), which method is better?
Is there any general best practice for handling async tasks in Go?
In nearly any case, performance is not the thing to think about in choosing which pattern to use (both will be fine), but which usage makes sense in your specific use case. If you use pattern (1), then you are not guaranteed the sequential processing of (2). That is the essential difference between your two examples. So for an http server for example, you would use the former pattern (go handleRequest(r HttpRequest) say) to process requests in parallel, but use the latter to ensure that certain operations are processed sequentially. I hope this is answering your question!
You can use model #1 with WaitGroups when you have goroutines that you need to account for, care only about their exit, and otherwise don't need to manage.
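A small sketch of model #1 with a WaitGroup, for when you only care that the goroutines finish (asyncTask here is a placeholder for the real work):
package main

import (
	"fmt"
	"sync"
)

func asyncTask(id int) {
	fmt.Println("task", id, "done")
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 3; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			asyncTask(id)
		}(i)
	}
	wg.Wait() // block until every goroutine has exited
}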
You can use model #2 when you need explicit management / control / communication. Channel communication is NOT free: sending and receiving goroutines need synchronization, channels need locking when values are sent, and a lot has to happen under the hood.
Unless there is a real need, option #1 is definitely the way to go. Look for the simplest possible solution to your problem - I know it's easy to preach, but simplicity may take some time to come by.
In short, from what I know, the two patterns you mentioned above are not really something to compare in terms of which to use or which is better. They just have different use cases with different needs.
From what I know, it is not about
plain go func and for loop in go func
It is more about different usage.
Before answering your question, I'd like to give a short explanation of the two patterns you mentioned.
The first pattern is very basic go statement usage, which simply executes a function outside the main goroutine. As basic usage of concurrency in Go, this pattern has no way to get data back from the function executed with the go statement, whether from the main() goroutine or any other function. In order to communicate with any other function/goroutine it needs a channel. You already mentioned one of the several go-with-channel patterns available.
As I mentioned earlier, this second pattern is just one of several go-with-channel patterns in Go. It is actually a fairly complex pattern whose main usage is selecting from multiple channels and doing further things with those channels. I will give a brief explanation of this pattern as follows:
The for loop there has no condition, so it works like a while(true) loop in other languages such as C or Java. It means an endless loop.
Since it is an endless loop, it needs an exit condition, which usually comes from checking the available channels. For example, it can end when a channel is closed.
Regarding the select and case statements: if two or more communication cases happen to be ready at the same time, one will be selected at random.
Even if you need to communicate between concurrently running functions, I guess you do not need this pattern in general. There are usually simpler patterns for communicating between goroutines using a channel.
In summary to answer your questions:
Which method is better for doing asynchronous tasks really depends on your needs. There are several patterns, not limited to the ones you have mentioned above. If you just need to execute a function asynchronously, the first pattern will be fine; otherwise you need one of the available channel patterns. But again, you are not limited to the 2nd pattern you mentioned above.
Both patterns you mentioned look like common practice to me. But I think we usually need at least a channel in order to let an asynchronous task communicate with the main() goroutine or any other goroutine. And the pattern itself really depends on how you will communicate (send/receive) the data/values and their sources (databases, slice variables, etc.) and other aspects. I suggest you learn more about channel usage; there are lots of patterns for that. I suggest checking this first: https://gobyexample.com/goroutines. From there, the "Next Example" link at the bottom of each page goes deeper into Go concurrency.
In addition:
The go statement is simple; the complex part is its usage with channels. Here is a list of things to learn in order to better understand concurrency communication:
goroutine
Channel direction (send / receive / unidirectional)
Channel concept/behavior, i.e. communicating sequential processes (CSP): the "block" and "proceed" behavior of sends and receives
Buffered channel
Unbuffered channel
And more about channel :)
Hope this helps you or someone else get started with goroutines and channels to work with concurrency in Go. Please feel free to give corrections to my answer or ask for further explanation. Thank you.

Understanding goroutines for web API

Just starting out with Go and hoping to create a simple Web API. I'm looking into using Gorilla mux (http://www.gorillatoolkit.org/pkg/mux) to handle web requests.
I'm not sure how to best use Go's concurrency options to handle the requests. Did I read somewhere that the main function is actually a goroutine, or should I dispatch each request to a goroutine as it is received? Apologies if I'm "way off".
Assuming you're using the Go's http.ListenAndServe to serve your http requests, the documentation clearly states that each incoming connection is handled by a separate goroutine for you. http://golang.org/pkg/net/http/#Server.Serve
You would usually call ListenAndServe from your main function.
Gorilla mux is simply a package for more flexible routing of requests to your handlers than the http.DefaultServeMux. It doesn't actually handle the incoming connection or request; it simply relays it to your handler.
I highly suggest you read a bit of the documentation, specifically this guide https://golang.org/doc/articles/wiki/#tmp_3 on writing web applications.
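To make that concrete, here is a minimal sketch with a trivial placeholder handler: main registers routes on a gorilla/mux router, hands it to ListenAndServe, and net/http then runs each request handler on its own goroutine.
package main

import (
	"fmt"
	"log"
	"net/http"

	"github.com/gorilla/mux"
)

func hello(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintln(w, "hello")
}

func main() {
	r := mux.NewRouter()
	r.HandleFunc("/", hello) // gorilla/mux routes matching requests to this handler
	log.Fatal(http.ListenAndServe(":8080", r))
}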
I'm providing an answer even though I voted to close for being too broad.
Anyway, none of that is really necessary. You're overthinking it. If you haven't read it yet, this looks like a decent tutorial: http://thenewstack.io/make-a-restful-json-api-go/
You can really just set up routes like you would with most typical REST frameworks and let the webserver/framework worry about concurrency at the request-handling level. You would only employ goroutines to generate the response of a request, say if you needed to aggregate data from 10 files that are all in a folder. Contrived example, but this is where you would spin off 1 goroutine per file, aggregate all the information by reading off a channel in a non-blocking select, and then return the result. You can expect that all entry points to your code are called in an asynchronous, non-blocking fashion, if that makes sense...
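A rough sketch of that contrived example, simplified to a plain collection loop rather than a non-blocking select (readFile and the file names are placeholders):
package main

import "fmt"

// readFile stands in for real file IO.
func readFile(name string) string {
	return "contents of " + name
}

func aggregate(names []string) []string {
	results := make(chan string, len(names))
	for _, name := range names {
		go func(n string) {
			results <- readFile(n) // one goroutine per file
		}(name)
	}
	out := make([]string, 0, len(names))
	for range names {
		out = append(out, <-results) // collect every result
	}
	return out
}

func main() {
	fmt.Println(aggregate([]string{"a.txt", "b.txt", "c.txt"}))
}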

Resources