There is one goroutine generating data, and there are many goroutines handling HTTP responses. I want the generated data to be passed to all of the HTTP handler goroutines; every handler receives the same data.
I have thought of two solutions: using a channel pipeline to fan out, or using a mutex and a condition variable.
My concern is that the former approach requires a memory allocation to put the data in the channel.
What should I choose?
Your use case sounds like one that benefits from channels. In general, channels are preferred when communication between goroutines is needed. This sounds like a classic example of a worker pool.
Mutexes are used to protect a piece of memory so that only one goroutine can access or modify it at a time. Often this is the opposite of what people want, which is to parallelize execution.
A good rule of thumb is not to worry about optimization (memory allocation or not) until it actually becomes an issue; premature optimization is a common anti-pattern.
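As a sketch of the channel approach (all names and sizes here are mine, not from the question): give each handler goroutine its own channel and have the generator send every value to each of them, so all handlers see the same stream.

package main

import (
    "fmt"
    "sync"
)

func main() {
    const handlers = 3

    // One channel per handler; the generator sends each value to all of
    // them, so every handler receives the same data.
    subs := make([]chan int, handlers)
    var wg sync.WaitGroup
    for i := range subs {
        subs[i] = make(chan int)
        wg.Add(1)
        go func(id int, ch <-chan int) { // stands in for an HTTP handler goroutine
            defer wg.Done()
            for v := range ch {
                fmt.Println("handler", id, "got", v)
            }
        }(i, subs[i])
    }

    // The single generator goroutine's role: publish the same values to all.
    for v := 1; v <= 3; v++ {
        for _, ch := range subs {
            ch <- v
        }
    }
    for _, ch := range subs {
        close(ch)
    }
    wg.Wait()
}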
Related
I need to share a large tree (but for simplicity we can think of it as a slice of strings) across multiple goroutines (HTTP handlers). The tree is written very rarely, and only by one goroutine, but every HTTP handler needs to read it.
Options I envisioned:
Use a mutex: very expensive and high-latency for my use case. Handlers would fight for a lock even though 99% of the time it is not needed, since the structure is read-mostly.
Use channels: it's hard for me to imagine how I could use channels efficiently inside an HTTP handler. It would need a good bit of boilerplate, and it would copy the tree on each invocation, which is expensive.
Use lazy pointers? At invocation the handler gets a pointer to the current tree structure; new writes would happen on a new copy of the tree, followed by atomically updating the tree pointer. I would also need to keep the old tree available until all running goroutines return. Seems a bit tricky to code.
A mix of the last two? I could use channels to get the latest pointer to the tree instead of the tree itself. Still a bit hard to imagine how I would write this down.
Is there any other way I'm not seeing? Any suggestion or tip?
Naive answer:
The simplest approach is to use an RWMutex, as shown in the official docs.
The problem is that after Spectre and Meltdown, RWMutexes are significantly slower than atomics:
Benchmark_RWMutex_parallel-6 66796699 17.96 ns/op
Benchmark_Atomic_parallel-6 1000000000 0.5528 ns/op
The situation becomes exponentially worse with high thread counts (above 32 threads) or on Intel CPUs. You can find more discussion about this here.
Modern answer:
Here's an example using atomics. It can still be improved by using pointers instead of structs, but it's a very good starting point.
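A minimal sketch of that idea, using atomic.Pointer (available since Go 1.19); the Tree type and its contents are invented for illustration:

package main

import (
    "fmt"
    "sync/atomic"
)

// Tree stands in for the read-mostly structure being shared.
type Tree struct {
    words []string
}

var current atomic.Pointer[Tree] // readers Load, the single writer Stores

func main() {
    current.Store(&Tree{words: []string{"a", "b"}})

    // Reader path (e.g. inside an HTTP handler): one atomic load, no lock.
    t := current.Load()
    fmt.Println(t.words)

    // Writer path: build a fresh copy, then swap the pointer atomically.
    // Goroutines still holding the old pointer keep a consistent view of
    // the old tree; the GC reclaims it once nobody references it.
    old := current.Load()
    updated := &Tree{words: append(append([]string(nil), old.words...), "c")}
    current.Store(updated)

    fmt.Println(current.Load().words)
}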
In a job interview I was asked this question: what advantages do you get from the epoll integration/implementation in Go?
I know what epoll can do, and that its complexity is O(1) regardless of the number of descriptors, but I have no idea why Go is better here than other languages.
I found this thread https://news.ycombinator.com/item?id=15624586 where someone says the reason may be that Go doesn't use stack switching. That's hard for me to understand. Which part of the program doesn't use stack switching? Every goroutine has its own stack.
That's not the netpoller integration per se which makes Go strong in its field—it's rather the way that integration is done: instead of being bolted-on as a library, in Go, the netpoller is tightly integrated right into the runtime and the scheduler (which decides which goroutine to run, and when).
The coupling of super-light-weight threads of execution—goroutines—with the netpoller allows for callback-free programming. That is, once your service gets another client connected, you just hand this connection to a goroutine which merely reads the data from it (and writes its response stream to it). As soon as there's no data available when the goroutine wants to read it, the scheduler suspends the goroutine and unblocks it once the netpoller reports there's data available; the same happens when the goroutine wants to write data but the sending buffer is full.
To recap, the netpoller in Go is intertwined with the goroutine scheduler, which allows goroutines to wait transparently for data availability without requiring the programmer to explicitly code an event loop and callbacks, or to deal with "futures" and "promises", which are mere callbacks wrapped in pretty objects.
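As a sketch of what that looks like in code (a toy line-echo server, invented for illustration): the read in the loop below looks blocking, but it only ever parks the goroutine, never an OS thread.

package main

import (
    "bufio"
    "log"
    "net"
)

func main() {
    ln, err := net.Listen("tcp", ":8080")
    if err != nil {
        log.Fatal(err)
    }
    for {
        conn, err := ln.Accept()
        if err != nil {
            log.Print(err)
            continue
        }
        // One goroutine per connection: straight-line code, no callbacks.
        go func(c net.Conn) {
            defer c.Close()
            scanner := bufio.NewScanner(c)
            for scanner.Scan() { // "blocks" the goroutine; the netpoller wakes it
                c.Write(append(scanner.Bytes(), '\n')) // echo the line back
            }
        }(conn)
    }
}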
I invite you to read this classic essay, which explains this stuff far more eloquently.
Since a goroutine generally has very little overhead in terms of memory requirements and setup/teardown cost, is it even relevant to implement a thread (goroutine) worker pool? When would you consider using a worker pool instead of 'spawning' a goroutine per request?
Spawning and keeping lots of goroutines in Go is cheap, but it's not free.
Also, remember that while goroutines themselves may be very cheap, a lot of memory can be allocated inside a goroutine's code, so you may want to limit the number of concurrently running goroutines.
You can use a semaphore to limit resource usage.
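The classic form of that is a buffered channel used as a counting semaphore (a minimal sketch; the limit and workload are made up, and golang.org/x/sync/semaphore is a heavier-duty alternative):

package main

import (
    "fmt"
    "sync"
)

func main() {
    const maxConcurrent = 3
    sem := make(chan struct{}, maxConcurrent) // counting semaphore
    var wg sync.WaitGroup

    for i := 0; i < 10; i++ {
        wg.Add(1)
        sem <- struct{}{} // acquire: blocks while maxConcurrent goroutines run
        go func(id int) {
            defer wg.Done()
            defer func() { <-sem }() // release the slot
            fmt.Println("working on job", id)
        }(i)
    }
    wg.Wait()
}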
Another approach (more idiomatic for Go) is to use execution pipelines with worker pools. This pattern is described very well on the Go blog.
Yes, it's relevant: database/sql uses a pool of database connections because establishing a new connection takes time.
If I am using channels properly, do I still need mutexes to protect against concurrent access?
You don't need a mutex if you use channels correctly. In some cases, though, a solution with a mutex might be simpler.
Just make sure the variable(s) holding the channel values are properly initialized before multiple goroutines try to access the channel variables. Once this is done, accessing the channels (e.g. sending values to or receiving values from them) is safe by design.
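A minimal illustration of that rule (the example itself is invented): the make happens before any goroutine is started, so all later sends and receives are safe without extra locking.

package main

import "fmt"

func main() {
    // Initialize the channel before starting any goroutine that uses it;
    // after that, sends and receives need no further synchronization.
    results := make(chan int)

    for i := 0; i < 3; i++ {
        go func(n int) { results <- n * n }(i)
    }
    for i := 0; i < 3; i++ {
        fmt.Println(<-results)
    }
}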
Supporting documents with references (emphases added by me):
Spec: Channel types:
A single channel may be used in send statements, receive operations, and calls to the built-in functions cap and len by any number of goroutines without further synchronization. Channels act as first-in-first-out queues. For example, if one goroutine sends values on a channel and a second goroutine receives them, the values are received in the order sent.
Effective Go: Concurrency: Share by communicating
Concurrent programming in many environments is made difficult by the subtleties required to implement correct access to shared variables. Go encourages a different approach in which shared values are passed around on channels and, in fact, never actively shared by separate threads of execution. Only one goroutine has access to the value at any given time. Data races cannot occur, by design. To encourage this way of thinking we have reduced it to a slogan:
Do not communicate by sharing memory; instead, share memory by communicating.
This approach can be taken too far. Reference counts may be best done by putting a mutex around an integer variable, for instance. But as a high-level approach, using channels to control access makes it easier to write clear, correct programs.
This article is also very helpful: The Go Memory Model
Also quoting from the package doc of sync:
Package sync provides basic synchronization primitives such as mutual exclusion locks. Other than the Once and WaitGroup types, most are intended for use by low-level library routines. Higher-level synchronization is better done via channels and communication.
What are the use cases for buffered channels? If I want multiple parallel actions I could just use the default, synchronous channel, e.g.:
package main

import (
    "fmt"
    "time"
)

func longLastingProcess(c chan string) {
    time.Sleep(2000 * time.Millisecond)
    c <- "tadaa"
}

func main() {
    c := make(chan string) // unbuffered: each send blocks until a receiver is ready
    go longLastingProcess(c)
    go longLastingProcess(c)
    go longLastingProcess(c)
    fmt.Println(<-c) // prints whichever result arrives first, then main exits
}
What would be practical use cases for increasing the buffer size?
To give a single, slightly-more-concrete use case:
Suppose you want your channel to represent a task queue, so that a task scheduler can send jobs into the queue, and a worker thread can consume a job by receiving it in the channel.
Suppose further that, though in general you expect each job to be handled in a timely fashion, it takes longer for a worker to complete a task than it does for the scheduler to schedule it.
Having a buffer allows the scheduler to deposit jobs in the queue and still remain responsive to user input (or network traffic, or whatever) because it does not have to sleep until the worker is ready each time it schedules a task. Instead, it goes about its business, and trusts the workers to catch up during a quieter period.
If you want an EVEN MORE CONCRETE example dealing with a specific piece of software then I'll see what I can do, but I hope this meets your needs.
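For what it's worth, here is a small sketch of that scheduler/worker shape (the sizes and timings are invented):

package main

import (
    "fmt"
    "time"
)

func main() {
    // A buffer of 8 lets the scheduler enqueue a burst of jobs without
    // waiting for the worker to catch up.
    jobs := make(chan int, 8)
    done := make(chan struct{})

    go func() { // worker: slower than the scheduler
        for j := range jobs {
            time.Sleep(50 * time.Millisecond) // simulate slow work
            fmt.Println("done job", j)
        }
        close(done)
    }()

    for j := 1; j <= 8; j++ {
        jobs <- j // returns immediately while the buffer has room
        fmt.Println("scheduled job", j)
    }
    close(jobs)
    <-done // wait for the worker to drain the queue
}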
Generally, buffering in channels is beneficial for performance reasons.
If a program is designed using an event-flow or data-flow approach, channels provide the means for the events to pass between one process and another (I use the term process in the same sense as in Tony Hoare's Communicating Sequential Processes (CSP), i.e. effectively synonymous with the goroutine).
There are times when a program needs its components to remain in lock-step synchrony. In this case, unbuffered channels are required.
Otherwise, it is typically beneficial to add buffering to the channels. This should be seen as an optimisation step (deadlock may still be possible if not designed out).
There are novel throttle structures made possible by using channels with small buffers (example).
There are special overwriting or lossy forms of channels used in occam and jcsp for fixing the special case of a cycle (or loop) of processes that would otherwise probably deadlock. This is also possible in Go by writing an overwriting goroutine buffer (example).
You should never add buffering merely to fix a deadlock. If your program deadlocks, it's far easier to fix by starting with zero buffering and thinking through the dependencies; then add buffering once you know it won't deadlock.
You can construct goroutines compositionally - that is, a goroutine may itself contain goroutines. This is a feature of CSP and benefits scalability greatly. The internal channels between a group of goroutines are not of interest when designing the external use of the group as a self-contained component. This principle can be applied repeatedly at increasingly-larger scales.
If the receiver of a channel is always slower than the sender, a buffer of any size will eventually fill up. That will leave you with a channel that pauses your goroutine as often as an unbuffered channel would, so you might as well use an unbuffered channel.
If the receiver is typically faster than the sender except for an occasional burst, a buffered channel may be helpful, and the buffer should be set to the size of the typical burst, which you can determine by measuring at runtime.
As an alternative to a buffered channel, it may be better to just send an array, or a struct containing an array, over the channel to deal with bursts/batches.
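A sketch of that batching alternative (the types are invented): one send per burst, regardless of burst size.

package main

import "fmt"

// batch groups a burst of items into a single channel send, instead of
// sizing a buffered channel for the worst-case burst.
type batch struct {
    items []int
}

func main() {
    ch := make(chan batch) // unbuffered: one handoff per burst

    go func() {
        for i := 0; i < 3; i++ {
            ch <- batch{items: []int{i * 10, i*10 + 1, i*10 + 2}} // one "burst"
        }
        close(ch)
    }()

    for b := range ch {
        fmt.Println("received burst:", b.items)
    }
}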
Buffered channels are non-blocking for the sender as long as there's still room. This can increase responsiveness and throughput.
Sending several items on one buffered channel makes sure they are processed in the order in which they are sent.
From Effective Go (with example): "A buffered channel can be used like a semaphore, for instance to limit throughput."
In general, there are many use cases and patterns of channel usage, so this is not an exhaustive answer.
It's a hard question because the program is incorrect: it exits after receiving a value from one goroutine, even though three were started. Buffering the channel would make no difference.
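One way to make it wait for everything it started is simply to receive once per goroutine:

package main

import (
    "fmt"
    "time"
)

func longLastingProcess(c chan string) {
    time.Sleep(2000 * time.Millisecond)
    c <- "tadaa"
}

func main() {
    c := make(chan string)
    go longLastingProcess(c)
    go longLastingProcess(c)
    go longLastingProcess(c)
    for i := 0; i < 3; i++ { // one receive per goroutine started
        fmt.Println(<-c)
    }
}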
EDIT: For example, here is a bit of general discussion about channel buffers. And an exercise. And a book chapter about the same topic.