I was wondering what would happen if multiple goroutines execute a select over a set of channels, where one or more of those channels are shared between them, and, while all of them are waiting, the shared channel becomes ready.
Will runtime handle this case and allow only one goroutine to access the channel and do the read/write?
The comments above all answer it. You can also write some code and see for yourself, something along these lines: https://play.golang.org/p/4ZQLwO9wvw
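If you'd rather experiment than take the comments on faith, a small program along the lines of that playground link (the channel names are arbitrary) shows that a single send on the shared channel wakes exactly one of the selecting goroutines:

package main

import (
	"fmt"
	"time"
)

func main() {
	shared := make(chan string)
	done := make(chan struct{})

	// Two goroutines both select on the same shared channel.
	for i := 1; i <= 2; i++ {
		go func(id int) {
			select {
			case v := <-shared:
				fmt.Printf("goroutine %d got %q\n", id, v)
			case <-done:
				fmt.Printf("goroutine %d got nothing\n", id)
			}
		}(i)
	}

	shared <- "hello" // received by exactly one of the waiting goroutines
	time.Sleep(100 * time.Millisecond)
	close(done) // release the other goroutine
	time.Sleep(100 * time.Millisecond)
}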
I need several functions to have the same channel as a parameter and take the same data, simultaneously.
Each of these functions performs a task independent of the others, but they all start from the same value.
For example, given a slice of integers, one function calculates the sum of its values and another calculates the average, at the same time. They would be goroutines.
One solution would be to create multiple channels from one value, but I want to avoid that. I might have to add or remove functions and for this, I would have to add or remove channels.
I think I understand that the Fan Out pattern could be an option, but I can't quite understand its implementation.
The question is against the rules of SO, as it does not present any concrete problem to be helped with but rather requests a tutoring session.
Anyway, here are two pointers for further research. Keep in mind the basic property of a channel: each receive consumes a value sent to it, so it's impossible to read a once-sent value multiple times. Given that, such problems have two approaches to their solutions.
The first approach, which is what is called "fan-out", is to give each consumer its own dedicated channel, copy the value to be broadcast as many times as there are consumers, and send one copy to each of those dedicated channels.
The arguably most natural way to implement this is to have a single channel to which the producer sends its units of work, not caring how many consumers are going to read them, and then have a dedicated goroutine receive those units of work, copy each of them, and send the copies out to the consumers' dedicated channels.
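A minimal sketch of this fan-out approach, using the sum/average example from the question; the channel layout and the names are illustrative, not a fixed pattern:

package main

import (
	"fmt"
	"sync"
)

// broadcast copies each incoming value to every consumer's dedicated channel.
func broadcast(in <-chan []int, outs []chan []int) {
	for v := range in {
		for _, out := range outs {
			out <- v // every consumer sees the same slice; fine for read-only use
		}
	}
	for _, out := range outs {
		close(out)
	}
}

func main() {
	in := make(chan []int)
	outs := []chan []int{make(chan []int), make(chan []int)}
	go broadcast(in, outs)

	var wg sync.WaitGroup
	wg.Add(2)
	go func() { // consumer 1: sum
		defer wg.Done()
		for s := range outs[0] {
			sum := 0
			for _, x := range s {
				sum += x
			}
			fmt.Println("sum:", sum)
		}
	}()
	go func() { // consumer 2: average
		defer wg.Done()
		for s := range outs[1] {
			sum := 0
			for _, x := range s {
				sum += x
			}
			fmt.Println("avg:", float64(sum)/float64(len(s)))
		}
	}()

	in <- []int{1, 2, 3, 4}
	close(in)
	wg.Wait()
}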
The second approach is to go lower level and implement basically the same scheme using stuff from the sync package.
One can think of the following scheme:
Have a custom struct type which has a sync.Mutex protecting the type's state.
Have a field which keeps the value multiple consumers have to read.
Have a counter in that type.
Have a sync.Cond in that type as well.
Have a channel with capacity 1 there as well.
Communicating a new value to the consumers looks like this:
Lock the mutex.
Verify the counter is 0, panic otherwise.
Write the new value into the respective field.
Set the counter to the number of consumers.
Unlock the mutex.
Pulse the sync.Cond.
The consumers are supposed to sleep in a wait call on that sync.Cond.
Once the sender pulses it, the goroutines running the code of consumers get woken up and try to read the value.
Reading of the value rolls like this:
Lock the mutex.
Verify the counter is greater than zero, panic otherwise.
Read the value.
Decrement the counter by one.
If the counter becomes 0, send on that special channel.
Unlock the mutex.
The channel is needed to communicate to the sender that all the consumers are done with their reads: before attempting to send the next value, the sender has to read from that channel.
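To make those steps concrete, here is a rough sketch of that scheme. The Broadcaster type and its method names are made up for illustration, and it assumes each consumer calls Receive exactly once per broadcast value:

package main

import (
	"fmt"
	"sync"
)

type Broadcaster struct {
	mu        sync.Mutex
	cond      *sync.Cond
	value     int           // the value being broadcast
	counter   int           // how many consumers still have to read it
	done      chan struct{} // capacity 1: "all consumers have read the value"
	consumers int
}

func NewBroadcaster(consumers int) *Broadcaster {
	b := &Broadcaster{consumers: consumers, done: make(chan struct{}, 1)}
	b.cond = sync.NewCond(&b.mu)
	return b
}

// Send publishes v and blocks until every consumer has read it.
func (b *Broadcaster) Send(v int) {
	b.mu.Lock()
	if b.counter != 0 {
		panic("previous value not fully consumed")
	}
	b.value = v
	b.counter = b.consumers
	b.mu.Unlock()
	b.cond.Broadcast() // wake up all sleeping consumers
	<-b.done           // wait for the last consumer to signal completion
}

// Receive blocks until a value is available, reads it, and signals the
// sender when it was the last consumer to do so.
func (b *Broadcaster) Receive() int {
	b.mu.Lock()
	for b.counter == 0 {
		b.cond.Wait()
	}
	v := b.value
	b.counter--
	if b.counter == 0 {
		b.done <- struct{}{} // last reader: let the sender proceed
	}
	b.mu.Unlock()
	return v
}

func main() {
	b := NewBroadcaster(2)
	var wg sync.WaitGroup
	for i := 0; i < 2; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			fmt.Printf("consumer %d got %d\n", id, b.Receive())
		}(i)
	}
	b.Send(42) // returns only after both consumers have read 42
	wg.Wait()
}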
As you can probably see, the second approach is way more involved and harder to get right, so I'd recommend going with the first one.
I would also note that you seem to lack certain background knowledge on how to go about implementing concurrently running and communicating tasks.
I hereby recommend reading The Book and at least these chapters of The Blog:
Go Concurrency Patterns: Pipelines and cancellation.
Go Concurrency Patterns: Timing out, moving on
Advanced Go Concurrency Patterns
I'm a beginner at golang. Looking at golang tutorials, it looks like you should create goroutines for everything. Coming from something like libuv in C, where you can define callbacks for socket reads/writes on a single thread, is the right way to achieve that in golang to create nested goroutines for any I/O tasks needed?
As an example, take something like nginx, where a single thread handles multiple connections. To do something like that in golang, would we need a goroutine for every connection?
Go stands out among tools for writing networked services specifically because it has I/O awareness integrated right into the runtime scheduler powering any running Go program.
The basic idea is roughly this: a goroutine performs normal, sequential, callback-free operations on sockets, that is, plain reads and plain writes. As soon as the next I/O operation would block (yes, the relevant syscall on a Unix-like kernel returns EWOULDBLOCK), the goroutine is suspended, its socket is handed over to a component of the runtime called the "netpoller", which is implemented using the platform-native socket I/O multiplexer such as epoll, kqueue or IOCP, and the OS thread the goroutine was running on is handed off to another goroutine that wants to run. As soon as the netpoller signals that the I/O which caused the goroutine to suspend can proceed, the scheduler queues that goroutine for execution, and it continues to run exactly where it left off.
Because of this, the usual model employed when writing networking services in Go is to have one goroutine per socket. When you're writing plain TCP server, you should create a goroutine yourself (and hand it the socket returned by the listener once it accepted a client's connection).
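A minimal sketch of that one-goroutine-per-connection pattern for a plain TCP server; the echo logic in handleConn is just a placeholder for real per-connection work:

package main

import (
	"bufio"
	"log"
	"net"
)

func handleConn(c net.Conn) {
	defer c.Close()
	r := bufio.NewReader(c)
	for {
		line, err := r.ReadString('\n') // blocks, but only this goroutine
		if err != nil {
			return // client closed the connection or an error occurred
		}
		if _, err := c.Write([]byte(line)); err != nil { // echo it back
			return
		}
	}
}

func main() {
	ln, err := net.Listen("tcp", ":8080")
	if err != nil {
		log.Fatal(err)
	}
	for {
		conn, err := ln.Accept()
		if err != nil {
			log.Fatal(err)
		}
		go handleConn(conn) // one goroutine per accepted connection
	}
}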
net/http.Server has this behaviour built-in as it creates a goroutine to serve each incoming client request (actually, for HTTP/1.x, two or even three goroutines are created per connection, but it's invisible to HTTP request handlers).
Now, we've just covered the basics. Of course, there might be legitimate reasons to have extra goroutines handle tasks that need to be carried out to complete a request, and that's what @Volker referred to.
More info:
"What color is your function?" — a classical essay dealing with I/O multiplexing implemented as a library vs it being implemented in the core.
"Go's work-stealing scheduler"; also see this and this and this design doc.
State threads library which implements the approach quite similar to that of Go, just on much lower level. Its documentation is quite insightful on the approach implemented in Go.
libtask is a much more recent stab at the same problem, by one of Go's creators.
Hi guys, I'm moving from Python 3 to Go and trying to rewrite a library I created to get better performance.
I'm facing a problem because I'm a noob in Go: I'm using a limited API to download hundreds of JSONs, and I want to make as few requests as possible.
While downloading those JSONs, some of the URLs are duplicated. The first idea I had is to pass a map[stringLink]*myJsonReceived between my downloading functions (goroutines); before downloading, each goroutine checks whether the link is already being processed by another one. Instead of requesting it again and wasting bandwidth and API calls, it would just wait for the other goroutine to finish downloading it and then get it from the map.
I have a few options:
1) The goroutine checks whether the link is in the map; if so, it checks every 0.05s whether the pointer in the map is still nil or already contains the JSON. (Probably the worst way, but it works.)
2) Change the map passed between goroutines to map[stringlink]chan myjson. This is probably the most efficient way, but I have no idea how to send a single message to a channel and have it received by multiple awaiting goroutines.
3) Use option (2) but add a counter to the struct: each time a goroutine finds that the URL is already requested, it adds 1 to the counter and awaits the response from the channel; when the downloading goroutine completes, it sends X messages to the channel. But this way I'd have to add too many locks around the map, which wastes performance.
Note: I need the map at the end of all the functions' execution, to save the downloaded JSONs into my database so they are never downloaded again.
Thank you all in advance for your help.
What I would do to solve your task is use a goroutine pool. There would be a producer which sends URLs on a channel, and the worker goroutines would range over this channel to receive URLs to handle (fetch). Once a URL is done, the same worker goroutine could also save it into the database, or deliver the result on a result channel to a "collector" goroutine which could do the saving sequentially, should that be a requirement.
This construction by design makes sure that every URL sent on the channel is received by only one worker goroutine, so you don't need any other synchronization (which you would need if you used a shared map). For more about channels, see What are golang channels used for?
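To make that concrete, here is a minimal sketch of the construction; fetch, the worker count, and the URL list are placeholders. The producer is also the natural single place to drop duplicate URLs before they ever reach a worker:

package main

import (
	"fmt"
	"sync"
)

// fetch is a placeholder for the real HTTP request / JSON download.
func fetch(url string) string {
	return "json for " + url
}

func main() {
	urls := make(chan string)    // producer sends URLs here
	results := make(chan string) // workers deliver results here

	var wg sync.WaitGroup
	for i := 0; i < 4; i++ { // 4 worker goroutines
		wg.Add(1)
		go func() {
			defer wg.Done()
			for u := range urls { // each URL is received by exactly one worker
				results <- fetch(u)
			}
		}()
	}

	// Producer: deduplicate and send the URLs, then close the channel so workers exit.
	go func() {
		seen := make(map[string]bool) // only the producer touches this, so no lock is needed
		for _, u := range []string{"a", "b", "a", "c"} {
			if seen[u] {
				continue
			}
			seen[u] = true
			urls <- u
		}
		close(urls)
	}()

	// Close results once all workers are done.
	go func() {
		wg.Wait()
		close(results)
	}()

	// Collector: receives results sequentially (e.g. to save them to a database).
	for r := range results {
		fmt.Println(r)
	}
}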
Go favors communication between goroutines (channels) over shared variables. Quoting from Effective Go: Share by communicating:
Do not communicate by sharing memory; instead, share memory by communicating.
For an example how you can create worker pools, see Is this an idiomatic worker thread pool in Go?
Due to Go's philosophy a channel should be closed by the sender only. When a channel is bidirectional where should it be closed?
The question is a little hard to interpret, because a "bidirectional" channel in Go is simply one whose type allows both sending and receiving on it; the data itself still flows in only one direction, from a writer to a reader.
What you can have in Go is multiple readers or writers on a channel. Whether this makes sense depends a little on the context. If you have multiple writers, you need some kind of synchronization for the close operation, e.g. a mutex. However, you would then also need to lock it before each write operation to ensure that you never write to a closed channel. If the receivers don't actually need to learn that the channel was closed, you can also simply omit the close, as the garbage collector collects unclosed channels just fine.
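For completeness, one common way to handle the multiple-writers case is to let no writer close the channel at all, and instead have a separate goroutine close it once a sync.WaitGroup reports that all writers are done. A minimal sketch:

package main

import (
	"fmt"
	"sync"
)

func main() {
	ch := make(chan int)
	var wg sync.WaitGroup

	for i := 0; i < 3; i++ { // three writers
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			ch <- id
		}(i)
	}

	// Single place that closes the channel, after every writer has finished.
	go func() {
		wg.Wait()
		close(ch)
	}()

	for v := range ch { // the reader sees the close as the end of the range loop
		fmt.Println(v)
	}
}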
Given that I have a slice of structs of type User:
Users := make([]User, 0)
I am listening for TCP connections, and when a user connects, I'm adding an new user to this slice.
The way I've done this, is by setting up a NewUsers channel
NewUsers := make(chan User)
Upon new TCP connects, a User gets sent to this channel, and a central function waits for a User to arrive to add it to the Users slice.
But now I would like multiple subsystems (packages/functions) to use this list of Users. One function might simply want to receive a list of users, while a different function might want to broadcast messages to every user, or just users matching a certain condition.
How do multiple functions (which are possibly executed from different goroutines) safely access the list of users? I see two possible ways:
Every subsystem that needs access to this list gets its own AddUser channel and maintains its own slice of users, and something broadcasts new users to every one of these channels.
Block access with a Mutex
Option 1 seems very convoluted and would generate a fair bit of duplication, but my understanding is that mutexes are best avoided if you try to stick to the "Share Memory By Communicating" mantra.
The idiomatic Go way to share data between concurrent activities is summed up in this:
Do not communicate by sharing memory; instead, share memory by
communicating.
Andrew Gerrand blogged about this, for example.
It need not be overly-complex; you can think of designing internal microservices, expressed using goroutines with channels.
In your case, this probably means designing a service element to contain the master copy of the list of users.
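A minimal sketch of such a service element, with illustrative names: one goroutine owns the master copy of the user list, and everything else talks to it only through channels:

package main

import "fmt"

type User struct{ Name string }

type userService struct {
	addCh  chan User
	listCh chan chan []User // requests for a snapshot of the list
}

func startUserService() *userService {
	s := &userService{
		addCh:  make(chan User),
		listCh: make(chan chan []User),
	}
	go func() {
		var users []User // master copy, touched only by this goroutine
		for {
			select {
			case u := <-s.addCh:
				users = append(users, u)
			case reply := <-s.listCh:
				snapshot := make([]User, len(users))
				copy(snapshot, users) // hand out a copy, not the slice itself
				reply <- snapshot
			}
		}
	}()
	return s
}

func main() {
	s := startUserService()
	s.addCh <- User{Name: "alice"}

	reply := make(chan []User)
	s.listCh <- reply
	fmt.Println(<-reply)
}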
The main advantages of the Go/CSP strategy are that
concurrency is a design choice, along with your other aspects of design
only local knowledge is needed to understand the concurrent behaviour: this arises because a goroutine can itself consist of internal goroutines, and this applies all the way down if needed. Understanding the external behaviour of a higher-level goroutine depends only on its interfaces, not on its hidden internals.
But...
There are times when a safely shared data structure (protected by mutexes) will be sufficient now and always. It might then be argued that the extra complexity of goroutines and channels is a non-requirement.
A safely shared list data structure is something you will find several people have provided as open-source APIs. (I have one myself - see the built-ins in runtemplate).
The mutex approach is the safest and most manageable way to solve this problem, and it is also the fastest.
Channels are complex beasts on the inside and are much slower than an RWMutex-guarded map or slice.
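For comparison, a minimal sketch of that RWMutex-guarded approach applied to the user list from the question; the type and method names are illustrative:

package main

import (
	"fmt"
	"sync"
)

type User struct{ Name string }

type UserList struct {
	mu    sync.RWMutex
	users []User
}

func (l *UserList) Add(u User) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.users = append(l.users, u)
}

// Snapshot returns a copy so callers can iterate without holding the lock.
func (l *UserList) Snapshot() []User {
	l.mu.RLock()
	defer l.mu.RUnlock()
	out := make([]User, len(l.users))
	copy(out, l.users)
	return out
}

func main() {
	var l UserList
	l.Add(User{Name: "alice"})
	fmt.Println(l.Snapshot())
}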