A channel may be used by multiple goroutines. Is getting the length of the channel with len(channel) from some of those goroutines thread-safe?
It depends on your use case.
It is indeed safe to call, but the result can already be stale by the time you use it if other goroutines are sending to or receiving from that channel.
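As a rough illustration of that point, the sketch below prints len and cap in a single goroutine, where the value is reliable, and then uses a select with a default case instead of a "check len, then receive" sequence. The helper name tryReceive is made up for this example.

```go
package main

import "fmt"

// tryReceive (hypothetical helper) receives a value only if one is ready.
// Unlike "if len(ch) > 0 { v := <-ch }", the check and the receive happen
// as one step, so another goroutine cannot drain the channel in between.
func tryReceive(ch <-chan int) (int, bool) {
	select {
	case v := <-ch:
		return v, true
	default:
		return 0, false
	}
}

func main() {
	ch := make(chan int, 2)
	ch <- 7

	// Reliable here only because no other goroutine touches ch.
	fmt.Println(len(ch), cap(ch)) // 1 2

	v, ok := tryReceive(ch)
	fmt.Println(v, ok) // 7 true

	_, ok = tryReceive(ch)
	fmt.Println(ok) // false: the channel is empty and we did not block
}
```

In a concurrent program, len(ch) is still fine for things like metrics or logging, where a slightly outdated number does no harm.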
Related
Hi guys, I'm moving from Python 3 to Go, so I'm trying to rewrite a library I created to get better performance.
I'm facing a problem because I'm new to Go. I'm using a rate-limited API to download hundreds of JSON documents, and I want to make as few requests as possible.
While downloading those JSON documents, some of the URLs are duplicated. My first idea is to pass a map[stringLink]*myJsonReceived between my downloading functions (goroutines). Before downloading, each goroutine checks whether the link is already being processed by another one; if so, instead of requesting it again and wasting bandwidth and API calls, it should wait for the other goroutine to finish downloading and then get the result from the map.
I have a few options:
1) The goroutine checks whether the link is in the map; if so, it checks every 0.05s whether the pointer in the map is still nil or now contains the JSON. (Probably the worst way, but it works.)
2) Change the map passed between goroutines to map[stringlink]chan myjson. This is probably the most efficient way, but I have no idea how to send a single message on a channel and have it received by multiple waiting goroutines.
3) Use option (2) but add a counter to the struct: each time a goroutine finds that the URL has already been requested, it increments the counter and waits for the response from the channel; when the downloading goroutine completes, it sends X messages on the channel. But this would make me add too many locks around the map, which is a waste of performance.
Note: I need the map once all the functions have finished, so I can save the downloaded JSON into my database and never download it again.
Thank you all in advance for your help.
What I would do to solve your task is use a goroutine pool. There would be a producer which sends URLs on a channel, and the worker goroutines would range over this channel to receive URLs to handle (fetch). Once a URL is "done", the same worker goroutine could also save it into the database, or deliver the result on a result channel for a "collector" goroutine, which could do the save sequentially should that be a requirement.
This construction by design makes sure every URL sent on the channel is received by only one worker goroutine, so you do not need any other synchronization (which you would need in case of using a shared map). For more about channels, see What are golang channels used for?
Go favors communication between goroutines (channels) over shared variables. Quoting from Effective Go: Share by communicating:
Do not communicate by sharing memory; instead, share memory by communicating.
For an example of how to create worker pools, see Is this an idiomatic worker thread pool in Go?
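A minimal sketch of the producer / worker / collector layout described above. Everything named here is an assumption for illustration: fetch stands in for the real HTTP download, the result type and fetchAll are made up, and filtering duplicates in the producer with a local "seen" set is one possible way to make sure each URL is requested only once.

```go
package main

import (
	"fmt"
	"sync"
)

// result pairs a URL with its downloaded body (hypothetical type).
type result struct {
	url  string
	body string
}

// fetch is a placeholder for the real download call.
func fetch(url string) string { return "json for " + url }

// fetchAll downloads every distinct URL once using a fixed number of workers
// and returns the collected results as a map.
func fetchAll(urls []string, workers int) map[string]string {
	jobs := make(chan string)
	results := make(chan result)

	// Workers: each URL sent on jobs is received by exactly one of them.
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for u := range jobs {
				results <- result{url: u, body: fetch(u)}
			}
		}()
	}

	// Producer: skips duplicates. The seen map is touched by this goroutine
	// only, so it needs no lock.
	go func() {
		seen := make(map[string]bool)
		for _, u := range urls {
			if !seen[u] {
				seen[u] = true
				jobs <- u
			}
		}
		close(jobs)
	}()

	// Close results once every worker has finished.
	go func() {
		wg.Wait()
		close(results)
	}()

	// Collector: the only goroutine that writes to the output map.
	out := make(map[string]string)
	for r := range results {
		out[r.url] = r.body
	}
	return out
}

func main() {
	m := fetchAll([]string{"a", "b", "a", "c", "b"}, 3)
	fmt.Println(len(m)) // 3
}
```

The returned map can then be saved to the database in one place, after all downloads have finished.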
I was wondering what would happen if multiple goroutines are executing select over a set of channels where one or some of them are shared among them, and, while all of them are waiting, the shared channel becomes available.
Will the runtime handle this case and allow only one goroutine to access the channel and do the read/write?
The comments above all answer it. You can also write some code and see for yourself, something along these lines: https://play.golang.org/p/4ZQLwO9wvw
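In case the playground link is unavailable, here is a separate small sketch (not the code behind that link) that can be used to check the behaviour. Several goroutines select on the same channel, and the program counts how many receives happen in total. The function name countDeliveries and the done channel are assumptions for this example.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// countDeliveries starts several goroutines that all select on one shared
// channel, sends the given number of values, and reports how many receives
// happened in total.
func countDeliveries(receivers, values int) int {
	shared := make(chan int)
	done := make(chan struct{})
	var received int64

	var wg sync.WaitGroup
	for i := 0; i < receivers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for {
				select {
				case <-shared:
					atomic.AddInt64(&received, 1)
				case <-done:
					return
				}
			}
		}()
	}

	// Each send completes only when exactly one receiver has taken the value.
	for v := 0; v < values; v++ {
		shared <- v
	}
	close(done)
	wg.Wait()
	return int(received)
}

func main() {
	fmt.Println(countDeliveries(4, 10)) // 10: every value is received exactly once
}
```

Which goroutine gets a particular value is not specified, but each value goes to only one of them.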
The question is in the title. Let's say I have several goroutines (more than 100), all of which eventually send data to one chan (call it mychan := make(chan int)). Another goroutine does <-mychan in an endless for loop. Is this okay, or can the chan lose some data? Should I use a buffered chan instead? Or should I perhaps create a chan and a "daemon" goroutine that extracts messages for each worker goroutine?
If something has been successfully sent into the channel then no, it can't be lost in a correctly working environment (if you're tampering with your memory, or you have bit flips due to cosmic rays, then don't expect anything, of course).
A message is successfully sent when ch <- x returns. If the send panics instead, the message was never really sent, and if you don't recover then you could claim it's lost (however, it would be lost due to application logic). A panic can happen if the channel is closed or, say, you're out of memory.
Similarly, if the sender is putting values into the channel in non-blocking mode (by using select with a default case), you should have a sufficient buffer in your channel, because messages can be "lost" (although somewhat intentionally). For example, signal.Notify works this way:
Package signal will not block sending to c: the caller must ensure that c has sufficient buffer space to keep up with the expected signal rate.
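A small sketch of such a non-blocking send, to make the "intentional loss" concrete. The helper sendNonBlocking is made up for this example and is not how package signal is implemented internally; it only shows the same select/default idea.

```go
package main

import "fmt"

// sendNonBlocking tries to send n values without ever blocking and reports
// how many were accepted by the channel and how many were dropped.
func sendNonBlocking(ch chan<- int, n int) (sent, dropped int) {
	for i := 0; i < n; i++ {
		select {
		case ch <- i:
			sent++
		default:
			// No receiver is ready and the buffer is full: drop the value.
			dropped++
		}
	}
	return sent, dropped
}

func main() {
	ch := make(chan int, 3) // room for 3 values, nobody is receiving
	sent, dropped := sendNonBlocking(ch, 5)
	fmt.Println(sent, dropped) // 3 2
}
```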
No, they can't be lost.
While the language spec does not in any way impose any particular implementation on channels, you can think of them as semaphores protecting either a single value (for the single message) or an array/list of them (for buffered channels).
The semantics are then enforced in such a way that as soon as a goroutine wants to send a message to a channel, it tries to acquire a free data slot using that semaphore, and then either succeeds at sending—there's a free slot for its message—or blocks—when there isn't. As soon as such a slot appears—someone has received an existing message—the sending succeeds and the sending goroutine gets unblocked.
This is a simplified explanation. In other words, channels in Go are not like message queues, which are often happy to lose messages.
On a side note, I'm not really sure what happens if the receiver panics in some specific state when it's about to receive your message. In other words, I'm not sure whether Go guarantees that the message is either sent or not in the presence of a receiver panicking in an unfortunate moment.
Oh, and there's that grey area of the main goroutine exiting (the one running the main.main() function): the spec states clearly that the main goroutine does not wait for any other goroutines to complete when it exits. So unless you somehow arrange for a synchronized, controlled shutdown of all your spawned goroutines, I believe they may lose messages. On the other hand, in this case the world is collapsing anyway…
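The setup from the question, together with the controlled shutdown mentioned above, could look roughly like this. It is a sketch under the assumption that each worker sends exactly one int; the function name sumFromWorkers is made up. The receiver runs in the calling goroutine, and the channel is closed only after every sender has finished, so nothing is cut off by main returning.

```go
package main

import (
	"fmt"
	"sync"
)

// sumFromWorkers starts n goroutines that each send one value on an
// unbuffered channel and returns the sum of everything received.
func sumFromWorkers(n int) int {
	mychan := make(chan int)

	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(v int) {
			defer wg.Done()
			mychan <- v // blocks until the receiver takes the value
		}(i)
	}

	// Close the channel once all senders are done, so the range loop ends.
	go func() {
		wg.Wait()
		close(mychan)
	}()

	sum := 0
	for v := range mychan {
		sum += v
	}
	return sum
}

func main() {
	fmt.Println(sumFromWorkers(100)) // 4950: 0+1+...+99, no value is lost
}
```

An unbuffered channel is enough here; a buffer would only change how often the senders have to wait.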
A message cannot be lost, but it can fail to be sent. The order of goroutine execution is not defined, so your endless for loop may receive from only one worker for a long time, and may even sleep if it isn't in the main thread. To be sure your queue works in a regular fashion, you had better explicitly receive the messages for each worker in main.
I have a pipeline with goroutines connected by channels so that each goroutine will trigger another one until all have run. Put even simpler, imagine two goroutines A and B so that when A is done it should tell B it can run.
It's working fine and I have tried a few variants as I have learnt more about pipelines in Go.
Currently I have a signalling channel
ch := make(chan struct{})
go A(ch)
go B(ch)
...
that B blocks on
func B(ch <-chan struct{}) {
<-ch
...
and A closes when done
func A(ch chan struct{}) {
defer close(ch)
...
}
This works fine and I also tried, instead of closing, sending an empty struct struct{} in A().
Is there any difference between closing the channel or sending the empty struct? Is either way cheaper / faster / better?
Naturally, sending any other type in the channel takes up "some" amount of memory, but how is it with the empty struct? Close is just part of the channel so not "sent" as such even if information is passed between goroutines.
I'm well aware of premature optimization. This is only to understand things, not to optimize anything.
Maybe there's an idiomatic Go way to do this even?
Thanks for any clarification on this!
Closing a channel indicates that there will be no more sends on that channel. It's usually preferable, since you would get a panic after that point in the case of an inadvertent send or close (a programming error). A close can also signal multiple receivers that there are no more messages, which you can't coordinate as easily by just sending sentinel values.
Naturally, sending any other type in the channel takes up "some" amount of memory, but how is it with the empty struct?
There's no guarantee that it does take any extra memory in an unbuffered channel (it's completely an implementation detail). The send blocks until the receive can proceed.
Close is just part of the channel so not "sent" as such even if information is passed between goroutines.
There's no optimization here, close is simply another type of message that can be sent to a channel.
Each construct has a clear meaning, and you should use the appropriate one.
Send a sentinel value if you need to signal one receiver, and keep the channel open to send more values.
Close the channel if this is the final message, possibly signal multiple receivers, and it would be an error to send or close again.
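A short sketch of the difference, with made-up names (wakeAll, one). Closing wakes every goroutine blocked on the channel, while a single sent value is consumed by exactly one receive.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// wakeAll blocks the given number of goroutines on one channel, closes it,
// and reports how many of them were released.
func wakeAll(receivers int) int {
	done := make(chan struct{})
	var woken int64

	var wg sync.WaitGroup
	for i := 0; i < receivers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			<-done // returns immediately once done is closed
			atomic.AddInt64(&woken, 1)
		}()
	}

	close(done) // one close signals all receivers
	wg.Wait()
	return int(woken)
}

func main() {
	fmt.Println(wakeAll(5)) // 5

	// A sent value, in contrast, is consumed by a single receive.
	one := make(chan struct{}, 1)
	one <- struct{}{}
	<-one
	select {
	case <-one:
		fmt.Println("second receive succeeded")
	default:
		fmt.Println("second receive would block") // this line is printed
	}
}
```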
Multiple goroutines can receive from a closed channel, and they will never block. That is the main advantage: it is a one-to-many pattern.
Sending on finish := make(chan struct{}) can be used in a many-to-one pattern, when many concurrent runners want to report that they are done, and the outsider will not panic.
It's not about memory consumption.
I am trying to write a queue and I'd need to "grow" my buffered chans, is there a way to do that without having to create a new one and moving the elements to the new one?
It is not possible with standard channels. However by using an intermediate goroutine with a few tricks you can make something that's effectively equivalent. It will, however, be somewhat slower than a native channel. This is implemented as the ResizableChannel in the channels package (disclaimer: I wrote it).
godoc: https://godoc.org/github.com/eapache/channels#ResizableChannel
github: https://github.com/eapache/channels/
Why would you want to grow the chan size? Are you looking to have a chan where you can keep writing regardless whether there are readers or not?
If so, you should use a goroutine which owns the queue and two chans (a read chan and a write chan). The goroutine keeps a slice internally with all the written items (received via the write chan), and it keeps attempting to write to the read chan, which blocks until there are readers reading from it.
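A minimal sketch of that idea, assuming int items. The name newUnboundedQueue is made up, and this is not the implementation from the channels package mentioned earlier. It relies on the fact that a nil channel in a select case is never chosen, so the "send" case is switched off while the slice is empty.

```go
package main

import "fmt"

// newUnboundedQueue returns a write side and a read side. A goroutine in
// between owns a slice, so writes never block for long even with no reader.
// Closing the write side closes the read side once the slice is drained.
func newUnboundedQueue() (chan<- int, <-chan int) {
	in := make(chan int)
	out := make(chan int)

	go func(in <-chan int) {
		defer close(out)
		var queue []int
		for in != nil || len(queue) > 0 {
			// Enable the send case only when there is something to send.
			var send chan<- int
			var head int
			if len(queue) > 0 {
				send = out
				head = queue[0]
			}
			select {
			case v, ok := <-in:
				if !ok {
					in = nil // writer closed: stop receiving, keep draining
					continue
				}
				queue = append(queue, v)
			case send <- head:
				queue = queue[1:]
			}
		}
	}(in)

	return in, out
}

func main() {
	in, out := newUnboundedQueue()
	for i := 0; i < 1000; i++ {
		in <- i // no reader yet, but the writes still go through
	}
	close(in)

	n := 0
	for range out {
		n++
	}
	fmt.Println(n) // 1000
}
```

The cost is that memory use is unbounded if readers never catch up, which is exactly what a fixed channel buffer protects against.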
hope this helps