I have a buffered channel that is read by multiple goroutines (4 in this example).
queue := make(chan string, 10000) // a large buffered channel
Each goroutine checks the number of elements available in the channel and processes them all.
for i := 0; i < 4; i++ { // spawn 4 goroutines
	go func() {
		for {
			for elem := range queue {
				// do something with the elem from the channel
			}
		}
	}() // the trailing () is required to actually start the goroutine
}
Will multiple goroutines collide on the reads? In other words, could different goroutines grab the same element from the channel, or, while one goroutine is reading the buffer, could the other goroutines already have read and processed some of the elements? How can I block other goroutines from reading while one goroutine is reading?
Simple answer: no. Elements placed on a Go channel can only be read once, regardless of how many goroutines are trying to read off the channel at the same time, and that applies regardless of whether the channel is buffered or not. There's no possibility that an element will be read by two different goroutines unless that element was sent to the channel more than once. The only thing that buffering does, with regards to channel semantics, is remove the necessity for the read and write to occur synchronously.
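A minimal sketch of that guarantee, assuming a hypothetical helper consumeAll that is not part of the question's code: several goroutines range over one buffered channel, and a mutex-protected map counts how often each value arrives.

```go
package main

import (
	"fmt"
	"sync"
)

// consumeAll (hypothetical helper) fills a buffered channel with values,
// starts `workers` goroutines that all range over it, and returns how many
// times each value was received. values must be non-empty.
func consumeAll(values []string, workers int) map[string]int {
	queue := make(chan string, len(values))
	for _, v := range values {
		queue <- v
	}
	close(queue) // lets every range loop end once the buffer is drained

	var (
		mu   sync.Mutex
		seen = make(map[string]int)
		wg   sync.WaitGroup
	)
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for elem := range queue {
				mu.Lock() // the map needs a lock; the channel itself does not
				seen[elem]++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return seen
}

func main() {
	seen := consumeAll([]string{"a", "b", "c", "d", "e"}, 4)
	fmt.Println(len(seen), seen["a"], seen["e"]) // prints "5 1 1"
}
```

Which worker receives which value is not defined, but every count in the map is 1.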
In other words, could different goroutines grab the same element from the channel, or, while one goroutine is reading the buffer, could the other goroutines already have read and processed some of the elements?
Nope...
I believe the misunderstanding lies in the difference between the concepts of non-blocking and thread-safe.
Non blocking (buffered) channels
Sends to a buffered channel block only when the buffer is full.
Buffered channels, as the name says, have a buffer that stores some number of items. A receiving goroutine can read without waiting for a sending goroutine, provided something has already been written to the channel, and a sender can write without waiting for a receiver while the buffer has room. An unbuffered channel has no buffer at all, so each send blocks until a receiver takes the item. The "blocking/non-blocking" concept is unrelated to the "thread-safe" concept: non-blocking doesn't mean not thread-safe.
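The blocking rules can be probed with a non-blocking send. This is a small sketch; trySend is a hypothetical helper that uses select with a default case to report whether a send would have blocked.

```go
package main

import "fmt"

// trySend (hypothetical helper) attempts a send without blocking and
// reports whether it succeeded.
func trySend(ch chan int, v int) bool {
	select {
	case ch <- v:
		return true
	default:
		return false // the send would have blocked
	}
}

func main() {
	unbuffered := make(chan int)
	buffered := make(chan int, 2)

	fmt.Println(trySend(unbuffered, 1))       // false: no receiver is waiting
	fmt.Println(trySend(buffered, 1))         // true: the buffer has room
	fmt.Println(trySend(buffered, 2))         // true: the buffer is now full
	fmt.Println(trySend(buffered, 3))         // false: the buffer is full
	fmt.Println(len(buffered), cap(buffered)) // prints "2 2"
}
```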
Thread safety of Go channels
Go channels are thread-safe in all available ways of use. A channel is a reference type, so once allocated with make it can be passed by value: every copy refers to the same underlying channel. Each item placed on the channel is delivered to exactly one receiver, so it can't be read twice.
Related
I am going through a tutorial on building web servers using go.
Instead of directly using the http.ListenAndServe() function, the author creates an http.Server struct.
He then proceeds by:
creating a buffered channel for listening for errors
serverErrors := make(chan error, 1)
spawning the http listening goroutine that binds to that channel
go func(){
fmt.Println("starting...")
serverErrors <- api.ListenAndServe()
}()
The reason for using a buffered channel is, according to the instructor,
so that the goroutine can exit if we do not collect this error
Further down in the program there is indeed a select block where the errors coming from this channel are collected.
Can anyone please help me understand how the goroutine gets to exit if we don't collect the error?
What would be the practical difference had we used an unbuffered channel?
Short answer:
For any channel (buffered or not), channel reads block if nothing is written to the channel.
For non-buffered channels, channel writes will block if no one is listening.
It is a common technique with error-channels (since only one item will ever be written to the channel), to make it a buffered channel of size 1. It ensures the write will happen without blocking - and the writer goroutine can continue on its way and return.
Therefore the service does not rely on the caller reading from the error channel to perform its cleanup.
Note: for a channel to be reclaimed by the garbage collector, it only has to become unreachable. It does not need to be fully drained, nor does it need to be closed. Once neither end references it any longer, it will be GC'ed.
If you refer to the code for ListenAndServe(), you'll notice the following comments on how it works. Quoting from there itself:
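The difference a capacity of 1 makes can be sketched as follows. senderExits is a hypothetical helper and the error value is made up; the point is only whether the sending goroutine finishes when nobody ever receives.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// senderExits (hypothetical helper) starts a goroutine that sends one error
// on a channel with the given capacity, and reports whether that goroutine
// finished within 100ms even though nobody receives from the channel.
func senderExits(capacity int) bool {
	errs := make(chan error, capacity)
	done := make(chan struct{})
	go func() {
		defer close(done)
		errs <- errors.New("listener failed") // made-up error value
	}()
	select {
	case <-done:
		return true
	case <-time.After(100 * time.Millisecond):
		return false // the goroutine is still blocked on the send
	}
}

func main() {
	fmt.Println(senderExits(1)) // true: the value is parked in the buffer
	fmt.Println(senderExits(0)) // false: the goroutine is stuck (leaked)
}
```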
// ListenAndServe always returns a non-nil error. After Shutdown or Close,
// the returned error is ErrServerClosed.
Also,
// When Shutdown is called, Serve, ListenAndServe, and
// ListenAndServeTLS immediately return ErrServerClosed. Make sure the
// program doesn't exit and waits instead for Shutdown to return.
Your select block is waiting for Shutdown (or an error), since you're handling the server's shutdown gracefully; it doesn't let the main goroutine exit before the server has closed gracefully.
In the case of func (srv *Server) Close() (e.g. most people use defer srv.Close(), right?):
// Close immediately closes all active net.Listeners and any
// connections in state StateNew, StateActive, or StateIdle. For a
// graceful shutdown, use Shutdown.
// Close returns any error returned from closing the Server's
// underlying Listener(s).
So the same explanation as above applies to using the select block.
Now, let's categorize channels as buffered and unbuffered. If we care about a guarantee that the signal has been delivered (communication over the channel), then an unbuffered channel ensures it. A buffered channel (size = 1, as in your case) also ensures delivery, but the delivery might be delayed.
Let's elaborate unbuffered channels:
A send operation on an unbuffered channel blocks the sending goroutine until another
goroutine executes a corresponding receive on that same channel, at which point the value
is transmitted and both goroutines may continue
Conversely, if the receive operation (<-chan) is attempted before the send, then the
receiving goroutine is blocked until the corresponding send operation occurs on the
same channel in another goroutine.
The points above show the synchronous nature of unbuffered channels.
Remember, func main() is also a goroutine.
Let's elaborate buffered channels:
A send operation on a buffered channel pushes an element at the back of the queue,
and a receive operation pops an element from the front of the queue.
1. If the channel is full, the send operation blocks its goroutine until space is made available by another goroutine's receive.
2. If the channel is empty, a receive operation blocks until a value is sent by another goroutine.
So in your case the size of the channel is 1. The sender goroutine can send without blocking because the buffer has room for its one value; it does not have to wait for the receiver. But, as I mentioned, delivery on a channel of size 1 may be delayed, since we don't know how long it will take for the receiving goroutine to get around to reading it.
Hence, a select block is used to make the receiving side wait for the result. And from the referenced code's documentation, you can see
// Make sure the program doesn't exit and waits instead for Shutdown to return.
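A rough sketch of how such a select block is commonly arranged, under stated assumptions: the address, the 5-second timeout, and the names run and simulateShutdown are invented here and are not taken from the tutorial. So that the sketch terminates on its own, it places an interrupt value on the signal channel instead of waiting for a real Ctrl-C.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"os"
	"os/signal"
	"time"
)

// run (hypothetical) starts the server and blocks until it fails or a
// shutdown signal arrives.
func run(api *http.Server, shutdown <-chan os.Signal) error {
	serverErrors := make(chan error, 1) // buffered: the send never blocks
	go func() {
		serverErrors <- api.ListenAndServe()
	}()

	select {
	case err := <-serverErrors:
		return err
	case <-shutdown:
		ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
		defer cancel()
		// ListenAndServe now returns ErrServerClosed into the buffer.
		// Nobody reads it, yet the goroutine above can still exit.
		return api.Shutdown(ctx)
	}
}

// simulateShutdown (hypothetical) wires run to os/signal and fakes Ctrl-C.
func simulateShutdown() error {
	shutdown := make(chan os.Signal, 1)
	signal.Notify(shutdown, os.Interrupt)
	defer signal.Stop(shutdown)
	shutdown <- os.Interrupt // stand-in for a real interrupt

	api := &http.Server{Addr: "127.0.0.1:0"} // assumed address, any free port
	return run(api, shutdown)
}

func main() {
	fmt.Println("exit:", simulateShutdown())
}
```

With an unbuffered serverErrors channel, the goroutine calling ListenAndServe would stay blocked on its send after the shutdown branch returns, because no receiver is left.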
Also, for more clarity, you can refer to: Behaviour of channels
The author explains it with pure clarity.
The question is in the title. Let's say I have several goroutines (more than 100), all of which eventually send data to one chan (call it mychan := make(chan int)). One other goroutine does <-mychan in an endless for loop. Is this okay, or can the chan lose some data? Should I use a buffered chan instead? Or should I perhaps create a chan and a "daemon" goroutine that extracts the messages for each worker goroutine?
If something has been successfully sent into the channel then no, it can't be lost in correctly working environment (I mean if you're tampering with your memory or you have bit flips due to cosmic rays then don't expect anything of course).
A message is successfully sent when ch <- x returns. Otherwise, if it panics, it hasn't really been sent, and if you don't recover then you could claim it's lost (however, it would be lost due to application logic). A panic can happen if the channel is closed or, say, you're out of memory.
Similarly, if the sender puts messages into the channel in non-blocking mode (by using select), the channel should have a sufficient buffer, because otherwise messages can be "lost" (although somewhat intentionally). For example, signal.Notify works this way:
Package signal will not block sending to c: the caller must ensure that c has sufficient buffer space to keep up with the expected signal rate.
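The same effect can be reproduced in a few lines. This is a sketch with a hypothetical helper notifyAll; it is not how package signal is implemented, only the same non-blocking send pattern.

```go
package main

import "fmt"

// notifyAll (hypothetical helper) tries to send n values without blocking
// and returns how many were dropped because the channel could not take them.
func notifyAll(c chan int, n int) (dropped int) {
	for i := 0; i < n; i++ {
		select {
		case c <- i:
		default:
			dropped++ // nobody is receiving and the buffer is full
		}
	}
	return dropped
}

func main() {
	fmt.Println(notifyAll(make(chan int, 3), 5)) // prints 2
	fmt.Println(notifyAll(make(chan int), 5))    // prints 5
}
```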
No, they can't be lost.
While the language spec does not in any way impose any particular implementation on channels, you can think of them as semaphores protecting either a single value (for the single message) or an array/list of them (for buffered channels).
The semantics are then enforced in such a way that as soon as a goroutine wants to send a message to a channel, it tries to acquire a free data slot using that semaphore, and then either succeeds at sending—there's a free slot for its message—or blocks—when there isn't. As soon as such a slot appears—someone has received an existing message—the sending succeeds and the sending goroutine gets unblocked.
This is a simplified explanation. In other words, channels in Go are not like message queues, which are usually happy to lose messages.
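For the scenario in the question, a small sketch (fanIn is a hypothetical name) starts many senders on one unbuffered channel and checks that a single receiver gets every value.

```go
package main

import "fmt"

// fanIn (hypothetical helper) starts n senders on one unbuffered channel
// and receives exactly n values from it.
func fanIn(n int) (count, sum int) {
	mychan := make(chan int)
	for i := 1; i <= n; i++ {
		go func(v int) {
			mychan <- v // blocks until the receiver takes the value
		}(i)
	}
	for i := 0; i < n; i++ {
		sum += <-mychan
		count++
	}
	return count, sum
}

func main() {
	fmt.Println(fanIn(100)) // prints "100 5050"
}
```

The arrival order varies between runs, but the count and the sum do not.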
On a side note, I'm not really sure what happens if the receiver panics in some specific state when it's about to receive your message. In other words, I'm not sure whether Go guarantees that the message is either sent or not in the presence of a receiver panicking in an unfortunate moment.
Oh, and there's that grey area of the main goroutine exiting (the one running the main.main() function): the spec states clearly that the main goroutine does not wait for any other goroutines to complete when it exits. So unless you somehow arrange for a synchronized, controlled shutdown of all your spawned goroutines, I believe they may lose messages. On the other hand, in this case the world is collapsing anyway…
A message cannot be lost, but it can fail to be sent. The order in which goroutines execute is not defined, so your endless for loop may keep receiving from the same worker all the time, and it can even sit idle if it isn't running in the main goroutine. To be sure your queue works in a regular fashion, you'd better receive the messages for each worker explicitly in main.
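One way to arrange such a controlled shutdown, sketched with assumed names (process, msgs, done): a sync.WaitGroup tracks the senders, the channel is closed only after all of them have finished, and the caller waits for the receiver's final report before returning.

```go
package main

import (
	"fmt"
	"sync"
)

// process (hypothetical helper) runs `workers` senders and one receiver
// goroutine, and returns only after the receiver has handled every message.
func process(workers int) int {
	msgs := make(chan int)
	done := make(chan int)

	go func() { // receiver
		handled := 0
		for range msgs { // ends when msgs is closed
			handled++
		}
		done <- handled
	}()

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			msgs <- id
		}(i)
	}
	wg.Wait()     // every send has completed
	close(msgs)   // safe now: no sender is left
	return <-done // wait for the receiver before returning
}

func main() {
	fmt.Println(process(100)) // prints 100
}
```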
I have a pipeline with goroutines connected by channels so that each goroutine will trigger another one until all have run. Put even simpler, imagine two goroutines A and B so that when A is done it should tell B it can run.
It's working fine and I have tried a few variants as I have learnt more about pipelines in Go.
Currently I have a signalling channel
ch := make(chan struct{})
go A(ch)
go B(ch)
...
that B blocks on
func B(ch <-chan struct{}) {
<-ch
...
and A closes when done
func A(ch chan struct{}) {
defer close(ch)
...
}
This works fine, and I also tried sending an empty struct, struct{}{}, in A() instead of closing.
Is there any difference between closing the channel or sending the empty struct? Is either way cheaper / faster / better?
Naturally, sending any other type in the channel takes up "some" amount of memory, but how is it with the empty struct? Close is just part of the channel so not "sent" as such even if information is passed between goroutines.
I'm well aware of premature optimization. This is only to understand things, not to optimize anything.
Maybe there's an idiomatic Go way to do this even?
Thanks for any clarification on this!
Closing a channel indicates that there will be no more sends on that channel. It's usually preferable, since you would get a panic after that point in the case of an inadvertent send or close (programming error). A close can also signal multiple receivers that there are no more messages, which you can't coordinate as easily by just sending sentinel values.
Naturally, sending any other type in the channel takes up "some" amount of memory, but how is it with the empty struct?
There's no guarantee that it does take any extra memory in an unbuffered channel (it's completely an implementation detail). The send blocks until the receive can proceed.
Close is just part of the channel so not "sent" as such even if information is passed between goroutines.
There's no optimization here, close is simply another type of message that can be sent to a channel.
Each construct has a clear meaning, and you should use the appropriate one.
Send a sentinel value if you need to signal one receiver, and keep the channel open to send more values.
Close the channel if this is the final message, possibly signal multiple receivers, and it would be an error to send or close again.
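The difference between the two can be observed directly. In this sketch, wake is a hypothetical helper that parks several goroutines on one channel and then either sends a single value or closes the channel. It counts releases with a short timeout, so it is a demonstration rather than a pattern to copy.

```go
package main

import (
	"fmt"
	"time"
)

// wake (hypothetical helper) starts n waiters on one channel, then either
// sends a single value or closes the channel, and returns how many waiters
// were released. In the send case the remaining waiters stay blocked.
func wake(n int, useClose bool) int {
	sig := make(chan struct{})
	released := make(chan struct{}, n)
	for i := 0; i < n; i++ {
		go func() {
			<-sig
			released <- struct{}{}
		}()
	}
	if useClose {
		close(sig) // every pending and future receive returns
	} else {
		sig <- struct{}{} // exactly one waiter receives this value
	}
	count := 0
	for {
		select {
		case <-released:
			count++
		case <-time.After(50 * time.Millisecond):
			return count // nothing more arrived
		}
	}
}

func main() {
	fmt.Println(wake(4, false)) // prints 1
	fmt.Println(wake(4, true))  // prints 4
}
```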
Multiple goroutines can receive from a closed channel, and they will never block. That is its main advantage: it's a one-to-many pattern.
finish := make(chan struct{}) with sends can be used in a many-to-one pattern, when many concurrent runners want to report that they are done; unlike repeated closes, repeated sends will not panic.
It's not about memory consumption.
I am trying to write a queue and I'd need to "grow" my buffered chans, is there a way to do that without having to create a new one and moving the elements to the new one?
It is not possible with standard channels. However by using an intermediate goroutine with a few tricks you can make something that's effectively equivalent. It will, however, be somewhat slower than a native channel. This is implemented as the ResizableChannel in the channels package (disclaimer: I wrote it).
godoc: https://godoc.org/github.com/eapache/channels#ResizableChannel
github: https://github.com/eapache/channels/
Why would you want to grow the chan size? Are you looking to have a chan where you can keep writing regardless whether there are readers or not?
If so, you should use a goroutine which will own the queue and two chans (a read chan and a write chan). The goroutine will keep a slice internally with all the written items (received via the write chan), and it will keep attempting to write to the read chan, which will block until there are readers reading from it.
hope this helps
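The goroutine-owned queue described above could look roughly like this. It is a sketch with invented names (newQueue, pending), it is not the ResizableChannel implementation from the channels package, and it grows without bound if readers never catch up.

```go
package main

import "fmt"

// newQueue (hypothetical) returns a write end and a read end joined by a
// goroutine that buffers items in a slice, so writes do not wait for readers.
func newQueue() (chan<- int, <-chan int) {
	in := make(chan int)
	out := make(chan int)
	go func(in <-chan int) {
		defer close(out)
		var pending []int
		for in != nil || len(pending) > 0 {
			// Enable the send case only when there is something to send:
			// a send on a nil channel is never selected.
			var send chan int
			var next int
			if len(pending) > 0 {
				send = out
				next = pending[0]
			}
			select {
			case v, ok := <-in:
				if !ok {
					in = nil // writer closed: drain pending, then stop
					continue
				}
				pending = append(pending, v)
			case send <- next:
				pending = pending[1:]
			}
		}
	}(in)
	return in, out
}

func main() {
	in, out := newQueue()
	for i := 0; i < 1000; i++ {
		in <- i // no reader yet; the queue goroutine stores the value
	}
	close(in)
	sum := 0
	for v := range out {
		sum += v
	}
	fmt.Println(sum) // prints 499500
}
```

Closing the write end tells the goroutine to deliver what is left and then close the read end, which is what lets the range loop in main finish.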
I understand from this question "Golang - What is channel buffer size?" that if the channel is buffered it won't block.
c := make(chan int, 1)
c <- data1 // doesn't block
c <- data2 // blocks until another goroutine receives from the channel
c <- data3
c <- data4
But I don't understand what the use of it is. Suppose I have 2 goroutines: the 1st one will receive data1 and the 2nd one will receive data2, and then the sender will block until one of the goroutines is free to process data3.
I don't understand what difference it makes. It would have executed the same way without a buffer. Can you explain a possible scenario where buffering is useful?
A buffered channel allows the goroutine that is adding data to the buffered channel to keep running and doing things, even if the goroutines reading from the channel are starting to fall behind a little bit.
For example, you might have one goroutine that is receiving HTTP requests and you want it to be as fast as possible. However you also want it to queue up some background job, like sending an email, which could take a while. So the HTTP goroutine just parses the user's request and quickly adds the background job to the buffered channel. The other goroutines will process it when they have time. If you get a sudden surge in HTTP requests, the users will not notice any slowness in the HTTP if your buffer is big enough.
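A rough way to see this effect, assuming a hypothetical helper enqueueBurst and a simulated slow worker (time.Sleep stands in for sending an email): the function measures how long the producer is held up while handing over a burst of jobs.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// enqueueBurst (hypothetical helper) pushes n jobs at one slow worker and
// returns how long the producer spent sending them.
func enqueueBurst(n, buffer, jobMillis int) time.Duration {
	jobs := make(chan int, buffer)
	var wg sync.WaitGroup
	wg.Add(1)
	go func() { // background worker
		defer wg.Done()
		for range jobs {
			time.Sleep(time.Duration(jobMillis) * time.Millisecond)
		}
	}()

	start := time.Now()
	for i := 0; i < n; i++ {
		jobs <- i // blocks only when the buffer is full
	}
	elapsed := time.Since(start) // time the producer was held up

	close(jobs)
	wg.Wait() // let the worker finish the queued jobs
	return elapsed
}

func main() {
	fmt.Println("unbuffered:", enqueueBurst(10, 0, 10))
	fmt.Println("buffered:  ", enqueueBurst(10, 10, 10))
}
```

The exact durations depend on the machine, so none are shown here. With a buffer as large as the burst, the producer returns almost immediately, while the unbuffered version waits for the worker on nearly every send.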
This site has a good explanation:
https://www.openmymind.net/Introduction-To-Go-Buffered-Channels/